Element-Research / rnn

Recurrent Neural Network library for Torch7's nn
BSD 3-Clause "New" or "Revised" License

Strange error with AbstractRecurrent #346

Open AlbertXiebnu opened 7 years ago

AlbertXiebnu commented 7 years ago

I am trying to implement a hierarchical LSTM structure for document classification. In my implementation, the input of the network is batchsize x sentNum x seqlen; for example batchsize = 64, sentNum = 50, seqlen = 15, which means each doc has at most 50 sentences and each sentence has at most 15 words. I right-pad with 0 when a sentence has fewer than 15 words, and likewise when a doc has fewer than 50 sentences. Here is my implementation:

require 'rnn'
require 'torch'

local han = torch.class("han")

-- InputSize: sentence sequence. [batchSize x senSeqLen]
-- OutputSize: sentence embedding. [batchSize x rnnSize]
function han.senembed(opt)
    local senembed = nn.Sequential() -- shape: batchsize x SeqLen
    senembed:add(nn.Transpose({1,2})) -- shape: SeqLen x batchsize
    senembed:add(nn.LookupTableMaskZero(opt.vocabSize,opt.hiddenSize)) -- shape: SeqLen x batchsize x hiddenSize
    local senLSTM = nn.FastLSTM(opt.hiddenSize, opt.rnnSize):maskZero(1)
    senembed:add(nn.Sequencer(senLSTM)) -- shape: SeqLen x batchsize x rnnSize
    senembed:add(nn.MaskZero(nn.Mean(1),2)) -- shape: batchsize x rnnSize
    return senembed
end

-- input : batchsize x senNum x seqLen
-- output: batchsize x 2
function han:build(opt)
    local han = nn.Sequential()
    local senembed = self.senembed(opt)
    han:add(nn.Transpose({1,2})) -- swap: batchsize and senNum
    han:add(nn.Sequencer(senembed)) 
    local docLSTM = nn.FastLSTM(opt.rnnSize,opt.docSize):maskZero(1)
    han:add(nn.Sequencer(docLSTM))
    han:add(nn.MaskZero(nn.Mean(1),2))
    han:add(nn.Linear(opt.docSize,2))
    han:add(nn.LogSoftMax())
    return han
end
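
For reference, a zero-padded input batch of the shape described above could be built like this (a minimal sketch with random word ids; vocabSize and the per-sample lengths here are stand-ins, not my real data pipeline):

-- Sketch only: build a batchSize x sentNum x seqLen tensor where unused
-- word and sentence slots stay zero, matching the right-padding scheme.
local batchSize, sentNum, seqLen, vocabSize = 64, 50, 15, 10000
local x = torch.zeros(batchSize, sentNum, seqLen)
for b = 1, batchSize do
    local nSent = math.random(1, sentNum)        -- docs vary in length
    for s = 1, nSent do
        local nWord = math.random(1, seqLen)     -- sentences vary in length
        x[b][s]:narrow(1, 1, nWord):random(1, vocabSize) -- word ids; the rest stays 0
    end
end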

Here is the problem: when I feed the model a tensor of shape batchsize x senNum x seqlen, forward() seems fine, but model:backward() raises a strange error:

x, y = trainset:sample(64) -- x: 64 x senNum x seqLen
output = model:forward(x)
loss = criterion:forward(output,y)
df = criterion:backward(output,y)
model:backward(x,df)
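
(For completeness: model and criterion are not shown above; they would be something along these lines, with the hyperparameter values chosen arbitrarily here.)

local opt = {vocabSize = 10000, hiddenSize = 200, rnnSize = 300, docSize = 100}
local model = han:build(opt)
local criterion = nn.ClassNLLCriterion() -- pairs with the LogSoftMax output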

Here is the error output from the terminal:

/Users/xie/torch/install/share/lua/5.1/nn/Container.lua:67:
In 2 module of nn.Sequential:
...ie/torch/install/share/lua/5.1/rnn/AbstractRecurrent.lua:208: no output for step 50
stack traceback:
        [C]: in function 'assert'
        ...ie/torch/install/share/lua/5.1/rnn/AbstractRecurrent.lua:208: in function 'setOutputStep'
        /Users/xie/torch/install/share/lua/5.1/rnn/Module.lua:37: in function 'setOutputStep'
        /Users/xie/torch/install/share/lua/5.1/rnn/Module.lua:37: in function 'setOutputStep'
        /Users/xie/torch/install/share/lua/5.1/rnn/Recursor.lua:44: in function '_updateGradInput'
        ...ie/torch/install/share/lua/5.1/rnn/AbstractRecurrent.lua:59: in function 'updateGradInput'
        /Users/xie/torch/install/share/lua/5.1/rnn/Sequencer.lua:121: in function 'updateGradInput'
        /Users/xie/torch/install/share/lua/5.1/nn/Module.lua:31: in function </Users/xie/torch/install/share/lua/5.1/nn/Module.lua:29>
        [C]: in function 'xpcall'
        /Users/xie/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
        /Users/xie/torch/install/share/lua/5.1/nn/Sequential.lua:84: in function 'backward'
        [string "_RESULT={model:backward(x,dl)}"]:1: in main chunk
        [C]: in function 'xpcall'
        /Users/xie/torch/install/share/lua/5.1/trepl/init.lua:652: in function 'repl'
        .../xie/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
        [C]: at 0x0106d5ed10

I guess the error may be related to MaskZero, but I can't figure it out. Can someone help me with this problem?
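
One way I can think of to test that guess (a diagnostic sketch, not a fix): feed a batch with no all-zero slots and see whether backward() still fails, which would tell me if the fully-padded steps at the end of the sequence are the trigger:

-- Diagnostic sketch: same shapes as before, but every slot is non-zero,
-- so MaskZero never sees an all-zero step.
local xFull = torch.Tensor(64, 50, 15):random(1, 10000) -- no padding anywhere
local yFull = torch.Tensor(64):random(1, 2)             -- dummy class labels
local out = model:forward(xFull)
criterion:forward(out, yFull)
local grad = criterion:backward(out, yFull)
model:backward(xFull, grad) -- if this succeeds, the zero-padded steps are implicated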

amdcat commented 7 years ago

Have you fixed it? I got the same problem as you.