Element-Research / rnn

Recurrent Neural Network library for Torch7's nn
BSD 3-Clause "New" or "Revised" License
941 stars 313 forks

LSTM with variable length sequence problem? #339

Closed AlbertXiebnu closed 8 years ago

AlbertXiebnu commented 8 years ago

I am not familiar with the rnn package in Torch. I did some experiments with the LSTM module in this package, but I found that the LSTM fixes the input sequence length on the first call to forward. Here is the script:

lstm = nn.LSTM(64,32)
lstm:forward(torch.randn(12,64))
lstm:forward(torch.randn(8,64))

Here I build an LSTM with input size 64 and output size 32. When I execute the second line, it gives me a 12x32 matrix, which makes sense: 12 is the sequence length. However, the third line gives an inconsistent tensor size error. Can someone explain this? Thanks a lot~

nicholas-leonard commented 8 years ago

I recommend you use SeqLSTM instead:

lstm = nn.SeqLSTM(64,32)
lstm:forward(torch.randn(12,1,64))
lstm:forward(torch.randn(8,1,64))

The input has shape seqlen x batchsize x inputsize.

Although it is much slower, you can achieve the same thing by decorating the nn.LSTM with an nn.Sequencer:

lstm = nn.Sequencer(nn.LSTM(64,32))
lstm:forward(torch.randn(12,1,64))
lstm:forward(torch.randn(8,1,64))

The Sequencer allows the lstm instance to take an entire sequence as input. Otherwise, the LSTM only takes one time-step of shape batchsize x inputsize at a time.

AlbertXiebnu commented 8 years ago

Thank you for your advice, nicholas. As I understand it, when using the nn.LSTM module directly, the first dimension of the 2D input tensor is the batch size? Is that right?

nicholas-leonard commented 8 years ago

Yes, an LSTM not decorated with a Sequencer takes batchsize x inputsize as input and outputs batchsize x outputsize. In any case, I recommend you modify your code to use SeqLSTM(inputsize, outputsize), as it is faster and allocates less memory.
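To make the per-time-step usage concrete, here is a minimal sketch (assuming the rnn package is installed and following its documented forward/forget API) of feeding an undecorated nn.LSTM one step at a time. This also handles variable-length sequences, since each call only sees a batchsize x inputsize tensor:

require 'rnn'

local lstm = nn.LSTM(64, 32)

-- first sequence: 12 time-steps, batch size 1
for t = 1, 12 do
   local output = lstm:forward(torch.randn(1, 64)) -- each output is 1 x 32
end

lstm:forget() -- reset the hidden state before starting a new sequence

-- second sequence: 8 time-steps; no size error, each step is still 1 x 64
for t = 1, 8 do
   local output = lstm:forward(torch.randn(1, 64))
end

Calling forget() between sequences is important: without it, the LSTM carries its hidden state (and step counter) over from the previous sequence.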

AlbertXiebnu commented 8 years ago

Many thanks for your explanation !