daquang opened this issue 8 years ago
I see, the default of `return_sequences` for the bidirectional RNN is set to false. To fix that, I just replaced

```python
model.add(BiDirectionLSTM(word_vec_len, 50, output_mode='concat'))
```

with

```python
model.add(BiDirectionLSTM(word_vec_len, 50, output_mode='concat'), return_sequences=True)
```

https://github.com/hycis/bidirectional_RNN/blob/master/imdb_birnn.py#L72
I believe you still have some errors. In your newest version, you have these lines:
```python
model.add(BiDirectionLSTM(word_vec_len, 50, output_mode='concat'), return_sequences=True)
model.add(BiDirectionLSTM(100, 24, output_mode='sum'), return_sequences=True)
```

Here `return_sequences=True` is passed to `model.add()` rather than to the layer constructor.
After changing these two lines as follows, the code works as intended:

```python
model.add(BiDirectionLSTM(word_vec_len, 50, output_mode='concat', return_sequences=True))
model.add(BiDirectionLSTM(100, 24, output_mode='sum', return_sequences=True))
```
Yep, you are right, thanks for pointing that out.
I am new to bidirectional LSTMs, sorry if this is too trivial.
I have some doubts about the following lines (the stacked BiDirectionLSTM layers):

```python
model.add(BiDirectionLSTM(word_vec_len, 50, output_mode='concat', return_sequences=True))
model.add(BiDirectionLSTM(100, 24, output_mode='sum', return_sequences=True))
```

If this is a stacked LSTM, shouldn't the output of the 1st layer (50) be equal to the input of the second layer (100)? It would be nice if you could help me understand that part.
@shwetgarg Because it's bidirectional, there is one output from the forward pass and one from the backward pass. We can either 'concat' the two outputs, which gives twice the vector length, or 'sum' them, which keeps the same length. For the first LSTM I use `output_mode='concat'`, which is why its output size doubles from 50 to 100.
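A minimal NumPy sketch of this (the shapes are illustrative, not the library's internals), showing why 'concat' doubles the feature dimension while 'sum' preserves it:

```python
import numpy as np

batch_size, hidden_units = 32, 50

# Hypothetical per-example outputs of the forward and backward passes.
forward = np.random.rand(batch_size, hidden_units)
backward = np.random.rand(batch_size, hidden_units)

# output_mode='concat': join along the feature axis -> 50 + 50 = 100,
# so the next stacked layer must expect an input size of 100.
concat = np.concatenate([forward, backward], axis=-1)
print(concat.shape)  # (32, 100)

# output_mode='sum': elementwise addition keeps the original 50 units.
summed = forward + backward
print(summed.shape)  # (32, 50)
```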
I get the following error when I run the IMDB example:
```
Traceback (most recent call last):
  File "imdb_birnn.py", line 77, in <module>
    model.add(BatchNormalization((24 * maxseqlen,)))
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/containers.py", line 40, in add
    layer.init_updates()
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/normalization.py", line 38, in init_updates
    X = self.get_input(train=True)
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/core.py", line 43, in get_input
    return self.previous.get_output(train=train)
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/core.py", line 296, in get_output
    X = self.get_input(train)
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/core.py", line 43, in get_input
    return self.previous.get_output(train=train)
  File "/home/dxquang/bidirectional_RNN/birnn.py", line 187, in get_output
    forward = self.get_forward_output(train)
  File "/home/dxquang/bidirectional_RNN/birnn.py", line 143, in get_forward_output
    X = X.dimshuffle((1,0,2))
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/theano/tensor/var.py", line 341, in dimshuffle
    pattern)
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/theano/tensor/elemwise.py", line 141, in __init__
    (i, j, len(input_broadcastable)))
ValueError: new_order[2] is 2, but the input only has 2 axes.
```
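This looks like the `return_sequences` default discussed in this thread: with `return_sequences=False`, the layer feeding the bidirectional LSTM emits a 2D `(batch, features)` tensor, while `dimshuffle((1, 0, 2))` needs the 3D `(batch, time, features)` sequence output. A NumPy sketch of the same axis mismatch, using `np.transpose` in place of Theano's `dimshuffle` (shapes are illustrative):

```python
import numpy as np

# return_sequences=True: the layer emits a 3D (batch, time, features)
# tensor, and the (1, 0, 2) permutation works fine.
seq_output = np.zeros((32, 100, 50))
print(np.transpose(seq_output, (1, 0, 2)).shape)  # (100, 32, 50)

# return_sequences=False: only the last timestep survives, giving a 2D
# (batch, features) tensor -- a 3-axis permutation then fails, just
# like Theano's "the input only has 2 axes" error above.
last_output = np.zeros((32, 50))
try:
    np.transpose(last_output, (1, 0, 2))
except ValueError as exc:
    print("ValueError:", exc)
```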