farizrahman4u / seq2seq

Sequence to Sequence Learning with Keras
GNU General Public License v2.0

Shape computing in LSTMCell may lead to strange error #183

Closed Vimos closed 7 years ago

Vimos commented 7 years ago

I probably should have posted this to the debug_seq2seq repo. However, while trying to debug it, I found an interesting shape-computation problem.

/usr/bin/python2.7 /opt/jetbrains/pycharm-2016.3/helpers/pydev/pydevd.py --multiproc --qt-support --client 127.0.0.1 --port 37962 --file /home/vimos/Public/github/NLP/debug_seq2seq/bin/train.py
warning: Debugger speedups using cython not found. Run '"/usr/bin/python2.7" "/opt/jetbrains/pycharm-2016.3/helpers/pydev/setup_cython.py" build_ext --inplace' to build.
pydev debugger: process 24534 is connecting

Connected to pydev debugger (build 171.4249.47)
INFO:summa.preprocessing.cleaner:'pattern' package found; tag filters are available for English
Using TensorFlow backend.
INFO:lib.dialog_processor:Loading corpus data...
INFO:lib.dialog_processor:/var/lib/try_seq2seq/corpora_processed/movie_lines_cleaned_m1.txt and /var/lib/try_seq2seq/words_index/w_idx_movie_lines_cleaned_m1.txt exist, loading files from disk
INFO:__main__:-----
INFO:lib.w2v_model.w2v:Loading model from /var/lib/try_seq2seq/w2v_models/movie_lines_cleaned_w5_m1_v256.bin
INFO:gensim.utils:loading Word2Vec object from /var/lib/try_seq2seq/w2v_models/movie_lines_cleaned_w5_m1_v256.bin
INFO:gensim.utils:setting ignored attribute syn0norm to None
INFO:gensim.utils:setting ignored attribute cum_table to None
INFO:gensim.utils:loaded /var/lib/try_seq2seq/w2v_models/movie_lines_cleaned_w5_m1_v256.bin
INFO:lib.w2v_model.w2v:Model "movie_lines_cleaned_w5_m1_v256.bin" has been loaded.
INFO:__main__:-----
INFO:lib.nn_model.model:Initializing NN model with the following params:
INFO:lib.nn_model.model:Input dimension: 256 (token vector size)
INFO:lib.nn_model.model:Hidden dimension: 512
INFO:lib.nn_model.model:Output dimension: 20001 (token dict size)
INFO:lib.nn_model.model:Input seq length: 16 
INFO:lib.nn_model.model:Output seq length: 6 
INFO:lib.nn_model.model:Batch size: 32
Backend TkAgg is interactive backend. Turning interactive mode on.
Traceback (most recent call last):
  File "/opt/jetbrains/pycharm-2016.3/helpers/pydev/pydevd.py", line 1585, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/opt/jetbrains/pycharm-2016.3/helpers/pydev/pydevd.py", line 1015, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/vimos/Public/github/NLP/debug_seq2seq/bin/train.py", line 37, in <module>
    learn()
  File "/home/vimos/Public/github/NLP/debug_seq2seq/bin/train.py", line 30, in learn
    nn_model = get_nn_model(token_dict_size=len(index_to_token))
  File "/home/vimos/Public/github/NLP/debug_seq2seq/lib/nn_model/model.py", line 29, in get_nn_model
    depth=1
  File "/data/home/vimos/Public/git/github/NLP/seq2seq/seq2seq/models.py", line 89, in SimpleSeq2Seq
    model.add(encoder)
  File "/data/home/vimos/Public/git/github/ml/keras/keras/models.py", line 442, in add
    layer(x)
  File "/data/home/vimos/Public/git/github/NLP/recurrentshop/recurrentshop/engine.py", line 472, in __call__
    self.build(K.int_shape(inputs[0]))
  File "/data/home/vimos/Public/git/github/NLP/recurrentshop/recurrentshop/engine.py", line 1007, in build
    cell.build(K.int_shape(output))
  File "/data/home/vimos/Public/git/github/NLP/recurrentshop/recurrentshop/engine.py", line 110, in build
    self.model = self.build_model(input_shape)
  File "/data/home/vimos/Public/git/github/NLP/recurrentshop/recurrentshop/cells.py", line 172, in build_model
    f = add([x0, r0])
  File "/data/home/vimos/Public/git/github/ml/keras/keras/layers/merge.py", line 455, in add
    return Add(**kwargs)(inputs)
  File "/data/home/vimos/Public/git/github/ml/keras/keras/engine/topology.py", line 571, in __call__
    self.build(input_shapes)
  File "/data/home/vimos/Public/git/github/ml/keras/keras/layers/merge.py", line 84, in build
    output_shape = self._compute_elemwise_op_output_shape(output_shape, shape)
  File "/data/home/vimos/Public/git/github/ml/keras/keras/layers/merge.py", line 55, in _compute_elemwise_op_output_shape
    str(shape1) + ' ' + str(shape2))
ValueError: Operands could not be broadcast together with shapes (4,) (512,)

When the LSTMCell is added to the encoder, the input_shape passed to build_model is

<type 'tuple'>: (None, 256)

But when the LSTMCell is added to the decoder, the input_shape passed to build_model is

<type 'tuple'>: (None, 16, 256)

I am not sure which project is responsible for this, recurrentshop or seq2seq. I am still working on it, and any help is appreciated. Thank you!
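For context, the ValueError comes from Keras's elementwise-merge shape check: Add aligns the two operand shapes from the right and requires each dimension pair to match (or be 1 or None). The sketch below is a simplified reimplementation of that rule, not the actual Keras source, to illustrate why a 3-D decoder input shape ends up with dimensions that cannot line up against the cell's 1-D bias-like shape (the concrete shapes (4,) and (512,) in the traceback are what recurrentshop passed in my run):

```python
def elemwise_output_shape(shape1, shape2):
    """Simplified sketch of Keras's broadcast rule for elementwise
    merge layers such as Add. Shapes are tuples; None means an
    unknown dimension (e.g. the batch axis)."""
    # Make shape1 the longer shape so alignment happens from the right.
    if len(shape1) < len(shape2):
        shape1, shape2 = shape2, shape1
    # Leading dimensions of the longer shape pass through unchanged.
    output = list(shape1[:len(shape1) - len(shape2)])
    for d1, d2 in zip(shape1[len(shape1) - len(shape2):], shape2):
        if d1 is None or d2 is None:
            output.append(None)      # unknown dims stay unknown
        elif d1 == 1:
            output.append(d2)        # dim of 1 broadcasts
        elif d2 == 1:
            output.append(d1)
        elif d1 != d2:
            raise ValueError('Operands could not be broadcast together '
                             'with shapes %s %s' % (shape1, shape2))
        else:
            output.append(d1)
    return tuple(output)

# Encoder case: matching 2-D shapes broadcast fine.
print(elemwise_output_shape((None, 512), (512,)))   # (None, 512)

# Shapes like the ones in the traceback fail the check.
try:
    elemwise_output_shape((4,), (512,))
except ValueError as e:
    print(e)
```

So the bug is upstream of the Add layer: whatever in LSTMCell.build_model produces the (4,) operand (presumably derived from the unexpected 3-D decoder input_shape) is handing Keras shapes that can never broadcast.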