Perhaps I should have posted this to the debug_seq2seq repo. However, while trying to debug it there, I ran into an interesting shape-computation problem, so I am reporting it here.
```
/usr/bin/python2.7 /opt/jetbrains/pycharm-2016.3/helpers/pydev/pydevd.py --multiproc --qt-support --client 127.0.0.1 --port 37962 --file /home/vimos/Public/github/NLP/debug_seq2seq/bin/train.py
warning: Debugger speedups using cython not found. Run '"/usr/bin/python2.7" "/opt/jetbrains/pycharm-2016.3/helpers/pydev/setup_cython.py" build_ext --inplace' to build.
pydev debugger: process 24534 is connecting
Connected to pydev debugger (build 171.4249.47)
INFO:summa.preprocessing.cleaner:'pattern' package found; tag filters are available for English
Using TensorFlow backend.
INFO:lib.dialog_processor:Loading corpus data...
INFO:lib.dialog_processor:/var/lib/try_seq2seq/corpora_processed/movie_lines_cleaned_m1.txt and /var/lib/try_seq2seq/words_index/w_idx_movie_lines_cleaned_m1.txt exist, loading files from disk
INFO:__main__:-----
INFO:lib.w2v_model.w2v:Loading model from /var/lib/try_seq2seq/w2v_models/movie_lines_cleaned_w5_m1_v256.bin
INFO:gensim.utils:loading Word2Vec object from /var/lib/try_seq2seq/w2v_models/movie_lines_cleaned_w5_m1_v256.bin
INFO:gensim.utils:setting ignored attribute syn0norm to None
INFO:gensim.utils:setting ignored attribute cum_table to None
INFO:gensim.utils:loaded /var/lib/try_seq2seq/w2v_models/movie_lines_cleaned_w5_m1_v256.bin
INFO:lib.w2v_model.w2v:Model "movie_lines_cleaned_w5_m1_v256.bin" has been loaded.
INFO:__main__:-----
INFO:lib.nn_model.model:Initializing NN model with the following params:
INFO:lib.nn_model.model:Input dimension: 256 (token vector size)
INFO:lib.nn_model.model:Hidden dimension: 512
INFO:lib.nn_model.model:Output dimension: 20001 (token dict size)
INFO:lib.nn_model.model:Input seq length: 16
INFO:lib.nn_model.model:Output seq length: 6
INFO:lib.nn_model.model:Batch size: 32
Backend TkAgg is interactive backend. Turning interactive mode on.
```

```
Traceback (most recent call last):
  File "/opt/jetbrains/pycharm-2016.3/helpers/pydev/pydevd.py", line 1585, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/opt/jetbrains/pycharm-2016.3/helpers/pydev/pydevd.py", line 1015, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/vimos/Public/github/NLP/debug_seq2seq/bin/train.py", line 37, in <module>
    learn()
  File "/home/vimos/Public/github/NLP/debug_seq2seq/bin/train.py", line 30, in learn
    nn_model = get_nn_model(token_dict_size=len(index_to_token))
  File "/home/vimos/Public/github/NLP/debug_seq2seq/lib/nn_model/model.py", line 29, in get_nn_model
    depth=1
  File "/data/home/vimos/Public/git/github/NLP/seq2seq/seq2seq/models.py", line 89, in SimpleSeq2Seq
    model.add(encoder)
  File "/data/home/vimos/Public/git/github/ml/keras/keras/models.py", line 442, in add
    layer(x)
  File "/data/home/vimos/Public/git/github/NLP/recurrentshop/recurrentshop/engine.py", line 472, in __call__
    self.build(K.int_shape(inputs[0]))
  File "/data/home/vimos/Public/git/github/NLP/recurrentshop/recurrentshop/engine.py", line 1007, in build
    cell.build(K.int_shape(output))
  File "/data/home/vimos/Public/git/github/NLP/recurrentshop/recurrentshop/engine.py", line 110, in build
    self.model = self.build_model(input_shape)
  File "/data/home/vimos/Public/git/github/NLP/recurrentshop/recurrentshop/cells.py", line 172, in build_model
    f = add([x0, r0])
  File "/data/home/vimos/Public/git/github/ml/keras/keras/layers/merge.py", line 455, in add
    return Add(**kwargs)(inputs)
  File "/data/home/vimos/Public/git/github/ml/keras/keras/engine/topology.py", line 571, in __call__
    self.build(input_shapes)
  File "/data/home/vimos/Public/git/github/ml/keras/keras/layers/merge.py", line 84, in build
    output_shape = self._compute_elemwise_op_output_shape(output_shape, shape)
  File "/data/home/vimos/Public/git/github/ml/keras/keras/layers/merge.py", line 55, in _compute_elemwise_op_output_shape
    str(shape1) + ' ' + str(shape2))
ValueError: Operands could not be broadcast together with shapes (4,) (512,)
```
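For illustration only, the failing check follows NumPy-style broadcasting rules, so the same mismatch can be reproduced with plain NumPy. This is just an analogy to Keras's `_compute_elemwise_op_output_shape`, not the actual code path:

```python
import numpy as np

# Elementwise add of two 1-D arrays whose sizes differ (and neither is 1)
# cannot broadcast -- the same rule that Add's shape check enforces.
a = np.zeros((4,))
b = np.zeros((512,))
try:
    a + b
except ValueError as e:
    print(e)  # operands could not be broadcast together ...
```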
When the LSTMCell is added to the encoder, the `input_shape` passed to `build_model` is `(None, 256)`, but when it is added to the decoder, the `input_shape` is `(None, 16, 256)`.
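My hunch is that the cell should always be built with the per-timestep shape `(batch, features)`, and the extra timestep axis in the decoder path is what later produces mismatched operands. A minimal sketch of that idea; the helper name `per_timestep_shape` is hypothetical and not part of recurrentshop:

```python
def per_timestep_shape(input_shape):
    """Drop the timestep axis from a full-sequence shape, if present.

    (None, 16, 256) -> (None, 256); a 2-tuple is returned unchanged.
    """
    if len(input_shape) == 3:
        batch, _timesteps, features = input_shape
        return (batch, features)
    return input_shape

print(per_timestep_shape((None, 16, 256)))  # (None, 256)
print(per_timestep_shape((None, 256)))      # (None, 256)
```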
I am not sure whether the problem lies in recurrentshop or in seq2seq. I am still working on this, and any help is appreciated. Thank you!