nicolas-ivanov / debug_seq2seq

[unmaintained] Make seq2seq for keras work

Error for different input sequence length and output sequence length #6

Closed: jaimita-bansal closed this issue 8 years ago

jaimita-bansal commented 8 years ago

I am getting the following error during prediction. The model predicts fine when the input and output sequence lengths are equal.

Input sequence length = 16
Output sequence length = 6

Could you help?

Traceback (most recent call last):
  File "/home/jaimita/debug_seq2seq/bin/train.py", line 39, in <module>
    learn()
  File "/home/jaimita/debug_seq2seq/bin/train.py", line 36, in learn
    train_model(nn_model, w2v_model, dialog_lines_for_nn, index_to_token)
  File "/home/jaimita/debug_seq2seq/lib/nn_model/train.py", line 96, in train_model
    log_predictions(test_sentences, nn_model, w2v_model, index_to_token)
  File "/home/jaimita/debug_seq2seq/lib/nn_model/train.py", line 21, in log_predictions
    prediction = predict_sentence(sent, nn_model, w2v_model, index_to_token)
  File "/home/jaimita/debug_seq2seq/lib/nn_model/predict.py", line 47, in predict_sentence
    tokens_sequence = _predict_sequence(input_sequence, nn_model, w2v_model, index_to_token, diversity)
  File "/home/jaimita/debug_seq2seq/lib/nn_model/predict.py", line 34, in _predict_sequence
    predictions = nn_model.predict(X, verbose=0)[0]
  File "build/bdist.linux-x86_64/egg/keras/models.py", line 661, in predict
  File "build/bdist.linux-x86_64/egg/keras/models.py", line 322, in _predict_loop
  File "build/bdist.linux-x86_64/egg/keras/backend/theano_backend.py", line 384, in __call__
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 859, in __call__
    outputs = self.fn()
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 963, in rval
    r = p(n, [x[0] for x in i], o)
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 952, in <lambda>
    self, node)
  File "theano/scan_module/scan_perform.pyx", line 405, in theano.scan_module.scan_perform.perform (/home/jaimita/.theano/compiledir_Linux-3.19--generic-x86_64-with-Ubuntu-14.04-trusty-x86_64-2.7.6-64/scan_perform/mod.cpp:4316)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "theano/scan_module/scan_perform.pyx", line 397, in theano.scan_module.scan_perform.perform (/home/jaimita/.theano/compiledir_Linux-3.19--generic-x86_64-with-Ubuntu-14.04-trusty-x86_64-2.7.6-64/scan_perform/mod.cpp:4193)
ValueError: Input dimension mis-match. (input[0].shape[0] = 6, input[1].shape[0] = 16)
Apply node that caused the error: Elemwise{Add}[(0, 1)](InplaceDimShuffle{1,0,2}.0, InplaceDimShuffle{1,0,2}.0)
Toposort index: 31
Inputs types: [TensorType(float32, 3D), TensorType(float32, 3D)]
Inputs shapes: [(6, 32, 128), (16, 32, 128)]
Inputs strides: [(16384, 512, 4), (512, 8192, 4)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Subtensor{int64:int64:int8}(Elemwise{Add}[(0, 1)].0, ScalarFromTensor.0, ScalarFromTensor.0, Constant{1})]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Apply node that caused the error: forall_inplace,cpu,scan_fn}(TensorConstant{6}, IncSubtensor{InplaceSet;:int64:}.0, IncSubtensor{Set;:int64:}.0, IncSubtensor{InplaceSet;:int64:}.0, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, vector)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, InplaceDimShuffle{1,0,2}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0)
Toposort index: 384
Inputs types: [TensorType(int8, scalar), TensorType(float32, 3D), TensorType(float32, (True, False, False)), TensorType(float32, (True, False, False)), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, vector), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, 3D), TensorType(float32, row), TensorType(float32, row), TensorType(float32, row), TensorType(float32, row), TensorType(float32, row), TensorType(float32, row)]
Inputs shapes: [(), (6, 32, 128), (1, 32, 128), (1, 32, 128), (128, 128), (128, 1), (1,), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (32, 6, 128), (1, 128), (1, 1), (1, 128), (1, 128), (1, 128), (1, 128)]
Inputs strides: [(), (16384, 512, 4), (16384, 512, 4), (16384, 512, 4), (512, 4), (4, 4), (4,), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 16384, 4), (512, 4), (4, 4), (512, 4), (512, 4), (512, 4), (512, 4)]
Inputs values: [array(6, dtype=int8), 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', array([ -2.43138842e-14], dtype=float32), 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', array([[ -2.43138842e-14]], dtype=float32), 'not shown', 'not shown', 'not shown', 'not shown']
Outputs clients: [[forall_inplace,cpu,scan_fn}(TensorConstant{6}, forall_inplace,cpu,scan_fn}.0, Alloc.0, IncSubtensor{InplaceSet;:int64:}.0, IncSubtensor{Set;:int64:}.0, IncSubtensor{InplaceSet;:int64:}.0, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0)], [], []]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

Process finished with exit code 1

nicolas-ivanov commented 8 years ago

@ire-ne your parameters worked fine for me. Try pulling the latest commit and rerunning the code, and let me know if it helps.

jaimita-bansal commented 8 years ago

@nicolas-ivanov Hi, I had to replace ANSWER_MAX_TOKEN_LENGTH with INPUT_SEQUENCE_LENGTH in predict.py. Now it runs fine. Thanks anyway! :)
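
For anyone hitting the same ValueError, here is a minimal NumPy sketch of why that swap plausibly helps. It assumes the prediction input in predict.py was being sized along the time axis with ANSWER_MAX_TOKEN_LENGTH (6) while the model was built for INPUT_SEQUENCE_LENGTH (16). The helpers `make_input_batch` and `can_add`, and the constant `TOKEN_VECTOR_SIZE`, are hypothetical names for illustration, not code from the repository.

```python
import numpy as np

# Lengths taken from the report above.
INPUT_SEQUENCE_LENGTH = 16
ANSWER_MAX_TOKEN_LENGTH = 6
# Hypothetical vector size, chosen only to match the 128 seen in the traceback shapes.
TOKEN_VECTOR_SIZE = 128


def make_input_batch(token_vectors, sequence_length):
    """Zero-pad (or truncate) token vectors into a (1, sequence_length, dim) batch."""
    token_vectors = np.asarray(token_vectors, dtype=np.float32)
    X = np.zeros((1, sequence_length, TOKEN_VECTOR_SIZE), dtype=np.float32)
    n = min(len(token_vectors), sequence_length)
    if n > 0:
        X[0, :n] = token_vectors[:n]
    return X


def can_add(a, b):
    """Return True if a + b succeeds elementwise, False on a shape mismatch."""
    try:
        a + b
        return True
    except ValueError:
        return False


# The failing Elemwise{Add} in the traceback combined these two shapes.
short = np.zeros((6, 32, 128), dtype=np.float32)
full = np.zeros((16, 32, 128), dtype=np.float32)
print(can_add(short, full))  # time axes differ (6 vs 16), so the add fails
print(can_add(full, full))   # matching time axes add cleanly

# Sizing the input with the model's input length keeps the time axis consistent.
tokens = np.ones((4, TOKEN_VECTOR_SIZE), dtype=np.float32)
wrong = make_input_batch(tokens, ANSWER_MAX_TOKEN_LENGTH)
right = make_input_batch(tokens, INPUT_SEQUENCE_LENGTH)
print(wrong.shape, right.shape)
```

The point is only that the time dimension fed to `nn_model.predict` has to match the one the network was compiled with; the answer length can still be used separately to limit how many tokens are decoded.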

nicolas-ivanov commented 8 years ago

Cool!