farizrahman4u / seq2seq

Sequence to Sequence Learning with Keras
GNU General Public License v2.0

Masking decoderCell #238

Open borgr opened 6 years ago

borgr commented 6 years ago

Hi, I believe I am missing something about masking and that the mistake is mine, but I can't get masking to work with Attention. I'll state my understanding, in case the flaw is there: masking should ignore the padding zeros in the input (while sample weights deal with zeros in the output), so it should be useful in this case too.
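
To make that understanding concrete, here is a minimal pure-Python sketch of the two mechanisms, not Keras code. It assumes the padding id is 0, that the input mask is simply "token is not 0" (which is what `mask_zero=True` on an `Embedding` layer computes), and that temporal sample weights have one value per output timestep. The helper names `padding_mask` and `temporal_sample_weights` are hypothetical.

```python
def padding_mask(batch, pad_id=0):
    """Per-timestep input mask: True for real tokens, False for padding."""
    return [[token != pad_id for token in seq] for seq in batch]


def temporal_sample_weights(target_batch, pad_id=0):
    """Per-timestep loss weights: 0.0 on padded target steps, 1.0 elsewhere."""
    return [[0.0 if token == pad_id else 1.0 for token in seq]
            for seq in target_batch]


# Two zero-padded input sequences and their zero-padded targets (made-up ids).
inputs = [[5, 3, 9, 0, 0],
          [7, 2, 0, 0, 0]]
targets = [[4, 8, 0],
           [6, 1, 2]]

input_mask = padding_mask(inputs)           # what masking hides on the input side
weights = temporal_sample_weights(targets)  # what sample weights hide in the loss
```

The mask affects which input timesteps the recurrent layer processes, while the weights only affect how much each output timestep contributes to the loss.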

Minimal non-working example:

    model = Sequential()
    model.add(Embedding(input_dim=vocab_length, output_dim=200,
                        input_length=longest_input, mask_zero=True))
    attention = RecurrentSequential(
        decode=True, output_length=longest_output)
    attention.add(Dropout(0.8))
    attention.add(AttentionDecoderCell(
        output_dim=longest_output, hidden_dim=attention_width))  # error
    attention.add(Dropout(0.8))
    model.add(attention)

    model.add(RepeatVector(longest_output))

    model.add(TimeDistributed(Dense(vocab_length, activation="softmax")))
    optimizer = optimizers.SGD(
        lr=0.05, decay=1e-6, momentum=0.9, nesterov=True)
    self.model = model
    model.compile(metrics=['accuracy'],
                  optimizer=optimizer, loss="categorical_crossentropy", sample_weight_mode="temporal")

Error:

    Traceback (most recent call last):
      File "/usr/local/tensorflow/avx-avx2-cpu/1.2.0/python3.5/site-packages/tensorflow/python/framework/common_shapes.py", line 671, in _call_cpp_shape_fn_impl
        input_tensors_as_shapes, status)
      File "/usr/lib64/python3.5/contextlib.py", line 66, in __exit__
        next(self.gen)
      File "/usr/local/tensorflow/avx-avx2-cpu/1.2.0/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
        pywrap_tensorflow.TF_GetCode(status))
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Shapes must be equal rank, but are 2 and 3 for 'recurrent_sequential_1/while/Select' (op: 'Select') with input shapes: [?,?], [?,62], [?,?,500].

During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "biRNN.py", line 496, in <module>
        main()
      File "biRNN.py", line 465, in main
        lstm_widths, kvars[LONGEST_INPUT], kvars[LONGEST_OUTPUT], vocab_length)
      File "biRNN.py", line 132, in __init__
        model.add(attention)
      File "/cs/labs/oabend/borgr/oneMultiLabelEnv/lib/python3.5/site-packages/keras/models.py", line 489, in add
        output_tensor = layer(self.outputs[0])
      File "/cs/labs/oabend/borgr/oneMultiLabelEnv/lib/python3.5/site-packages/recurrentshop-1.0.0-py3.5.egg/recurrentshop/engine.py", line 488, in call
      File "/cs/labs/oabend/borgr/oneMultiLabelEnv/lib/python3.5/site-packages/recurrentshop-1.0.0-py3.5.egg/recurrentshop/engine.py", line 594, in call
      File "/cs/labs/oabend/borgr/oneMultiLabelEnv/lib/python3.5/site-packages/recurrentshop-1.0.0-py3.5.egg/recurrentshop/backend/__init__.py", line 6, in
      File "/cs/labs/oabend/borgr/oneMultiLabelEnv/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2646, in rnn
        swap_memory=True)
      File "/usr/local/tensorflow/avx-avx2-cpu/1.2.0/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2770, in while_loop
        result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
      File "/usr/local/tensorflow/avx-avx2-cpu/1.2.0/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2599, in BuildLoop
        pred, body, original_loop_vars, loop_vars, shape_invariants)
      File "/usr/local/tensorflow/avx-avx2-cpu/1.2.0/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2549, in _BuildLoop
        body_result = body(*packed_vars_for_body)
      File "/cs/labs/oabend/borgr/oneMultiLabelEnv/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2613, in _step
        output = tf.where(tiled_mask_t, output, states[0])
      File "/usr/local/tensorflow/avx-avx2-cpu/1.2.0/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 2328, in where
        return gen_math_ops._select(condition=condition, t=x, e=y, name=name)
      File "/usr/local/tensorflow/avx-avx2-cpu/1.2.0/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 2145, in _select
        name=name)
      File "/usr/local/tensorflow/avx-avx2-cpu/1.2.0/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
        op_def=op_def)
      File "/usr/local/tensorflow/avx-avx2-cpu/1.2.0/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2508, in create_op
        set_shapes_for_outputs(ret)
      File "/usr/local/tensorflow/avx-avx2-cpu/1.2.0/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1873, in set_shapes_for_outputs
        shapes = shape_func(op)
      File "/usr/local/tensorflow/avx-avx2-cpu/1.2.0/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1823, in call_with_requiring
        return call_cpp_shape_fn(op, require_shape_fn=True)
      File "/usr/local/tensorflow/avx-avx2-cpu/1.2.0/python3.5/site-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
        debug_python_shape_fn, require_shape_fn)
      File "/usr/local/tensorflow/avx-avx2-cpu/1.2.0/python3.5/site-packages/tensorflow/python/framework/common_shapes.py", line 676, in _call_cpp_shape_fn_impl
        raise ValueError(err.message)
    ValueError: Shapes must be equal rank, but are 2 and 3 for 'recurrent_sequential_1/while/Select' (op: 'Select') with input shapes: [?,?], [?,62], [?,?,500].

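A note on reading the traceback above: the failing line is `tf.where(tiled_mask_t, output, states[0])`, and the three shapes in the message appear to correspond, in order, to the tiled mask `[?,?]`, the step output `[?,62]`, and `states[0]` `[?,?,500]`. The masked step seems to assume that `states[0]` has the same rank as the output, which would not hold if the decoder cell's first state is rank 3. The pure-Python sketch below is only a simplified stand-in for that select step, not the Keras or recurrentshop implementation; `rank` and `masked_step` are hypothetical helpers and the error text is modeled on the message in the traceback.

```python
def rank(x):
    """Number of nested list levels, used here as a stand-in for tensor rank."""
    r = 0
    while isinstance(x, list):
        r += 1
        x = x[0]
    return r


def masked_step(mask_t, output, prev_state):
    """Simplified stand-in for a masked RNN step.

    mask_t:     one bool per sample (True = real token, False = padding)
    output:     per-sample output vectors, shape [batch, units]
    prev_state: value carried over on padded steps; must match output's rank
    """
    if rank(output) != rank(prev_state):
        raise ValueError("Shapes must be equal rank, but are %d and %d"
                         % (rank(output), rank(prev_state)))
    # Keep the new output on real steps, fall back to the old state on padding.
    return [out if keep else prev
            for keep, out, prev in zip(mask_t, output, prev_state)]


mask_t = [True, False]
output = [[1.0, 2.0], [3.0, 4.0]]

# Rank-2 state: the select works, and the padded sample keeps its old state.
prev_state = [[0.1, 0.2], [0.3, 0.4]]
selected = masked_step(mask_t, output, prev_state)

# Rank-3 state (like [?,?,500] in the traceback): the select is rejected.
rank3_state = [[[0.0, 0.0]], [[0.0, 0.0]]]
try:
    masked_step(mask_t, output, rank3_state)
    failed = False
except ValueError:
    failed = True
```

If this reading is right, the mask itself is fine and the mismatch comes from what the cell stores as its first state.
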
rrsayao commented 6 years ago

I'm getting the same error. Did you manage to get it fixed?