tensorflow / nmt

TensorFlow Neural Machine Translation Tutorial

Expected state to be instance of AttentionWrapperState. Received type <class 'tuple'> instead. #205

Closed ngcabang closed 6 years ago

ngcabang commented 6 years ago

Hi, I'm new to TensorFlow and trying to write my own seq2seq model based on the tutorial provided, but unfortunately I'm running into the error stated above.

This is the relevant part of the code:

```python
attention_states = tf.transpose(encoder_outputs, [1, 0, 2])
attention_mechanism = tf.contrib.seq2seq.LuongAttention(300, encoder_outputs,
                                                         memory_sequence_length=source_lengths)

decoder_cell = tf.contrib.seq2seq.AttentionWrapper(decoder_cell, attention_mechanism,
                                                   attention_layer_size=300)

projection_layer = tf.layers.Dense(dec_vocab_size, use_bias=False)
dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, target)
helper = tf.contrib.seq2seq.TrainingHelper(dec_embed_input, target_lengths, time_major=True)
decoder = tf.contrib.seq2seq.BasicDecoder(cell=decoder_cell, helper=helper,
                                          initial_state=encoder_state,
                                          output_layer=projection_layer)
max_decoder_length = tf.reduce_max(target_lengths)

decoder_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder=decoder,
                                                          maximum_iterations=max_decoder_length)
```

The traceback points to this line:

```python
decoder_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder=decoder, maximum_iterations=max_decoder_length)
```

hugddygff commented 6 years ago

Did you solve this problem?

ngcabang commented 6 years ago

Yes, by replacing the initial state in this line:

```python
decoder = tf.contrib.seq2seq.BasicDecoder(cell=decoder_cell, helper=helper, initial_state=encoder_state, output_layer=projection_layer)
```

with

```python
initial_state = decoder_cell.zero_state(dtype=tf.float32, batch_size=batch_size)
```

I don't know if it is the right way to fix it, but it worked for me.
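For reference, a minimal sketch of how that zero-state fix slots into the earlier snippet (same variable names as above; `decoder_cell`, `helper`, `projection_layer`, `max_decoder_length`, and `batch_size` are assumed to be defined as in my code):

```python
# decoder_cell is the AttentionWrapper built earlier, so its zero_state returns
# an AttentionWrapperState, which is the state type the wrapper expects.
initial_state = decoder_cell.zero_state(dtype=tf.float32, batch_size=batch_size)

decoder = tf.contrib.seq2seq.BasicDecoder(cell=decoder_cell,
                                          helper=helper,
                                          initial_state=initial_state,
                                          output_layer=projection_layer)

decoder_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder=decoder,
                                                          maximum_iterations=max_decoder_length)
```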

oahziur commented 6 years ago

@ngcabang @jintaojintao

For the attention decoder cell's initial state, you can create a zero state and then copy the encoder's cell state.

Here is an example of how NMT handles this. Alternatively, you can use a zero state and rely entirely on the attention mechanism during decoding, as in ngcabang's solution.
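Roughly, copying the encoder's cell state looks like the following sketch (assuming `decoder_cell` is the `AttentionWrapper` and `encoder_state`, `batch_size`, `helper`, and `projection_layer` come from the code above):

```python
# Start from the wrapper's zero state (an AttentionWrapperState) and replace only
# its cell_state field with the final encoder state, so the decoder is initialized
# from the encoder while the attention-related fields keep their zero values.
decoder_initial_state = decoder_cell.zero_state(
    batch_size=batch_size, dtype=tf.float32).clone(cell_state=encoder_state)

decoder = tf.contrib.seq2seq.BasicDecoder(cell=decoder_cell,
                                          helper=helper,
                                          initial_state=decoder_initial_state,
                                          output_layer=projection_layer)
```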

PanXiebit commented 6 years ago

@oahziur Excuse me, but how can I clone the encoder's cell state to the attention decoder cell when eager execution is enabled?

When I use `decoder_initial_state = cell.zero_state(batch_size, dtype).clone(cell_state=encoder_state)`, I get the error: `Tensor.op is meaningless when eager execution is enabled.`