Did you solve this problem?
Yes, by replacing the initial state in this line:

```python
decoder = tf.contrib.seq2seq.BasicDecoder(cell=decoder_cell, helper=helper, initial_state=encoder_state, output_layer=projection_layer)
```

with:

```python
initial_state = decoder_cell.zero_state(dtype=tf.float32, batch_size=batch_size)
```
I don't know if it is the right way to fix it, but it worked for me.
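For context, a minimal sketch of the fixed decoder construction, assuming `decoder_cell`, `helper`, `batch_size`, and `projection_layer` are defined as in the snippets above:

```python
# Sketch: build the decoder from the wrapped cell's zero state
# instead of passing encoder_state directly.
initial_state = decoder_cell.zero_state(dtype=tf.float32, batch_size=batch_size)
decoder = tf.contrib.seq2seq.BasicDecoder(
    cell=decoder_cell,
    helper=helper,
    initial_state=initial_state,
    output_layer=projection_layer)
```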
@ngcabang @jintaojintao
For the attention decoder cell's initial state, you can create a zero state and then copy the encoder's cell state.
Here is an example of how NMT handles this. Alternatively, you can use a zero state and rely entirely on the attention mechanism for decoding, as in ngcabang's solution.
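For reference, the NMT-style approach looks roughly like this (a sketch only; `decoder_cell`, `encoder_state`, `helper`, `batch_size`, and `projection_layer` are assumed to come from your own graph):

```python
# Sketch: start from the AttentionWrapper cell's zero state, copy in the
# encoder's final state, and hand the cloned state to the decoder.
decoder_initial_state = decoder_cell.zero_state(batch_size, tf.float32).clone(
    cell_state=encoder_state)
decoder = tf.contrib.seq2seq.BasicDecoder(
    cell=decoder_cell,
    helper=helper,
    initial_state=decoder_initial_state,
    output_layer=projection_layer)
```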
@oahziur Excuse me, but when eager execution is enabled, how can I clone the encoder's cell state into the attention decoder cell?
When I use `decoder_initial_state = cell.zero_state(batch_size, dtype).clone(cell_state=encoder_state)`
I get the error: `Tensor.op is meaningless when eager execution is enabled.`
Hi, I'm new to TensorFlow and trying to write my own seq2seq model based on the tutorial provided, but unfortunately I am running into the error stated above.
This is the relevant part of the code:
```python
attention_states = tf.transpose(encoder_outputs, [1, 0, 2])
attention_mechanism = tf.contrib.seq2seq.LuongAttention(300, encoder_outputs, memory_sequence_length=source_lengths)
decoder_cell = tf.contrib.seq2seq.AttentionWrapper(decoder_cell, attention_mechanism, attention_layer_size=300)
projection_layer = tf.layers.Dense(dec_vocab_size, use_bias=False)
dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, target)
helper = tf.contrib.seq2seq.TrainingHelper(dec_embed_input, target_lengths, time_major=True)
decoder = tf.contrib.seq2seq.BasicDecoder(cell=decoder_cell, helper=helper, initial_state=encoder_state, output_layer=projection_layer)
max_decoder_length = tf.reduce_max(target_lengths)
decoder_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder=decoder, maximum_iterations=max_decoder_length)
```
The error trace points to this line:
```python
decoder_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder=decoder, maximum_iterations=max_decoder_length)
```