Hi,
Thank you for sharing this implementation! I noticed that `seq2seq.attention_decoder()` takes `encoder_state` as its initial state, but the paper and one of the other implementations (Zhenye-Na/DA-RNN) initialize the decoder with zeros. What is the difference between the two approaches, and why did you implement it this way?
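For concreteness, here is a minimal sketch of the two initializations I am comparing (assuming TensorFlow 1.x, where `tf.contrib.legacy_seq2seq.attention_decoder` lives; the shapes and scope names are made up for illustration):

```python
import tensorflow as tf
from tensorflow.contrib import legacy_seq2seq, rnn

batch_size, hidden_size = 16, 64
attn_length, attn_size = 10, 64

cell = rnn.GRUCell(hidden_size)

# One decoder step's input and the encoder outputs to attend over.
decoder_inputs = [tf.placeholder(tf.float32, [batch_size, hidden_size])]
attention_states = tf.placeholder(
    tf.float32, [batch_size, attn_length, attn_size])

# Final encoder hidden state.
encoder_state = tf.placeholder(tf.float32, [batch_size, hidden_size])

# Option A (this repo, as I read it): warm-start the decoder
# from the encoder's final state.
with tf.variable_scope("enc_init"):
    outputs_a, _ = legacy_seq2seq.attention_decoder(
        decoder_inputs, encoder_state, attention_states, cell)

# Option B (paper / Zhenye-Na/DA-RNN): start the decoder from zeros.
with tf.variable_scope("zero_init"):
    zero_state = cell.zero_state(batch_size, tf.float32)
    outputs_b, _ = legacy_seq2seq.attention_decoder(
        decoder_inputs, zero_state, attention_states, cell)
```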
Thank you again for the implementation.

Regards,
Ming