howardyclo / pytorch-seq2seq-example

Fully batched seq2seq example based on practical-pytorch, and more extra features.

two questions in LuongAttnDecoderRNN #1

Closed. mirror111 closed this issue 6 years ago.

mirror111 commented 6 years ago

Two questions about `LuongAttnDecoderRNN`:

  1. When `t=0`, `decoder_hidden` is the last encoder hidden state, with shape `(num_layers * num_directions, batch_size, hidden_size)`. But in `EncoderRNN`, the last hidden state has shape `(num_layers, batch_size, hidden_size * num_directions)`. Is that right?
  2. There is a line of code `decoder_output, decoder_hidden = decoder.rnn(emb, decoder_hidden)`; I think it should be `decoder_output, decoder_hidden = self.rnn(emb, decoder_hidden)`.
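For context, here is a minimal sketch (not code from this repository) of the shapes a bidirectional `nn.GRU` returns in PyTorch; all sizes are made up for illustration:

```python
import torch
import torch.nn as nn

# Illustrative sizes only.
num_layers, num_directions = 2, 2
seq_len, batch_size, emb_size, hidden_size = 5, 4, 6, 8

encoder_rnn = nn.GRU(emb_size, hidden_size, num_layers, bidirectional=True)
inputs = torch.randn(seq_len, batch_size, emb_size)

outputs, hidden = encoder_rnn(inputs)
# outputs: (seq_len, batch_size, hidden_size * num_directions) -> torch.Size([5, 4, 16])
# hidden:  (num_layers * num_directions, batch_size, hidden_size) -> torch.Size([4, 4, 8])
print(outputs.shape, hidden.shape)
```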

howardyclo commented 6 years ago

@mirror111 Hello! Thanks for opening the issue:

  1. Correct.
  2. Yes, nice catch!
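
To make both points concrete, here is a rough sketch. The bridging function is one common way to reconcile the two shapes from the question; whether this repository does exactly that is an assumption. The `self.rnn` line mirrors the fix confirmed in point 2.

```python
import torch

def bridge_bidirectional_hidden(hidden, num_layers):
    """Point 1 (sketch): map the encoder's hidden state of shape
    (num_layers * num_directions, batch, hidden_size) to
    (num_layers, batch, hidden_size * num_directions) by concatenating
    the forward and backward directions of each layer."""
    num_directions = hidden.size(0) // num_layers
    batch_size, hidden_size = hidden.size(1), hidden.size(2)
    hidden = hidden.view(num_layers, num_directions, batch_size, hidden_size)
    return torch.cat([hidden[:, 0], hidden[:, 1]], dim=-1)

# Point 2: inside LuongAttnDecoderRNN.forward, the decoder should call its own GRU:
#     decoder_output, decoder_hidden = self.rnn(emb, decoder_hidden)
```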
mirror111 commented 6 years ago

Two more questions, about the `evaluate()` function:

  1. I think the calls `encoder_optim.zero_grad()` and `decoder_optim.zero_grad()` are unnecessary; `evaluate()` doesn't take the optimizers as parameters either.
  2. During evaluation, should the decoder's input at each step come from the top word of the decoder's own output, or from the real target?

howardyclo commented 6 years ago

@mirror111 Hello,

  1. You're right, thanks for catching that.
  2. It should come from the decoder's own output, the same as in the translation section (see the sketch below). That said, I think it may also be okay to feed the previous target word to the decoder at the current decoding step, just as in training mode. Thanks again!
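
To illustrate point 2, here is a minimal sketch of a greedy evaluation loop that feeds the decoder's own top prediction back in at each step; the decoder call signature, `sos_token`/`eos_token`, and the batch size of 1 are assumptions, not this repository's exact API. Since there is no backward pass here, the `zero_grad()` calls from point 1 are indeed unnecessary.

```python
import torch

def greedy_decode(decoder, decoder_hidden, encoder_outputs,
                  sos_token, eos_token, max_len=50):
    """Evaluation-time decoding sketch: each step's input is the argmax of the
    previous step's output, instead of the ground-truth word used under
    teacher forcing during training."""
    decoder_input = torch.tensor([[sos_token]])  # assumed shape: (1, batch=1)
    decoded_tokens = []
    with torch.no_grad():  # no gradients, hence no optimizer.zero_grad() needed
        for _ in range(max_len):
            decoder_output, decoder_hidden = decoder(
                decoder_input, decoder_hidden, encoder_outputs)
            top_token = decoder_output.argmax(dim=-1).item()  # greedy choice
            decoded_tokens.append(top_token)
            if top_token == eos_token:
                break
            decoder_input = torch.tensor([[top_token]])  # feed prediction back in
    return decoded_tokens
```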

You're welcome to send me a pull request :-)