lkulowski / LSTM_encoder_decoder

Build an LSTM encoder-decoder in PyTorch to make sequence-to-sequence predictions for time series data
MIT License

Decoder input does not come from the encoder (only hidden states are transferred), why? #4

Closed cmergny closed 3 years ago

cmergny commented 3 years ago

Hi !

Thanks for this great code, it has been very useful to me. I just had a question about your encoder-decoder model in the lstm_encoder_decoder.py file. During training, you initialize the decoder like this:

```python
decoder_input = input_batch[-1, :, :]
decoder_hidden = encoder_hidden
decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden)
```

This means the hidden states are transferred from the encoder to the decoder, but not the input tensor. So if the encoder's reduced, encoded representation is ignored and never passed to the decoder, I don't understand the point of using an encoder-decoder architecture. Correct me if I'm wrong, but I think the model above does what a standard LSTM would do (i.e., transfer hidden states from one cell to the next). From what I understand, for an auto-encoder, the lines above would be replaced by:
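To make the question concrete, here is a minimal sketch (not the repo's exact code, all names and sizes are my own) of the pattern I'm describing: the encoder summarizes the input window into its final (h, c), the decoder is seeded with that state, and its first input is only the last observed time step, with each prediction fed back as the next input:

```python
# Minimal sketch of hidden-state-only transfer between encoder and decoder.
# Shapes and names are illustrative, not taken from lstm_encoder_decoder.py.
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, batch, features, hidden = 10, 4, 1, 16

encoder = nn.LSTM(features, hidden)
decoder = nn.LSTM(features, hidden)
linear = nn.Linear(hidden, features)

x = torch.randn(seq_len, batch, features)   # input window
_, (h, c) = encoder(x)                      # (h, c) summarizes the window

decoder_input = x[-1].unsqueeze(0)          # last observed step, shape (1, batch, features)
outputs = []
for _ in range(5):                          # predict 5 future steps recursively
    out, (h, c) = decoder(decoder_input, (h, c))
    pred = linear(out)
    outputs.append(pred)
    decoder_input = pred                    # feed prediction back as next input
forecast = torch.cat(outputs, dim=0)
print(forecast.shape)                       # torch.Size([5, 4, 1])
```

In this sketch the only path from the encoded window to the forecast is through the (h, c) tuple, which is exactly the behavior I'm asking about.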

```python
decoder_input = encoder_input[-1, :, :]
decoder_hidden = encoder_hidden
decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden)
```

Again, I'm still learning to better understand auto-encoders, so if you could explain why you coded it this way, I'd be happy to hear about it.

Thanks for the code, Cyril