lkulowski / LSTM_encoder_decoder

Build an LSTM encoder-decoder in PyTorch to make sequence-to-sequence predictions for time series data
MIT License

Decoder input does not come from the encoder (only hidden states are transferred), why? #4

Closed cmergny closed 3 years ago

cmergny commented 3 years ago

Hi !

Thanks for this great code, it has been very useful to me. I just had a question about your encoder-decoder model in the lstm_encoder_decoder.py file. During training, you initialize the decoder like this:

```python
decoder_input = input_batch[-1, :, :]
decoder_hidden = encoder_hidden
decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden)
```

This means the hidden states are transferred from the encoder to the decoder, but not the input tensor. So if the encoder's reduced, encoded representation is ignored and never passed to the decoder, I don't understand the point of using an encoder-decoder architecture. Correct me if I'm wrong, but I think the model above does what a standard LSTM would do (i.e., transfer hidden states from one cell to the next). From what I understand, for an auto-encoder, the lines above would be replaced by:
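To make the question concrete, here is a minimal sketch (not the repo's exact code, all names and sizes are my own) of the pattern I'm describing: the encoder summarizes the input window into its final (h, c), the decoder is seeded with that state, and its first input is only the last observed time step, with each prediction fed back as the next input:

```python
# Minimal sketch of hidden-state-only transfer between encoder and decoder.
# Shapes and names are illustrative, not taken from lstm_encoder_decoder.py.
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, batch, features, hidden = 10, 4, 1, 16

encoder = nn.LSTM(features, hidden)
decoder = nn.LSTM(features, hidden)
linear = nn.Linear(hidden, features)

x = torch.randn(seq_len, batch, features)   # input window
_, (h, c) = encoder(x)                      # (h, c) summarizes the window

decoder_input = x[-1].unsqueeze(0)          # last observed step, shape (1, batch, features)
outputs = []
for _ in range(5):                          # predict 5 future steps recursively
    out, (h, c) = decoder(decoder_input, (h, c))
    pred = linear(out)
    outputs.append(pred)
    decoder_input = pred                    # feed prediction back as next input
forecast = torch.cat(outputs, dim=0)
print(forecast.shape)                       # torch.Size([5, 4, 1])
```

In this sketch the only path from the encoded window to the forecast is through the (h, c) tuple, which is exactly the behavior I'm asking about.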

```python
decoder_input = encoder_input[-1, :, :]
decoder_hidden = encoder_hidden
decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden)
```

Again, I'm still learning to better understand auto-encoders, so if you could explain why you coded it this way, I'd be happy to hear about it.

Thanks for the code, Cyril