When I create the input in `ptb.py`, I add the `<sos>` token at the beginning of the input sequence and an `<eos>` token at the end of the target sequence.
As you correctly observed, in `model.py` both the encoder and the decoder receive the same input during training, i.e. something like `<sos> hello world`, while the corresponding target looks like `hello world <eos>`. This way the one-token shift between input and target is guaranteed.
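To illustrate, here is a minimal sketch of that construction (the function name is mine, not the repo's exact code):

```python
# Hypothetical sketch of the pairing described above: the same sentence
# yields a <sos>-prefixed input and an <eos>-suffixed target, so that
# target[i] is always the token following input[i].
def make_pair(tokens, sos="<sos>", eos="<eos>"):
    input_seq = [sos] + tokens    # fed to both encoder and decoder
    target_seq = tokens + [eos]   # what the decoder is trained to predict
    return input_seq, target_seq

inp, tgt = make_pair(["hello", "world"])
assert inp == ["<sos>", "hello", "world"]
assert tgt == ["hello", "world", "<eos>"]
```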
So the difference to the paper is that the encoder additionally gets a `<sos>` as its first input, and the decoder starts with a `<sos>` instead of an `<eos>`, which actually makes more sense to me, since we want to start generating a new sequence.
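Side by side, with a two-word sentence (variable names are mine, for illustration only):

```python
sentence = ["hello", "world"]

# Paper's convention: plain encoder input, decoder started with <eos>.
paper_encoder_input = sentence              # ['hello', 'world']
paper_decoder_input = ["<eos>"] + sentence  # ['<eos>', 'hello', 'world']

# This repo's convention: the same <sos>-prefixed sequence for both.
repo_encoder_input = ["<sos>"] + sentence   # ['<sos>', 'hello', 'world']
repo_decoder_input = ["<sos>"] + sentence   # ['<sos>', 'hello', 'world']

# Either way, the decoder target is the sentence followed by <eos>.
decoder_target = sentence + ["<eos>"]       # ['hello', 'world', '<eos>']
```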
Ok, I understand. Because I ran your code with my own data loader rather than your `ptb.py`, I didn't notice this difference. Sorry!
In your `model.py`, the input of the decoder (i.e. `input_embedding`) is the same as the input of the encoder, which seems incorrect. According to the cited paper:

- the input of the encoder is `['RNNs', 'work']`
- the input of the decoder is `['<EOS>', 'RNNs', 'work']`
- the output of the decoder is `['RNNs', 'work', '<EOS>']`

So I think the input of the decoder should start one token earlier than the input of the encoder...
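For concreteness, the paper's scheme written out as a small sketch (token lists as above; variable names are mine):

```python
tokens = ["RNNs", "work"]

encoder_input  = tokens              # ['RNNs', 'work']
decoder_input  = ["<EOS>"] + tokens  # ['<EOS>', 'RNNs', 'work']
decoder_target = tokens + ["<EOS>"]  # ['RNNs', 'work', '<EOS>']

# decoder_input is decoder_target shifted right by one position,
# so decoder_input[i] is paired with decoder_target[i] at each step.
```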