Also, if I understand the Seq2Seq implementation correctly, the peeky model does not carry the context vector along as an additional, time-constant (repeated) input, as in [2] and as described in the docstring; instead, it adds the context pointwise to the readout (the previous prediction).
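To make the distinction concrete, here is a minimal numpy sketch of the two behaviours I am contrasting. Every name is hypothetical and only illustrates the shapes involved, not seq2seq's actual code:

```python
import numpy as np

d = 4                        # hypothetical hidden/context size
c = np.random.randn(d)       # encoder's final context vector
x_t = np.random.randn(d)     # regular decoder input at some timestep t
y_prev = np.random.randn(d)  # previous prediction (the readout)

# Variant A -- what [2] describes and what the docstring suggests:
# the context is repeated and fed in as an extra, time-constant input
# at every decoder timestep (concatenation is one way to realize this).
decoder_input_a = np.concatenate([x_t, c])

# Variant B -- what the peeky model appears to do instead: the context
# is added pointwise to the readout, so the decoder sees x_t plus a
# (y_prev + c) readout rather than an extra input slot.
readout_b = y_prev + c
decoder_input_b = x_t
```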
@farizrahman4u, can you please confirm or refute this, and elaborate on `readout`, `decode`, and `teacher_force`, especially how they play together? (I did read all of `docs/` carefully, but I still don't get how exactly these work.)
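For reference, this is my current mental model of how the three options interact, written as a pure-Python sketch (all names are mine, not the library's); please correct whatever is wrong:

```python
def decode(decoder_step, context, steps, targets=None,
           readout=True, teacher_force=False):
    """Hypothetical sketch -- not seq2seq's actual code.

    decoder_step(context, y_prev) stands in for the per-timestep decoder
    cell; `steps` plays the role I believe `decode` controls (unrolling
    from a single context vector for a fixed number of output steps).
    """
    y_prev = None  # nothing to read out at t = 0
    outputs = []
    for t in range(steps):
        # readout: feed the previous output back in as an extra input.
        y_t = decoder_step(context, y_prev if readout else None)
        outputs.append(y_t)
        # teacher_force: during training, feed back the ground truth
        # instead of the model's own (possibly wrong) prediction.
        y_prev = targets[t] if teacher_force else y_t
    return outputs

# Toy usage: a "decoder cell" that just averages its two inputs.
step = lambda c, y: c if y is None else 0.5 * (c + y)
print(decode(step, 1.0, steps=3, targets=[0.0, 1.0, 0.0],
             teacher_force=True))   # -> [1.0, 0.5, 1.0]
```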
[1] Sequence to Sequence Learning with Neural Networks (http://arxiv.org/abs/1409.3215)
[2] Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation (http://arxiv.org/abs/1406.1078)
Neither [1] nor [2] uses readout for its encoder, but seq2seq's implementation seems to read the output of the previous timestep (see line 168 of master/seq2seq/models.py). Do I understand this correctly?