tensorflow / nmt

TensorFlow Neural Machine Translation Tutorial
Apache License 2.0

Does NMT invert the encoder sequence? #231

Closed denisb411 closed 6 years ago

denisb411 commented 6 years ago

I read in some articles that reversing the order of the input words is a common trick in seq2seq encoder-decoder networks, but I can't see the nmt model doing this. Is this done anywhere in the code? Does this technique really improve the network's accuracy?
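For concreteness, a minimal sketch of what that reversal looks like in TensorFlow (this is not from the nmt codebase; the token ids and lengths are made-up examples, but `tf.reverse_sequence` is a real op):

```python
import tensorflow as tf

# A batch of 2 source sentences as token ids, padded to length 5.
src = tf.constant([[11, 12, 13, 14, 0],
                   [21, 22, 23,  0, 0]])
src_len = tf.constant([4, 3])  # true (unpadded) lengths

# Reverse only the real tokens of each sentence; the padding stays put.
reversed_src = tf.reverse_sequence(src, src_len, seq_axis=1, batch_axis=0)
print(reversed_src.numpy())
# [[14 13 12 11  0]
#  [23 22 21  0  0]]
```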

An example of where I saw this: Hands-On Machine Learning with Scikit-Learn and TensorFlow by Aurélien Géron, page 407.

[image: screenshot of the cited page]

(please let me know if posting a scan of a page from this book is not allowed)

oahziur commented 6 years ago

We used to have an option to reverse the source sequence; see this change.

We usually use a bi-directional RNN encoder in NMT, so reversing the source sentence doesn't help in this case. I think it can help if you are using a uni-directional RNN encoder for enfr or fren translation.
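To illustrate the point, here is a minimal Keras sketch (not the actual nmt encoder; the vocabulary and layer sizes are arbitrary). The backward half of a bidirectional encoder already consumes the source right-to-left, which is exactly what explicit reversal would buy a uni-directional encoder:

```python
import tensorflow as tf

vocab_size, embed_dim, units = 1000, 64, 128

encoder = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim),
    # Bidirectional runs two copies of the GRU: one left-to-right and
    # one right-to-left (the reversed read), and concatenates them.
    tf.keras.layers.Bidirectional(
        tf.keras.layers.GRU(units, return_sequences=True)),
])

src = tf.random.uniform((2, 7), maxval=vocab_size, dtype=tf.int32)
outputs = encoder(src)
print(outputs.shape)  # (2, 7, 256): forward + backward features
```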

nbro commented 6 years ago

@oahziur Why doesn't it help to reverse the order of the words in the input sequence if you use a bi-directional encoder?

denisb411 commented 6 years ago

Hello @nbro, it's because the bidirectional RNN generates outputs and states for both orders (the normal sequence and the reversed sequence), so it extracts twice as much information as a unidirectional RNN. @oahziur Wouldn't it be worth implementing reversal of the sentence order for unidirectional encoders, exposed as a hparam? Edit: just checked the change. Ignore this.
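Concretely (again a hedged sketch, not nmt code): the bidirectional layer returns a separate final state for each direction, so the decoder can be initialized with information from both reading orders:

```python
import tensorflow as tf

units = 128
bi_gru = tf.keras.layers.Bidirectional(
    tf.keras.layers.GRU(units, return_sequences=True, return_state=True))

x = tf.random.uniform((2, 7, 64))        # (batch, time, features)
outputs, state_fw, state_bw = bi_gru(x)  # one final state per direction
print(outputs.shape)   # (2, 7, 256)
print(state_fw.shape)  # (2, 128) from the left-to-right read
print(state_bw.shape)  # (2, 128) from the right-to-left read
```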

nbro commented 6 years ago

@denisb411 That doesn't mean reversing the order couldn't still help. It would be nice to have some tests.

denisb411 commented 6 years ago

@nbro Technically yes, although the reversal should make no difference, since both orders are already being computed. Surely there's some paper demonstrating the improvement (the trick was introduced in Sutskever et al., 2014, "Sequence to Sequence Learning with Neural Networks").

lmthang commented 6 years ago

Hi, since the introduction of the attention mechanism and the use of bidirectional RNNs, reversing the source sentence has become less important. It also made the code less clean and more complicated, so I removed it. Feel free to add it back in your own code. Closing for now.