OpenNMT / OpenNMT-tf

Neural machine translation and sequence learning using TensorFlow
https://opennmt.net/
MIT License

NMT, What if we do not pass input for decoder? #494

guotong1988 closed this issue 5 years ago

guotong1988 commented 5 years ago

For Transformer-based neural machine translation (NMT), take English-to-Chinese as an example: we pass the English sentence to the encoder, and the decoder input (Chinese) attends to the encoder output to produce the final output.
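
A minimal sketch of that flow in plain TensorFlow (not OpenNMT-tf code; the sizes and token ids below are made up) shows the decoder input attending to the encoder output before projection to target-vocabulary logits:

```python
import tensorflow as tf

# Minimal sketch (not OpenNMT-tf code): the decoder input attends to the
# encoder output through cross-attention, then is projected to
# target-vocabulary logits. Sizes and token ids are hypothetical.
d_model, src_vocab, tgt_vocab = 512, 32000, 32000

src_embed = tf.keras.layers.Embedding(src_vocab, d_model)
tgt_embed = tf.keras.layers.Embedding(tgt_vocab, d_model)
encoder_self_attention = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=64)
decoder_cross_attention = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=64)
output_projection = tf.keras.layers.Dense(tgt_vocab)

source = tf.constant([[5, 42, 7, 11]])   # English token ids (hypothetical)
target_in = tf.constant([[1, 99, 23]])   # shifted Chinese token ids (hypothetical)

encoder_output = encoder_self_attention(src_embed(source), src_embed(source))
decoder_output = decoder_cross_attention(tgt_embed(target_in), encoder_output)
logits = output_projection(decoder_output)  # [batch, target_length, tgt_vocab]
```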

What if we do not pass any input to the decoder and instead treat it as a 'memory' model for translation? Is that possible, and what would happen?

It seems the decoder could be removed, leaving only the encoder.

Thank you!

guillaumekln commented 5 years ago

Could you clarify what objective function you want to optimize?

guotong1988 commented 5 years ago

Thank you! If the decoder is removed, I think the objective function would be something like cross_entropy(target, MLP(encoder_output)).
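
A minimal sketch of that encoder-only objective, assuming a plain TensorFlow encoder output and a made-up target (not OpenNMT-tf code):

```python
import tensorflow as tf

# Sketch of the proposed encoder-only objective: project the encoder output
# with an MLP and compute a per-position cross-entropy against the target.
# Note that this pairs encoder position i with target position i, so it only
# makes sense when source and target have the same length.
d_model, tgt_vocab = 512, 32000
mlp = tf.keras.Sequential([
    tf.keras.layers.Dense(2048, activation="relu"),
    tf.keras.layers.Dense(tgt_vocab),
])

encoder_output = tf.random.normal([1, 4, d_model])  # [batch, src_len, d_model]
target = tf.constant([[7, 130, 9, 2]])              # [batch, src_len], must match src_len

logits = mlp(encoder_output)
loss = tf.reduce_mean(
    tf.keras.losses.sparse_categorical_crossentropy(target, logits, from_logits=True))
```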

guillaumekln commented 5 years ago

That would work for sequence tagging but not for sequence-to-sequence tasks (such as NMT). It assumes that the source and target have the same length and are monotonically aligned, as illustrated below.
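
```python
# Concrete illustration (hypothetical tokenization): an English sentence and
# its Chinese translation usually differ in length, so a per-position loss
# such as cross_entropy(target, MLP(encoder_output)) has no natural pairing.
source = ["the", "cat", "sat", "on", "the", "mat"]  # 6 tokens
target = ["猫", "坐在", "垫子", "上"]                 # 4 tokens
assert len(source) != len(target)
```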

guotong1988 commented 5 years ago

Thank you very much.

guotong1988 commented 5 years ago

Could I do the translation task as text generation? See: https://github.com/salesforce/ctrl/blob/master/generation.py and https://einstein.ai/presentations/ctrl.pdf

guotong1988 commented 5 years ago

See https://datascience.stackexchange.com/questions/60258/nmt-what-if-we-do-not-pass-input-for-decoder

guillaumekln commented 5 years ago

> Could I do the translation task as text generation?

You can train a GPT-2 model, for example. Read the documentation if you are interested.
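
A rough sketch of what decoder-only generation looks like once such a model is trained: the source sentence plus a separator serves as the prompt, and the translation is generated token by token. The `language_model` callable below is a placeholder, not an OpenNMT-tf API:

```python
import tensorflow as tf

# Rough sketch of translation as pure text generation with a decoder-only
# language model (in the spirit of CTRL/GPT-2). `language_model` is a
# placeholder callable returning [batch, length, vocab] logits; it is not
# an OpenNMT-tf API. The prompt is the source sentence plus a separator,
# and the target continuation is generated greedily, token by token.
def generate(language_model, prompt_ids, end_id, max_new_tokens=50):
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = language_model(tf.constant([ids]))       # [1, len(ids), vocab]
        next_id = int(tf.argmax(logits[0, -1], axis=-1))  # greedy decoding
        if next_id == end_id:
            break
        ids.append(next_id)
    return ids[len(prompt_ids):]  # generated target token ids
```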

guotong1988 commented 5 years ago

Thank you