Could you clarify what objective function you want to optimize?
Thank you!
If the decoder is removed, I think the objective function is something like
cross_entropy(target, MLP(encoder_output))
That would work for sequence tagging, but not for sequence-to-sequence tasks such as NMT: it assumes that the source and target have the same length and are monotonically aligned.
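To make that concrete, here is a minimal sketch of the encoder-only objective above (PyTorch; the model, layer sizes, and names are placeholders, not from any particular codebase):

```python
import torch
import torch.nn as nn

vocab_size, hidden, num_labels = 10000, 512, 10

embed = nn.Embedding(vocab_size, hidden)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True),
    num_layers=6,
)
mlp = nn.Linear(hidden, num_labels)  # one prediction per source token

src = torch.randint(0, vocab_size, (2, 7))     # (batch, src_len)
target = torch.randint(0, num_labels, (2, 7))  # same length as src!

logits = mlp(encoder(embed(src)))              # (batch, src_len, num_labels)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, num_labels), target.reshape(-1)
)
# This only works because target[i] is aligned with src[i]:
# same length, monotonic alignment -- exactly what NMT violates.
```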
Thank you very much.
Could I do the translation task as text generation? See: https://github.com/salesforce/ctrl/blob/master/generation.py and https://einstein.ai/presentations/ctrl.pdf
Could I do the translation task as text generation?
You can train a GPT-2 model, for example. Read the documentation if you are interested.
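For reference, a rough sketch of that decoder-only formulation, treating translation as left-to-right generation over the concatenated source/target pair (all token ids, the separator, and the tiny stand-in model are illustrative):

```python
import torch
import torch.nn as nn

SEP = 1                  # hypothetical separator token id
src = [5, 8, 13]         # "English" token ids (illustrative)
tgt = [42, 7, 99, 3]     # "Chinese" token ids (illustrative)

tokens = torch.tensor([src + [SEP] + tgt])       # one training sequence
inputs, labels = tokens[:, :-1], tokens[:, 1:].clone()
labels[:, : len(src)] = -100  # no loss on the source side

# Any causal LM works here; this tiny one is just for shape-checking.
lm = nn.Sequential(nn.Embedding(10000, 64), nn.Linear(64, 10000))
logits = lm(inputs)                              # (1, seq_len, vocab)
loss = nn.functional.cross_entropy(
    logits.transpose(1, 2), labels, ignore_index=-100
)
# At inference time you feed "src + SEP" and sample the continuation,
# which is how CTRL/GPT-2 style models do translation-as-generation.
```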
Thank you
For transformer-based neural machine translation (NMT), take English-to-Chinese as an example: we feed the English sentence to the encoder, the decoder input (Chinese) attends to the encoder output through cross-attention, and that produces the final output.
What if we do not pass any input to the decoder and instead treat the model as a 'memory' model for translation? Is that possible, and what would happen?
It seems the decoder could be removed, leaving only the encoder.
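To make the setup concrete, here is a rough sketch of what I mean, using PyTorch's nn.Transformer (shapes are illustrative; the causal target mask is omitted for brevity):

```python
import torch
import torch.nn as nn

model = nn.Transformer(d_model=512, nhead=8, batch_first=True)
src = torch.rand(2, 9, 512)  # embedded English, (batch, src_len, d_model)
tgt = torch.rand(2, 6, 512)  # embedded Chinese, (batch, tgt_len, d_model)

out = model(src, tgt)        # decoder input attends to encoder output
# Removing the decoder means there is no `tgt` argument and no
# cross-attention: the model can only emit one prediction per source
# position, as in the tagging objective discussed above.
```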
Thank you!