Open raphaelmerx opened 2 years ago
❓ Bidirectional translation with different datasets
What is your question?
I'm using the multilingual translation code to generate a bidirectional `tdt-en,en-tdt` model (see https://github.com/pytorch/fairseq/issues/2078). I'd like to augment this model with backtranslated data `tdt>en`. Now this data is meant to be used for training the `en>tdt` direction only. Is it possible to train a multilingual, bidirectional model, but with different datasets for each direction?
Code
Not relevant
What have you tried?
I trained a bidirectional model using parallel data only, used it for backtranslation, then trained two separate models, one per direction, using parallel + backtranslated data. But I would prefer to keep a single bidirectional model.
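To make the desired setup concrete, here is a minimal sketch in plain Python of the per-direction data layout described above. It is not fairseq API: the function name `build_direction_datasets`, the pair format, and the toy sentences are all hypothetical and only illustrate that `tdt-en` would see parallel data alone, while `en-tdt` would see parallel plus backtranslated data.

```python
from typing import Dict, List, Tuple

# A sentence pair is (source sentence, target sentence).
SentencePair = Tuple[str, str]


def build_direction_datasets(
    parallel_tdt_en: List[SentencePair],
    backtranslated_en_tdt: List[SentencePair],
) -> Dict[str, List[SentencePair]]:
    """Assemble one training set per direction (hypothetical helper).

    parallel_tdt_en: genuine parallel pairs, stored as (tdt, en).
    backtranslated_en_tdt: pairs of (synthetic en, genuine tdt), produced by
        running the tdt>en direction over monolingual tdt text.
    """
    # tdt>en trains on genuine parallel data only.
    tdt_to_en = list(parallel_tdt_en)

    # en>tdt trains on the reversed parallel data plus the backtranslated pairs.
    en_to_tdt = [(en, tdt) for tdt, en in parallel_tdt_en]
    en_to_tdt.extend(backtranslated_en_tdt)

    return {"tdt-en": tdt_to_en, "en-tdt": en_to_tdt}


# Toy placeholder data, for illustration only.
parallel = [("Bondia", "Good morning")]
backtranslated = [("Thank you", "Obrigadu")]
datasets = build_direction_datasets(parallel, backtranslated)
```

The open question is whether the multilingual translation task can consume an asymmetric layout like this within a single bidirectional model, rather than requiring the same corpus for both directions.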
What's your environment?
Not relevant