facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

M2M 100 training/fine tuning #3047


bmtm commented 3 years ago

Hi there,

The M2M-100 README mentions that the 1.2B-parameter model was trained with fairseq's multilingual translation task, without further elaboration. I'm wondering if an example training command with all the parameters for training/fine-tuning could be provided?

I've tried fine-tuning the model via fairseq-train --arch multilingual_transformer --task multilingual_translation

but it seems that the training script expects an ensemble of models rather than a single one, and errors out on an assertion in multilingual_transformer.py.

jaspock commented 3 years ago

See this thread.
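
In short: M2M-100 was trained with the translation_multi_simple_epoch task rather than multilingual_translation, which is why the assertion in multilingual_transformer.py fires. A fine-tuning command along the lines of the recipe in fairseq's examples/multilingual README looks roughly like the sketch below. The data path, checkpoint path, language list, language pairs, and hyperparameters are placeholders to adapt to your setup, and the --arch and model dimensions have to match the checkpoint you load:

    fairseq-train /path/to/data_bin \
        --finetune-from-model /path/to/m2m100_1.2B.pt \
        --task translation_multi_simple_epoch \
        --arch transformer_wmt_en_de_big \
        --encoder-normalize-before --decoder-normalize-before \
        --langs "$(cat langs_list.txt)" \
        --lang-pairs en-fr,fr-en \
        --encoder-langtok src --decoder-langtok \
        --sampling-method temperature --sampling-temperature 1.5 \
        --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
        --optimizer adam --adam-eps 1e-06 --adam-betas '(0.9, 0.98)' \
        --lr-scheduler inverse_sqrt --lr 3e-05 --warmup-updates 2500 --max-update 40000 \
        --dropout 0.3 --attention-dropout 0.1 --weight-decay 0.0 \
        --max-tokens 1024 --update-freq 2 \
        --save-interval-updates 5000 --keep-interval-updates 10 \
        --fp16

Here langs_list.txt is assumed to hold the comma-separated list of the model's languages, in the same order as during pretraining, so the language tokens line up with the fixed 128k dictionary shipped with M2M-100. The arch shown is only illustrative; check the M2M-100 model card for the exact layer counts and embedding/FFN dimensions of the 1.2B checkpoint.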

stale[bot] commented 3 years ago

This issue has been automatically marked as stale. If this issue is still affecting you, please leave a comment (for example, "bump") and we'll keep it open. We're sorry that we haven't been able to prioritize it yet. If you have any additional information, please include it with your comment!