gordicaleksa / Open-NLLB

Effort to open-source NLLB checkpoints.
MIT License
419 stars 37 forks

Understand how to do 4-stage curriculum learning from the paper #12

Open gordicaleksa opened 1 year ago

gordicaleksa commented 1 year ago

It's not quite clear from the codebase and existing documentation how to set up a 4-stage curriculum learning training run.

Understanding this will be super important once we start running on a bigger number of languages.

There is some mention of it in this README.

Write a report and share the learnings on how to do this.

Note: we could always do this manually by stopping and restarting 4 different jobs, but that's error-prone and I suspect Meta had a more streamlined approach. :)
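
To make the goal concrete, here is a rough sketch of what a single-job alternative to the manual restarts might look like: a small lookup that maps the current update count to the set of language pairs that should be in the training mix. Everything here is an assumption for illustration. The stage boundaries, the language pairs, and the helper names `active_pairs` and `stage_index` are hypothetical and do not come from the NLLB codebase or the paper.

```python
from bisect import bisect_right

# Hypothetical 4-stage schedule: (first update of the stage, pairs added then).
# The boundaries and language pairs are placeholders, not values from NLLB.
STAGES = [
    (0, ["eng_Latn-fra_Latn", "eng_Latn-deu_Latn"]),
    (1000, ["eng_Latn-hau_Latn"]),
    (2000, ["eng_Latn-wol_Latn"]),
    (3000, ["eng_Latn-fuv_Latn"]),
]


def stage_index(num_updates, stages=STAGES):
    """Return the 0-based index of the stage that num_updates falls in."""
    starts = [start for start, _ in stages]
    return bisect_right(starts, num_updates) - 1


def active_pairs(num_updates, stages=STAGES):
    """Return every language pair introduced at or before num_updates."""
    pairs = []
    for _, new_pairs in stages[: stage_index(num_updates, stages) + 1]:
        pairs.extend(new_pairs)
    return pairs


if __name__ == "__main__":
    for step in (0, 999, 1000, 2500, 5000):
        print(step, stage_index(step), active_pairs(step))
```

A training loop could call something like `active_pairs` once per update (or per epoch) and rebuild its sampling distribution whenever the stage index changes, which would avoid stopping and restarting 4 separate jobs. Whether the existing codebase already exposes a hook for this is exactly what the report should find out.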