mozilla / translations

The code, training pipeline, and models that power Firefox Translations
https://mozilla.github.io/translations/
Mozilla Public License 2.0

Experiment with one stage training #815

Closed (eu9ene closed this issue 2 months ago)

eu9ene commented 2 months ago

Related to #472

Hypothesis

There is an issue with pre-training on a mix of back-translated data and the original parallel corpus and then fine-tuning on the original corpus only: the model does not continue training and stops too early. We can experiment with training only on the mix, without the fine-tuning stage.
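
To make the proposed setup concrete, here is a minimal sketch of what "one stage on the mix" could look like as a batch sampler. It is not the pipeline's actual code: `mixed_batches`, `original_ratio`, and the toy sentence pairs are all hypothetical names and values chosen for illustration, and the real pipeline configures its data mixing elsewhere.

```python
import random
from typing import Iterator, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def mixed_batches(
    original: List[Pair],
    backtranslated: List[Pair],
    original_ratio: float,  # hypothetical knob: share of examples drawn from the original corpus
    batch_size: int,
    num_batches: int,
    seed: int = 0,
) -> Iterator[List[Pair]]:
    """Yield batches drawn from both corpora for the whole run.

    There is no second stage here: every batch uses the same mix,
    instead of switching to the original corpus only at some point.
    """
    rng = random.Random(seed)
    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            # Pick the corpus first, then a random pair from it.
            corpus = original if rng.random() < original_ratio else backtranslated
            batch.append(rng.choice(corpus))
        yield batch


# Toy data, purely illustrative.
original = [("hello", "hallo"), ("cat", "Katze")]
backtranslated = [("good day", "guten Tag"), ("dog", "Hund")]

batches = list(mixed_batches(original, backtranslated, 0.5, batch_size=4, num_batches=3))
```

The two-stage variant described above would correspond to running this sampler and then a second run with `original_ratio=1.0`. The experiment drops that second run.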

Results

This approach appears to work and the model trains longer, so it was implemented in the pipeline.