mozilla / firefox-translations-training

Training pipelines for Firefox Translations neural machine translation models
https://mozilla.github.io/firefox-translations-training/
Mozilla Public License 2.0

Figure out the behavior of OpusTrainer augmentation on student distillation gap #773

Open gregtatum opened 1 month ago

gregtatum commented 1 month ago

An experiment for #231

We use OpusTrainer to augment the data during both teacher training and student training. There is a gap in student distillation, and it would be good to understand how data augmentation affects student training. It's particularly important to compare results between the augmented and clean flores evaluation sets.

Language pair: TODO

Experiment Splits

| Strategies | flores-devtest | flores-aug-devtest | Training Time |
| --- | --- | --- | --- |
| Run augmentation | | | |
| Disable augmentation | | | |
| Two stage: aug, no aug | | | |
| Two stage: no aug, aug | | | |
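
The four splits above can be written down as stage schedules, which may help when filling in the table later. This is only a sketch for bookkeeping: `Stage`, `Strategy`, `augmentation_gap`, and `results_row` are hypothetical names and are not part of OpusTrainer or the training pipeline, and it assumes a two-stage run simply toggles augmentation per stage.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Stage:
    name: str
    augment: bool  # assumption: augmentation is either fully on or fully off per stage


@dataclass(frozen=True)
class Strategy:
    label: str
    stages: Tuple[Stage, ...]


# One entry per row of the experiment table.
STRATEGIES = (
    Strategy("Run augmentation", (Stage("train", True),)),
    Strategy("Disable augmentation", (Stage("train", False),)),
    Strategy("Two stage: aug, no aug", (Stage("first", True), Stage("second", False))),
    Strategy("Two stage: no aug, aug", (Stage("first", False), Stage("second", True))),
)


def augmentation_gap(clean_score: float, aug_score: float) -> float:
    """Score on flores-devtest minus score on flores-aug-devtest.

    A larger value means the model does worse on augmented input
    relative to clean input.
    """
    return clean_score - aug_score


def results_row(
    strategy: Strategy,
    clean: Optional[float] = None,
    aug: Optional[float] = None,
    hours: Optional[float] = None,
) -> str:
    """Format one markdown table row; cells stay empty until a value is known."""
    cells = [strategy.label] + ["" if v is None else f"{v:.1f}" for v in (clean, aug, hours)]
    return "| " + " | ".join(cells) + " |"


if __name__ == "__main__":
    for strategy in STRATEGIES:
        print(results_row(strategy))
```

Run as a script, this prints the four rows with empty cells. Once the runs finish, passing the measured scores and training time to `results_row` fills them in, and `augmentation_gap` gives a single number per strategy for comparing clean and augmented flores.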

Hypothesis

Augmentation increases student training time. The student may not learn the augmentation behavior unless it is trained on augmented data.

gregtatum commented 1 month ago

We'll do #771 and #772 first.