mozilla / translations

The code, training pipeline, and models that power Firefox Translations
https://mozilla.github.io/translations/
Mozilla Public License 2.0

Improve GPU utilization in student training #783

Open eu9ene opened 3 months ago

eu9ene commented 3 months ago

We noticed that GPU utilization during student training is only around 30%. This is likely because the student model is smaller than the teacher, so each batch gives the GPU less work. We can try to improve it by increasing the batch size.

Screenshot 2024-07-31 at 1 33 51 PM

In comparison, for teacher training:

Screenshot 2024-07-31 at 1 37 12 PM
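
To check whether a larger batch size actually moves the number, utilization could be sampled the same way before and after the change. Below is a minimal sketch, assuming `nvidia-smi` is available on the training machine. The helper names (`parse_utilization`, `sample_utilization`, `average_utilization`) are hypothetical and not part of the pipeline.

```python
import subprocess
import time
from statistics import mean

# Standard nvidia-smi query flags: prints one utilization percentage
# per visible GPU, one value per line, without header or units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(output: str) -> list[int]:
    """Parse nvidia-smi output into one percentage per GPU.

    Lines that are not plain integers (for example "[N/A]") are skipped.
    """
    values = []
    for line in output.splitlines():
        text = line.strip()
        if text.isdigit():
            values.append(int(text))
    return values


def sample_utilization() -> list[int]:
    """Query the current utilization of every visible GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_utilization(result.stdout)


def average_utilization(samples: int = 30, interval_s: float = 1.0) -> float:
    """Average utilization across all GPUs over `samples` polls."""
    readings: list[int] = []
    for _ in range(samples):
        readings.extend(sample_utilization())
        time.sleep(interval_s)
    if not readings:
        raise RuntimeError("nvidia-smi returned no utilization readings")
    return mean(readings)
```

Calling `average_utilization()` while a student training run is in progress, once with the current batch size and once with a larger one, would give two comparable numbers. This is only a measurement aid and says nothing about which batch size the pipeline should use.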