mozilla / translations

The code, training pipeline, and models that power Firefox Translations
https://mozilla.github.io/translations/
Mozilla Public License 2.0

Update Flores-101 data importer to the Flores-200 dataset #622

Closed by gregtatum 2 months ago

gregtatum commented 5 months ago

There are more languages in the Flores-200 dataset, but I'm not sure how pressing it is to update.

eu9ene commented 5 months ago

I think the data for the first 100 languages is the same, so it's not pressing.

eu9ene commented 2 months ago

Let's keep our main evals dataset the same. Also, Flores-200 has the same data for the languages already in Flores-101.