Add comparisons with other open source models

marco-c commented 1 year ago

In our evaluation code, we currently compare our student models against models from other vendors. We should add open source models to the comparison.

For languages we don't support yet, and for languages we do support but where our quality is lower than theirs, we could consider adopting their training configuration, or even using their models as teachers.

libretranslate.com (https://github.com/argosopentech/argos-translate)
Opus-MT (https://github.com/UKPLab/EasyNMT#Opus-MT)
NLLB (https://github.com/thammegowda/nllb-serve)

https://github.com/alvations/lightyear/tree/main/lightyear/translators also has source code for using Opus-MT and NLLB.

NLLB also publishes already-translated test sentences, so we wouldn't even need to run the models: https://github.com/facebookresearch/fairseq/blob/nllb/examples/nllb/evaluation/README.md.
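A rough sketch of how pre-translated outputs could be compared without running any model: score each system's stored translations against the same references and rank the systems. Everything here is an assumption for illustration. The names `char_ngram_f1` and `rank_systems` are hypothetical, the sentences are made up, and the toy character n-gram F1 is only a stand-in for whatever metric our evaluation code actually uses (e.g. BLEU or chrF).

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple


def char_ngram_f1(hypothesis: str, reference: str, n: int = 3) -> float:
    """Toy character n-gram F1 (a stand-in, NOT real BLEU or chrF)."""
    def ngrams(text: str) -> Counter:
        return Counter(text[i:i + n] for i in range(len(text) - n + 1))

    hyp, ref = ngrams(hypothesis), ngrams(reference)
    overlap = sum((hyp & ref).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def rank_systems(
    outputs: Dict[str, List[str]],
    references: List[str],
    metric: Callable[[str, str], float] = char_ngram_f1,
) -> List[Tuple[str, float]]:
    """Average the metric over the test set for each system, best first."""
    if not references:
        raise ValueError("references must not be empty")
    ranking = []
    for name, hypotheses in outputs.items():
        if len(hypotheses) != len(references):
            raise ValueError(f"{name}: expected {len(references)} sentences")
        total = sum(metric(h, r) for h, r in zip(hypotheses, references))
        ranking.append((name, total / len(references)))
    return sorted(ranking, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical data: in practice these would be loaded from files of
    # pre-translated test sentences, one sentence per line.
    references = ["the cat sat on the mat", "hello world"]
    outputs = {
        "student": ["the cat sat on the mat", "hello world"],
        "other": ["a cat is on a mat", "hi world"],
    }
    for name, score in rank_systems(outputs, references):
        print(f"{name}\t{score:.3f}")
```

Passing the metric as a parameter keeps the ranking logic independent of the scoring function, so the toy metric can be swapped for the real one.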

We should start with the Helsinki-NLP ones (Opus-MT), given #117.

We also have https://opus.nlpl.eu/dashboard.

neffscape commented 1 year ago

LibreTranslate works really well in my experience (at least when translating from German to Italian).

marco-c commented 1 year ago

This was done in https://github.com/mozilla/firefox-translations-models/pull/96, https://github.com/mozilla/firefox-translations-models/pull/100, and https://github.com/mozilla/firefox-translations-models/pull/105.

@neffscape it turns out that Argos (i.e. LibreTranslate) is worse than our models on German -> English and on English -> Italian.

mozilla / translations

Add comparisons with other open source models #179