Open marco-c opened 2 weeks ago
It looks like it is already in OPUS: https://github.com/Helsinki-NLP/OPUS-ingest/tree/master/corpus/Mozilla-I10n. Though it seems to be a very old version, from 2021.
I think localization data from software can really help with the translation of short sentences, especially once #888 is fixed :sweat_smile:
EDIT: Although some language pairs in these corpora may need a little cleaning, the Ubuntu and OpenOffice corpora can also be useful for helping Firefox translation models with webpage menus.
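For what the "little bit of cleaning" might look like, here is a rough sketch, not something taken from any existing pipeline. The function name `clean_pairs` and the `max_ratio` threshold are made up for illustration, and the heuristics (drop empty sides, untranslated strings, extreme length ratios, exact duplicates) are assumptions about what typically goes wrong in localization corpora.

```python
def clean_pairs(pairs, max_ratio=3.0):
    """Filter (source, target) string pairs with a few simple heuristics.

    Hypothetical helper: the rules and the default threshold are
    illustrative assumptions, not tuned values.
    """
    seen = set()
    cleaned = []
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        # Drop pairs where either side is empty.
        if not src or not tgt:
            continue
        # Drop strings that were left untranslated (target equals source).
        # Note: this also removes legitimately identical pairs such as "OK".
        if src == tgt:
            continue
        # Drop pairs whose character lengths differ by more than max_ratio.
        ratio = max(len(src), len(tgt)) / min(len(src), len(tgt))
        if ratio > max_ratio:
            continue
        # Drop exact duplicates, keeping the first occurrence.
        if (src, tgt) in seen:
            continue
        seen.add((src, tgt))
        cleaned.append((src, tgt))
    return cleaned


sample = [
    ("Open File", "Abrir archivo"),
    ("Open File", "Abrir archivo"),   # duplicate
    ("", "x"),                        # empty source
    ("Save", "Save"),                 # untranslated
    ("Ok", "Aceptar todos los cambios pendientes"),  # length mismatch
]
print(clean_pairs(sample))
```

For very short menu strings the length-ratio check is probably too strict or too loose depending on the language pair, so `max_ratio` would need to be adjusted per pair.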
https://github.com/mozilla-l10n/mt-training-data
Maybe we could add it to OPUS.