Helsinki-NLP / OPUS-MT-train

Training open neural machine translation models
MIT License
312 stars, 39 forks

Data for Brazilian Portuguese #91

Open reletreby opened 1 year ago

reletreby commented 1 year ago

Where can I find the data used to train: https://huggingface.co/Helsinki-NLP/opus-mt-tc-big-en-pt ?

When I run make data locally and specify pob as the target language, it doesn't do anything. In particular, this location has nothing for pob: https://object.pouta.csc.fi/Tatoeba-Challenge-v2021-08-07/

I would like to know what the data for this particular model looks like, as I would like to fine-tune it.

reletreby commented 1 year ago

@jorgtied would really appreciate your help!

jorgtied commented 1 year ago

There is Brazilian Portuguese in the eng-por package. You need to look at the language label file in the package to see which instance is Brazilian Portuguese.
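A minimal sketch of that filtering step, in case it helps. It assumes the package unpacks into three line-aligned gzip files (source text, target text, and a tab-separated label file) and that the target language label is the last field on each label line. The file names in the comments, the column position, and the label value passed as `wanted` are all assumptions, not something confirmed in this thread, so check them against the actual package contents first.

```python
import gzip
from typing import Iterable, Iterator, Tuple


def filter_by_label(
    src_lines: Iterable[str],
    trg_lines: Iterable[str],
    id_lines: Iterable[str],
    wanted: str,
    label_column: int = -1,
) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) pairs whose label line carries `wanted`.

    The three inputs must be line-aligned. `label_column` selects the
    tab-separated field that holds the target language label; the default
    (last field) is an assumption about the label file layout.
    """
    for src, trg, ident in zip(src_lines, trg_lines, id_lines):
        fields = ident.rstrip("\n").split("\t")
        if fields[label_column] == wanted:
            yield src.rstrip("\n"), trg.rstrip("\n")


def filter_files(
    src_path: str, trg_path: str, id_path: str, wanted: str
) -> Iterator[Tuple[str, str]]:
    """Same as filter_by_label, reading gzip-compressed text files.

    Hypothetical paths would be something like "train.src.gz",
    "train.trg.gz" and "train.id.gz" inside the unpacked eng-por package.
    """
    with gzip.open(src_path, "rt", encoding="utf-8") as src, \
         gzip.open(trg_path, "rt", encoding="utf-8") as trg, \
         gzip.open(id_path, "rt", encoding="utf-8") as ids:
        yield from filter_by_label(src, trg, ids, wanted)
```

Calling `filter_files(...)` with the label that the language label file lists for Brazilian Portuguese should then yield only the sentence pairs for that variant, which can be written out as a fine-tuning set.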