Helsinki-NLP / OPUS-MT-train

Training open neural machine translation models
MIT License

What is tatoeba-langtune? #76

Closed hdeval1 closed 2 years ago

hdeval1 commented 2 years ago

Is this recipe used to tune the tatoeba models that were already trained? I am hoping to provide my own data to it to tune multilingual tatoeba models, but I am not sure where this recipe pulls its data from.

jorgtied commented 2 years ago

langtune is indeed for tuning multilingual models for specific languages. If you have your own data, it may be easier to just prepare it with the subword segmentation and then run marian with appropriate parameters on that fine-tuning data. The langtune recipe may not work out of the box in your case.
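As a rough illustration of that manual route, here is a minimal Python sketch. It is not the langtune recipe. The helper names (`prepare_line`, `marian_finetune_command`), the file paths, and the hyperparameter values are hypothetical. It assumes the multilingual model expects a `>>xxx<<` target-language token in front of the segmented source text, and that `segment` wraps the same subword model the original training used (for example a SentencePiece model shipped with the released model). The Marian options shown (`--model`, `--train-sets`, `--vocabs`, `--learn-rate`, `--after-epochs`) are standard training flags, but the right values depend on your data.

```python
from typing import Callable, List


def prepare_line(text: str,
                 segment: Callable[[str], List[str]],
                 target_lang: str = "") -> str:
    """Segment one sentence and optionally prepend a target-language token.

    `segment` is any callable that maps raw text to subword pieces; it is
    assumed to wrap the subword model the pretrained system was trained with.
    """
    line = " ".join(segment(text.strip()))
    if target_lang:
        # Assumption: multilingual models select the output language
        # through a ">>xxx<<" token at the start of the source line.
        line = f">>{target_lang}<< {line}"
    return line


def marian_finetune_command(model: str,
                            train_src: str,
                            train_tgt: str,
                            vocab: str,
                            learn_rate: float = 0.00002,
                            epochs: int = 2) -> List[str]:
    """Build an argument list for continuing training with Marian.

    Assumption: `model` points at a copy of the pretrained model file, so
    Marian picks up the existing weights instead of starting from scratch.
    """
    return [
        "marian",
        "--model", model,
        "--train-sets", train_src, train_tgt,
        "--vocabs", vocab, vocab,          # shared vocabulary on both sides
        "--learn-rate", str(learn_rate),   # illustrative value only
        "--after-epochs", str(epochs),     # stop after a few passes
    ]


if __name__ == "__main__":
    # Whitespace splitting stands in for the real subword segmenter here.
    print(prepare_line("This is my in-domain sentence.", str.split, "fin"))
    print(" ".join(marian_finetune_command(
        "tuned/model.npz", "train.src.spm", "train.trg.spm", "vocab.yml")))
```

The command is returned as a list so it can be passed to `subprocess.run` once the paths point at real files. Working on a copy of the pretrained model keeps the original weights intact.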

hdeval1 commented 2 years ago

Awesome, thank you for clarifying! Nothing with technology ever works straight out of the box, so I was sure there would be some configuring to work through and just wanted to make sure I was on the right track. I just got the fine-tuning recipes in finetune/ to work, so I figured this would be the next step. Thanks again!