Helsinki-NLP / OPUS-MT-train

Training open neural machine translation models
MIT License

Missing bi-directional models for some language pairs #38

Open pentegroom opened 3 years ago

pentegroom commented 3 years ago

Hi, unless I am mistaken, there are no bi-directional models for some language pairs. Could you please advise me on how to find bi-directional models for the following language pairs:

Punjabi - English
Hindi - English
Telugu - English

Thank you.

jorgtied commented 3 years ago

Could you look at the models listed here: https://github.com/Helsinki-NLP/Tatoeba-Challenge/blob/master/results/tatoeba-models-all.md
Punjabi is still missing, I think. I will see if I can add a model.

Best, Jörg

pentegroom commented 3 years ago

Thank you so much, @jorgtied. I am looking forward to it. By the way, could you please point me to where I can look up the official language codes?

jorgtied commented 3 years ago

For the Tatoeba MT Challenge models I use ISO 639-3 language codes, and ISO 639-5 codes for the language groups in multilingual models. The other OPUS-MT models typically use ISO 639-1 codes, but there may be some exceptions. Wikipedia has all the lists, but there are also official lists in various places.
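
For the languages asked about in this thread, the two code schemes can be lined up in a small lookup. This is only a rough sketch: the table is hand-written and covers just these four languages, the helper names `to_iso639_3` and `pair_ids` are made up for illustration, and the `src-trg` label format is an assumption, so check the model list linked above for the actual names.

```python
# Hand-written lookup covering only the languages discussed in this thread.
# ISO 639-1 (two-letter) -> ISO 639-3 (three-letter).
ISO_639_1_TO_3 = {
    "en": "eng",  # English
    "hi": "hin",  # Hindi
    "pa": "pan",  # Punjabi
    "te": "tel",  # Telugu
}


def to_iso639_3(code: str) -> str:
    """Return the ISO 639-3 code for a known two- or three-letter code."""
    code = code.strip().lower()
    # Already a known three-letter code: pass it through unchanged.
    if code in ISO_639_1_TO_3.values():
        return code
    try:
        return ISO_639_1_TO_3[code]
    except KeyError:
        raise ValueError(f"unknown language code: {code!r}") from None


def pair_ids(src: str, trg: str) -> tuple[str, str]:
    """Return both directions of a pair as hypothetical 'src-trg' labels."""
    s, t = to_iso639_3(src), to_iso639_3(trg)
    return f"{s}-{t}", f"{t}-{s}"


if __name__ == "__main__":
    # Print both directions for each pair requested in the issue.
    for lang in ("pa", "hi", "te"):
        print(pair_ids(lang, "en"))
```

Having both directions as labels makes it easy to search the model list for each one separately, since a model for one direction does not imply that the reverse exists.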