Using pretrained models for translations

Helsinki-NLP / OPUS-MT-train

Training open neural machine translation models

MIT License


Using pretrained models for translations #26

Closed MickHardins closed 3 years ago

MickHardins commented 3 years ago

Hello, I have a question about using the released pretrained models.

I have a Marian server running the OPUS en-fr model (BPE). I'd like to test the model by translating some sentences of my choice.

According to the model documentation, I have to send preprocessed input to the server. The usage of preprocess.sh is:

USAGE preprocess.sh langid bpecodes < input > output

While langid, input, and output are clear to me, I don't understand what I should pass as bpecodes. Can you please point me in the right direction?

Thanks in advance

jorgtied commented 3 years ago

This should be the bpe-model file that comes with the translation model release.

MickHardins commented 3 years ago

This should be the bpe-model file that comes with the translation model release.

There are several files bundled with the model:

decoder.yml
source.bpe
target.bpe
opus..vocab.yml
source.tcmodel

I tried both target.bpe and source.bpe. My understanding is that I should use source.bpe, since I'm encoding sentences in the source language. However, when I send the text to the server through the websocket interface, the server crashes.

jorgtied commented 3 years ago

Did you check the log files of the web server that runs the service? Maybe some code is not available on the server? source.bpe should be the model that you need.
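
For readers following along, the call discussed above can be wrapped roughly as follows. This is only a sketch under assumptions: the function name preprocess_source and the PREPROCESS variable are made up for illustration, and the location of preprocess.sh depends on where you unpacked the release. The argument order (langid, then bpecodes, with text on stdin) is taken from the usage line quoted earlier in the thread.

```shell
# Hypothetical helper, not part of OPUS-MT: runs preprocess.sh with the
# source-side BPE model found in a model directory.
# Usage: echo "some text" | preprocess_source MODEL_DIR LANGID
preprocess_source() {
    model_dir="$1"
    lang="$2"
    # Path to preprocess.sh is an assumption; override it with PREPROCESS.
    script="${PREPROCESS:-./preprocess.sh}"
    bpe="$model_dir/source.bpe"

    # Fail early with a readable message instead of a crash further down.
    if [ ! -f "$bpe" ]; then
        echo "BPE model not found: $bpe" >&2
        return 1
    fi

    # stdin and stdout pass straight through to the script.
    "$script" "$lang" "$bpe"
}
```

With the en-fr model unpacked in ./en-fr, something like `echo "This is a test." | preprocess_source ./en-fr en` should print the preprocessed sentence, provided the tools that preprocess.sh itself calls are installed.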

MickHardins commented 3 years ago

The problem was a wrong decoder.yml file that was pointing to a different bpe file. Thanks for your help!
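
A quick way to catch this kind of mismatch is to list the file names a config mentions and check that each one exists next to it. The snippet below is a minimal sketch, not OPUS-MT tooling: check_model_refs is a hypothetical name, and the list of file extensions it looks for is an assumption based on the files mentioned in this thread.

```shell
# Hypothetical helper: print every model-like file name mentioned in a
# config file and whether that file exists relative to the config.
check_model_refs() {
    config="$1"
    dir=$(dirname "$config")

    # Extract tokens that look like model files (extension list is assumed).
    grep -oE '[A-Za-z0-9_./-]+\.(bpe|spm|npz|yml|tcmodel)' "$config" | sort -u |
    while read -r ref; do
        # Absolute paths are checked as-is, relative ones next to the config.
        case "$ref" in
            /*) path="$ref" ;;
            *)  path="$dir/$ref" ;;
        esac
        if [ -e "$path" ]; then
            echo "ok      $ref"
        else
            echo "MISSING $ref"
        fi
    done
}
```

Running `check_model_refs path/to/decoder.yml` prints one line per referenced file. A MISSING entry, or an ok entry that names a BPE file from a different model, is worth checking before starting the server.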