eole-nlp / eole

Open language modeling toolkit based on PyTorch
https://eole-nlp.github.io/eole
MIT License

Conversion from Eole to CTranslate2 #72

Open ArtanisTheOne opened 4 months ago

ArtanisTheOne commented 4 months ago

A lot of the OpenNMT-py ecosystem encourages the use of CTranslate2 downstream for efficient inference. I would really love to see this added to the new Eole. I am retraining some custom multilingual NMT models and am using Eole to keep everything as up to date as possible.

isanvicente commented 3 weeks ago

Hi! Any news on the CTranslate2 converter? I'd be happy to help if needed, though I would need some guidance. Would the converter for Eole differ much from the OpenNMT-py one?

vince62s commented 3 weeks ago

No, it should be similar, but reading from a safetensors file. Also, if you are willing, we'll need to add the estimator, but I'm not sure whether @minhthuc2502 has already done the layer part.
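For anyone following along, here is a minimal sketch of how the tensor names and shapes in a safetensors file can be listed before writing any layer mapping. It relies only on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header) and the Python standard library. The function names `read_safetensors_header` and `summarize` are made up for this illustration and are not part of Eole or CTranslate2.

```python
import json
import struct


def read_safetensors_header(path):
    """Return {tensor_name: {"dtype", "shape", "data_offsets"}} without loading weights."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the JSON header length as an unsigned little-endian integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize(path):
    """Return a sorted list of (name, dtype, shape) tuples for every tensor in the file."""
    header = read_safetensors_header(path)
    return sorted(
        (name, info["dtype"], tuple(info["shape"])) for name, info in header.items()
    )
```

Printing the result of `summarize(...)` for the checkpoint file gives the exact parameter names the converter has to map, which may help when comparing them against the names the OpenNMT-py converter expects.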

isanvicente commented 2 weeks ago

Hi!

Sorry for taking so long to answer. I've been trying to implement this for the past few days. Code here: https://github.com/isanvicente/CTranslate2/blob/master/python/ctranslate2/converters/eole.py

So far, I've mapped the config and layers of the old ONMT models to the new eole format (starting from _get_model_spec_seq2seq). The conversion runs without errors, but when translating with the converted model all I get is gibberish. My guess is that either I messed up the layer mapping at some point (most probably in the decoder layers) or some config parameter is not parsed properly. Could you take a look and see if you can find what I missed? You surely know better than I do which changes were made from onmt to eole.

Thanks!

vince62s commented 2 weeks ago

Many options have changed. I suggest you first add a print here https://github.com/isanvicente/CTranslate2/blob/master/python/ctranslate2/converters/eole.py#L208 to check the content of checkpoint[opt] and look at the options.

For instance, all of these https://github.com/isanvicente/CTranslate2/blob/master/python/ctranslate2/converters/eole.py#L24-L29 are set differently now.
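One possible way to do that inspection, sketched here under the assumption that the options are either a nested dict or an `argparse.Namespace`: flatten them into dotted keys so that an option that has moved to a different section is easy to find. `flatten_options` and `find_option` are hypothetical helper names and do not exist in Eole or CTranslate2.

```python
def flatten_options(opt, prefix=""):
    """Flatten a nested options object into {dotted.key: value}."""
    if not isinstance(opt, dict):
        # Assumption: a non-dict options object behaves like argparse.Namespace.
        opt = vars(opt)
    flat = {}
    for key, value in opt.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten_options(value, name))
        else:
            flat[name] = value
    return flat


def find_option(opt, leaf_name):
    """Return every dotted path whose last component equals leaf_name."""
    return sorted(k for k in flatten_options(opt) if k.split(".")[-1] == leaf_name)
```

Printing `flatten_options(...)` on the loaded options, one key per line, shows where each setting lives now, and `find_option(..., "some_name")` tells whether a setting the converter reads still exists under that name.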