Helsinki-NLP / OPUS-MT-train

Training open neural machine translation models
MIT License

Model not available on huggingface model page, how do I use it with huggingface. #70

Open mt-empty opened 2 years ago

mt-empty commented 2 years ago

Hi, I want to use or fine-tune this model: https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/en-syr. However, I couldn't find it on the Hugging Face model page. I tried this:

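from transformers import MarianTokenizer  # requires the sentencepiece package to be installed

# This fails for me: I could not find this checkpoint on the Hugging Face model page.
model_checkpoint = "Helsinki-NLP/opus-mt-en-syr"
# return_tensors="pt" belongs on the tokenizer call, not on from_pretrained.
tokenizer = MarianTokenizer.from_pretrained(model_checkpoint)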

Any links or instructions?

Thank you

jorgtied commented 2 years ago

Not all models have been converted to Hugging Face format. Sorry about that. You could try converting it yourself, or you could use the original MarianNMT model. Note that this model might not work very well: the training data is small and it is tested on Bible data only, so it may be heavily overfitted. I don't really know.
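
For readers who want to try the "convert it yourself" route, here is a minimal sketch. It assumes that the `transformers` package ships a Marian conversion script runnable as the module `transformers.models.marian.convert_marian_to_pytorch` with `--src` and `--dest` arguments, and that the unpacked OPUS-MT model directory contains `decoder.yml`, `source.spm`, `target.spm`, an `.npz` weights file and a `*vocab.yml` file. Both are assumptions to verify against your installed version and your download. The helper names and example paths are hypothetical.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

# Assumption: fixed-name files expected in an unpacked OPUS-MT model directory.
REQUIRED_FILES = ("decoder.yml", "source.spm", "target.spm")


def missing_files(model_dir: Path) -> List[str]:
    """Return the expected files or patterns that are absent from model_dir."""
    missing = [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]
    # Weights and vocabulary file names vary, so match them by pattern.
    for pattern in ("*.npz", "*vocab.yml"):
        if not any(model_dir.glob(pattern)):
            missing.append(pattern)
    return missing


def build_convert_command(src: Path, dest: Path) -> List[str]:
    """Build the command line for the (assumed) transformers Marian converter."""
    return [
        sys.executable,
        "-m",
        "transformers.models.marian.convert_marian_to_pytorch",
        "--src",
        str(src),
        "--dest",
        str(dest),
    ]


def convert_opus_mt(src: Path, dest: Path) -> None:
    """Check the source directory, then run the converter as a subprocess."""
    problems = missing_files(src)
    if problems:
        raise FileNotFoundError(f"{src} is missing: {', '.join(problems)}")
    subprocess.run(build_convert_command(src, dest), check=True)


# Hypothetical usage, after downloading and unpacking the en-syr model:
# convert_opus_mt(Path("models/en-syr"), Path("converted/opus-mt-en-syr"))
```

Checking the directory first gives a readable error when the download is incomplete, before the converter starts.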

mt-empty commented 2 years ago

Thanks for the quick reply. I don't mind learning how to convert it, even if it takes weeks. I also hope to improve it by adding some data.

I had a look at the tutorials and the documentation here, but I don't think they cover Hugging Face.

Any general guidelines or resources on how to do it?

zlzhang1617 commented 2 years ago

I would also like to know how to convert the original model to a Hugging Face MarianMTModel myself.

mt-empty commented 2 years ago

I decided to build one from scratch and upload it to Hugging Face. Here is my code: https://github.com/mt-empty/assyrian-translation-model. I hope this helps.
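
Once a model is available in Hugging Face format, either as a local directory or as a Hub repository, loading it and translating a few sentences might look roughly like the sketch below. It assumes the model is a Marian model that `MarianTokenizer` and `MarianMTModel` can load. The function names and the example path are hypothetical. Note that `return_tensors="pt"` is passed when calling the tokenizer, not to `from_pretrained`.

```python
from typing import List


def load_marian(model_path: str):
    """Load a Marian tokenizer and model from a local directory or Hub id."""
    # Imported here so the translate helper can be used without transformers.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_path)
    model = MarianMTModel.from_pretrained(model_path)
    return tokenizer, model


def translate(texts: List[str], tokenizer, model) -> List[str]:
    """Tokenize a batch, generate translations and decode them to strings."""
    batch = tokenizer(texts, return_tensors="pt", padding=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


# Hypothetical usage with a converted or newly trained model:
# tokenizer, model = load_marian("converted/opus-mt-en-syr")
# print(translate(["Good morning."], tokenizer, model))
```

Keeping loading separate from translation means the model is loaded once and reused across batches.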