Closed Latrolage closed 10 months ago
This seems to be the case with all of their models that originate from the Tatoeba Challenge. Only the models included here seem to work with Hugging Face. Up until a month ago I hadn't encountered such problems.
Thanks for reporting, I'll try to check if the tokenizer or the model is wrong.
Hey! You should use model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-tatoeba-en-ja", revision="refs/pr/3"). This is indeed related to an update in the library; a fix was opened on all of the affected models online, like the following: https://huggingface.co/Helsinki-NLP/opus-tatoeba-en-ja/discussions/3
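The fix above can be sketched end to end as follows. This is a minimal example, assuming the fixed weights live on the "refs/pr/3" revision mentioned above; both from_pretrained calls accept the revision keyword, and the tokenizer should be loaded from the same revision so it stays in sync with the model.

```python
# Sketch: loading the patched revision of the model and running one translation.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "Helsinki-NLP/opus-tatoeba-en-ja"
rev = "refs/pr/3"  # the open fix PR on the model repo

tokenizer = AutoTokenizer.from_pretrained(name, revision=rev)
model = AutoModelForSeq2SeqLM.from_pretrained(name, revision=rev)

batch = tokenizer(["My name is Wolfgang and I live in Berlin"], return_tensors="pt")
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```

Without the revision argument, from_pretrained resolves the main branch, which still carries the broken tokenizer files.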
Are the opus-mt-xx-xx models a different issue? I tried just now on both an old and a newer version of transformers and haven't gotten them to work. https://huggingface.co/Helsinki-NLP/opus-mt-jap-en?text=%E7%8A%AC%E3%81%8C%E5%A5%BD%E3%81%8D%E3%81%98%E3%82%83%E3%81%AA%E3%81%84
Note that the language code jap is not Japanese; Japanese is jpn.
That makes more sense. I also tried the opus-2021-02-18 model from https://github.com/Helsinki-NLP/Tatoeba-Challenge/tree/master/models/jpn-eng and it seems my issue there is related to https://github.com/Helsinki-NLP/Tatoeba-Challenge/issues/2#issuecomment-867928524
On the Hugging Face demo (e.g. https://huggingface.co/Helsinki-NLP/opus-tatoeba-en-ja?text=My+name+is+Wolfgang+and+I+live+in+Berlin), the output doesn't seem to make sense.
I ran some models locally too; only opus-mt-ja-en gave an answer that was understandable at all. Any idea what the problem might be? The opus-mt-jap-en model also doesn't produce a comprehensible translation.
The Tatoeba models were converted to PyTorch with:
python -m transformers.models.marian.convert_marian_to_pytorch --src folder --dest folder-pytorch
I'm not sure how the Hugging Face demo loads the model when you just paste in the link, so I don't know how to replicate that locally.