Open songoten28 opened 1 year ago
True, I never fully tested NLLB, as it has a non-commercial (NC) license.
You can try m2m100 with the MultiLingualTranslator class. See the example in the unit tests; that should work. I would also accept a pull request extending that for NLLB.
Thank you, I can confirm m2m100 works on my side:
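    from hf_hub_ctranslate2 import MultiLingualTranslatorCT2fromHfHub
    from transformers import AutoTokenizer

    # text, text2, input_language, output_language and cache_dir are defined elsewhere
    model = MultiLingualTranslatorCT2fromHfHub(
        model_name_or_path="michaelfeil/ct2fast-m2m100_1.2B", device="cpu", compute_type="int8",
        tokenizer=AutoTokenizer.from_pretrained("facebook/m2m100_1.2B", cache_dir=cache_dir)
    )
    model.generate(
        [text, text2],
        src_lang=[input_language, input_language],
        tgt_lang=[output_language, output_language]
    )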
However, some words seem to be translated incorrectly. For example:
firefly => lửa (Vietnamese)
"I like to look at the fireflies" => "Tôi thích nhìn những con chim bắn súng"
"Tôi thích nhìn vào những con đom đóm" => "I like to look at the frogs"
The correct Vietnamese word for firefly is đom đóm, which is why I tried testing with other models.
Update: I also tried NLLB with https://huggingface.co/spaces/Geonmo/nllb-translation-demo, and it has the same issue.
"tôi thích nhìn những con đom đóm về đêm" => "I like to watch the night monkeys"
Thank you for creating this amazing library.
However, I run into a small issue when I follow the instructions at this link: https://huggingface.co/michaelfeil/ct2fast-nllb-200-3.3B
It gives me the following error:
  File "/Users/testpython/venv/lib/python3.11/site-packages/hf_hub_ctranslate2/translate.py", line 126, in __init__
    super().__init__(
  File "/Users/testpython/venv/lib/python3.11/site-packages/hf_hub_ctranslate2/translate.py", line 48, in __init__
    self.model = self.ctranslate_class(
                 ^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Cannot load the target vocabulary from the model directory.
Do I need any extra config?
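One way to narrow down the "Cannot load the target vocabulary" error is to look at which vocabulary files actually ended up in the downloaded model directory. The sketch below is only a diagnostic helper, not part of hf_hub_ctranslate2. It assumes that a CTranslate2 translation model keeps its vocabulary in `shared_vocabulary.*` or in `source_vocabulary.*` plus `target_vocabulary.*`, with a `.txt` or `.json` extension depending on the converter version. The function names `find_vocab_files` and `has_target_vocabulary` are made up for illustration.

```python
from pathlib import Path

# Assumed file names: CTranslate2 translation models store vocabularies as
# shared_vocabulary.* or source_vocabulary.* / target_vocabulary.*,
# with a .txt or .json extension depending on the converter version.
VOCAB_STEMS = ("shared_vocabulary", "source_vocabulary", "target_vocabulary")
VOCAB_SUFFIXES = (".json", ".txt")


def find_vocab_files(model_dir):
    """Return the sorted names of vocabulary files found directly in model_dir."""
    root = Path(model_dir)
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_file() and p.stem in VOCAB_STEMS and p.suffix in VOCAB_SUFFIXES
    )


def has_target_vocabulary(model_dir):
    """Return True if a shared or target vocabulary file exists in model_dir."""
    return any(
        name.startswith(("shared_vocabulary", "target_vocabulary"))
        for name in find_vocab_files(model_dir)
    )
```

Calling `find_vocab_files(...)` on the local snapshot directory of the model shows what was downloaded. If the list is empty, the vocabulary file was probably not fetched. If a file is present but loading still fails, a mismatch between the vocabulary file format and the installed ctranslate2 version is one possible cause worth checking.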