Open koren-v opened 3 years ago
You need to preprocess the string first, using the provided SentencePiece model of the source language. Our models don't support internal SentencePiece segmentation, so this has to be done before piping input to the decoder.
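A minimal sketch of that preprocessing step, in case it helps. It assumes the `sentencepiece` Python package is installed and that the downloaded model package contains a source-side model file; the name `source.spm` below is an assumption, so check your package. `segment_lines` and `load_spm_encoder` are hypothetical helper names, not part of Marian or OPUS-MT.

```python
from typing import Callable, Iterable, List


def segment_lines(lines: Iterable[str],
                  encode: Callable[[str], List[str]]) -> List[str]:
    """Turn raw sentences into space-separated subword pieces, one per line."""
    return [" ".join(encode(line.strip())) for line in lines]


def load_spm_encoder(model_path: str) -> Callable[[str], List[str]]:
    """Wrap a SentencePiece model as a text -> pieces function."""
    # Third-party dependency, imported lazily: pip install sentencepiece
    import sentencepiece as spm

    sp = spm.SentencePieceProcessor(model_file=model_path)
    return lambda text: sp.encode(text, out_type=str)


# Usage (the path is an assumption; use the file shipped with your model):
# encode = load_spm_encoder("source.spm")
# for line in segment_lines(["Привет, мир!"], encode):
#     print(line)  # write these lines to the file or pipe you feed the decoder
```

The segmented lines, not the raw text, are what should go to the decoder.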
Hi, I've loaded the models from the following directory: https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models/ru-en When I try some of them, I often get translations like: "▁Y O O O O O O O O O O O O O O O O O O O O" or "I 'm b@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@ m@@" Then I tried loading the model from the Hugging Face site, but I get pretty similar outputs, while the same model run through the Hugging Face framework gives good translations. Probably something is wrong with the config. I launch it using the Marian library. For example:
So what could be wrong?
Do I perhaps need to do some preprocessing and postprocessing myself?
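On the postprocessing side, the markers in the outputs quoted above ("▁" and "@@") suggest the decoder returns subword pieces that still have to be merged back into words. A rough sketch in plain Python, assuming the "▁" marker means a SentencePiece model and "@@" means a BPE model; the function names are hypothetical, and any detokenization a particular model may additionally need is not covered here.

```python
import re


def undo_sentencepiece(line: str) -> str:
    """Join SentencePiece pieces: drop piece separators, turn '▁' into spaces."""
    return line.replace(" ", "").replace("▁", " ").strip()


def undo_bpe(line: str) -> str:
    """Join BPE pieces: remove the '@@' continuation marker and the space after it."""
    return re.sub(r"@@( |$)", "", line)


print(undo_sentencepiece("▁Hello ▁wor ld !"))
print(undo_bpe("I 'm b@@ ored"))
```

This only merges pieces; it will not repair a degenerate output like the repeated tokens above, which look like the result of feeding unsegmented text to the decoder.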