alexwilson1 closed this issue 3 years ago
Sadly, this can happen; it is called hallucination, and there is no easy fix. I found that it happens more often with the Opus-MT model than with the other models, especially when the input is noisy, i.e. not a clean, well-formed sentence.
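Although there is no easy fix, degenerate outputs of this kind can at least be flagged after translation. The snippet below is a minimal sketch, not part of the library: `looks_degenerate` is a hypothetical helper, and the `min_tokens` and `max_unique_ratio` values are assumptions that would need tuning on real data. It flags a translation when only a small fraction of its words are distinct.

```python
def looks_degenerate(text, min_tokens=6, max_unique_ratio=0.4):
    """Heuristic check for repetitive output (hypothetical helper).

    Returns True when the share of distinct words falls below
    `max_unique_ratio`. Both thresholds are assumed values, not tuned ones.
    """
    tokens = text.lower().split()
    if len(tokens) < min_tokens:
        # Too short to judge; short outputs repeat words legitimately.
        return False
    unique_ratio = len(set(tokens)) / len(tokens)
    return unique_ratio < max_unique_ratio


if __name__ == "__main__":
    print(looks_degenerate("I'm sorry I'm sorry I'm sorry I'm sorry I'm sorry I'm sorry"))
    print(looks_degenerate("The weather in Jakarta is hot and humid today"))
```

Flagged sentences could then be retried with a different model or set aside for manual review. The check splits on whitespace, so it would not work as written for languages that do not separate words with spaces.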
Got it, thank you for letting me know! You are correct: it occurs most frequently when the input is malformed (e.g. a mix of languages or non-standard punctuation).
Perhaps sentence-level language detection could help resolve this in some cases, as could normalising punctuation (although that will be difficult to do for all languages). I'll try a couple of things out, but will close the issue for now. Thanks!
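As a rough illustration of the punctuation idea, here is a minimal sketch using only the Python standard library. `normalise_punctuation` and `_PUNCT_MAP` are hypothetical names, and the character mapping is an assumption covering only a few common cases. NFKC normalisation can also change characters that matter in some languages, so this is a starting point rather than a general solution.

```python
import re
import unicodedata

# Assumed mapping for a few common typographic characters; not exhaustive.
_PUNCT_MAP = str.maketrans({
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
})


def normalise_punctuation(text):
    """Normalise punctuation before translation (hypothetical helper)."""
    # NFKC folds compatibility forms, e.g. full-width "!" or the ellipsis character.
    text = unicodedata.normalize("NFKC", text)
    text = text.translate(_PUNCT_MAP)
    # Collapse runs of repeated "!" or "?" into a single mark.
    text = re.sub(r"([!?])\1+", r"\1", text)
    # Collapse any run of whitespace into one space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


if __name__ == "__main__":
    print(normalise_punctuation("\u201cApa kabar\u201d???  \u2026"))
```

The cleaned strings would then be passed to the translation call in place of the raw input.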
Hey team,
Thank you again for the great library.
Today we translated 'id' (Indonesian) sentences, and quite a few of them came out as variants of "I'm sorry I'm sorry I'm sorry I'm sorry I'm sorry I'm sorry" even though the source text did not mention 'sorry'.
Any idea why this could be, please? Could it be because I'm not performing sentence splitting prior to translation whilst using the 'translate_sentences' function?
Thanks!