AI4Bharat / IndicTrans2

Translation models for 22 scheduled languages of India
https://ai4bharat.iitm.ac.in/indic-trans2
MIT License

Issues with Inconsistent Translations and Extra Punctuation #92

Closed Sab8605 closed 3 weeks ago

Sab8605 commented 1 month ago

Hi,

I am using the model for live translation of video, implemented as described in the README. While translating, I have observed the following issues:

1. Inconsistent Translations for Similar Words:

   Case 1: "support for promotion of MSMEs." -> Translation: "एमएसएमई को बढ़ावा देने के लिए समर्थन।"
   Case 2: "Provide special attention to MSMEs." -> Translation: "एम. एस. एम. ई. पर विशेष ध्यान देना।"

2. Extra Full Stops in Translations for Certain Languages:

   In languages like Tamil, Telugu, Bengali, and Marathi, the model often adds an extra full stop at the end of a sentence, even when the sentence is not complete. Examples:

   Hindi: "हमारे केरला में," -> Bengali: "আমাদের কেরলে।"
   Hindi: "इन छात्रों को," -> Bengali: "এই ছাত্ররা।"
   Hindi: "अब ये लोग," -> Bengali: "এখন, এই লোকেরা।"
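One possible downstream workaround for the extra full stop (it does not change the model itself) is to strip a trailing sentence-final mark from the translation whenever the source segment does not end in one. A minimal sketch, with an illustrative (not exhaustive) punctuation set:

```python
# Hedged post-processing workaround: remove a spurious trailing full stop / danda
# from the translation when the source segment is clearly not a finished sentence.
SENTENCE_FINAL = ("।", "॥", ".", "?", "!")

def strip_spurious_full_stop(source: str, translation: str) -> str:
    src, tgt = source.rstrip(), translation.rstrip()
    # Only strip when the source does not end in sentence-final punctuation.
    if src and not src.endswith(SENTENCE_FINAL) and tgt.endswith(("।", ".")):
        tgt = tgt[:-1].rstrip()
    return tgt

print(strip_spurious_full_stop("हमारे केरला में,", "আমাদের কেরলে।"))  # -> "আমাদের কেরলে"
```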

Sab8605 commented 1 month ago

Is it possible to generate gender-specific responses from the model? In some instances, the speaker in the video is female, but the model generates translations as if a male were speaking.

Example:

Female Speaker: "I present the budget for 2024-25."
Expected Translation: "मैं 2024-25 के लिए बजट प्रस्तुत करती हूँ।"
Actual Translation: "मैं 2024-25 के लिए बजट प्रस्तुत करता हूँ।"

I have checked the top 5 responses from the model: in some cases one of them is the female form, but in others it is not, and the position of the female-form response among the candidates is inconsistent.
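For reference, the top-5 candidates mentioned above can be obtained with beam search by asking the model to return multiple sequences. A minimal sketch with Hugging Face transformers; the checkpoint name and the language-tag preprocessing of the input are assumptions based on the README, not something verified here:

```python
# Minimal sketch (assumed checkpoint name and preprocessing): keep the 5 best beam
# candidates instead of only the top one, so the feminine form can be inspected.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "ai4bharat/indictrans2-en-indic-1B"  # assumed HF checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name, trust_remote_code=True)

# Assumes the sentence has already been preprocessed with source/target language tags
# (e.g. via the IndicProcessor step shown in the README).
text = "eng_Latn hin_Deva I present the budget for 2024-25."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    generated = model.generate(
        **inputs,
        num_beams=5,
        num_return_sequences=5,  # return all 5 beam candidates
        max_length=256,
    )

for i, candidate in enumerate(tokenizer.batch_decode(generated, skip_special_tokens=True)):
    print(i, candidate)
```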

prajdabre commented 1 month ago

Two things:

  1. Different inputs will lead to different outputs. These things are not always controllable; it depends on the patterns the model has seen.
  2. The model is not aware of gender, race, etc. You will have to modify the decoding algorithm or reorder the outputs (a rough sketch of reordering is given below).
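As a concrete illustration of the "reorder the outputs" option, one could rerank the n-best candidates with a simple surface heuristic that prefers feminine verb forms when the speaker is known to be female. This is a hedged sketch, not a feature of the model; the marker list is illustrative only:

```python
# Rough sketch of reordering the n-best outputs for a known-female speaker.
# FEMININE_MARKERS is an illustrative, non-exhaustive list of Hindi feminine verb forms.
FEMININE_MARKERS = ("करती हूँ", "रही हूँ", "चाहती हूँ", "करती है")

def rerank_for_female_speaker(candidates):
    """Move candidates containing a feminine marker to the front, keeping the
    model's original (score-based) order within each group."""
    feminine = [c for c in candidates if any(m in c for m in FEMININE_MARKERS)]
    other = [c for c in candidates if not any(m in c for m in FEMININE_MARKERS)]
    return feminine + other

# candidates would normally come from model.generate(..., num_return_sequences=5)
candidates = [
    "मैं 2024-25 के लिए बजट प्रस्तुत करता हूँ।",
    "मैं 2024-25 के लिए बजट प्रस्तुत करती हूँ।",
]
print(rerank_for_female_speaker(candidates)[0])  # feminine form is now first
```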
Sab8605 commented 2 weeks ago

Thank you for your insights. I appreciate your explanation about how different inputs can lead to varied outputs depending on the patterns the model has seen, and how the model isn't inherently aware of factors like gender or race.

Could you please elaborate a bit more on the decoding algorithm and how it can be modified or used to reorder outputs? I'm keen to understand this aspect better.