AI4Bharat / IndicTrans2

Translation models for 22 scheduled languages of India
https://ai4bharat.iitm.ac.in/indic-trans2
MIT License
214 stars 59 forks

Issues for the Urdu and Kashmiri #68

Closed Sab8605 closed 4 months ago

Sab8605 commented 4 months ago

The model gives an error or an empty string when the source language uses the Arabic script (i.e. Urdu or Kashmiri), whatever the target language.

PranjalChitale commented 4 months ago

Can you please provide basic details, such as which model you are using (base or distilled) and whether you are using the Fairseq or HF models? Also, please check that you have installed all the requisite dependencies.

We've previously tested the models on languages written in the Perso-Arabic script and did not encounter the issue described above.

Kindly verify if all the dependencies including UrduHack have been properly installed.
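One quick way to run that check is sketched below. It only tests whether a module can be found by the current Python interpreter; it assumes the import name is `urduhack` (matching the package name), and the helper `missing_modules` is a hypothetical name used for illustration, not part of IndicTrans2.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment.

    Only top-level names are supported here; dotted names would require the
    parent package to be importable first.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: UrduHack is imported as "urduhack".
    missing = missing_modules(["urduhack"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

Run it in the same environment (or container) that serves the model, since the dependency may be present on one machine and absent on another.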

Sab8605 commented 4 months ago

I am using the distilled model. It works fine for English to Indic, but for Indic to English (Kashmiri and Urdu) it returns an empty line when the input is a single word. When I run Indic to Indic directly on the Triton server, the problem above causes trouble there too, which is why the English-to-Indic model returns the error AssertionError: blank lines are not allowed. I also tested on your demo site, and it gives an empty line for a single word for Urdu and Kashmiri to English.
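Until the empty output itself is explained, the assertion can be avoided on the caller's side by not forwarding blank intermediate lines. The sketch below is only an illustration and assumes the Indic-to-Indic path is a two-stage pivot through English, which this thread suggests but does not confirm. `pivot_translate`, `indic_to_en` and `en_to_indic` are hypothetical names; the two callables stand in for whatever actually invokes the models.

```python
from typing import Callable, List

# A translator takes a batch of sentences and returns one output per input.
Translator = Callable[[List[str]], List[str]]


def pivot_translate(
    sentences: List[str],
    indic_to_en: Translator,
    en_to_indic: Translator,
) -> List[str]:
    """Translate via English, skipping blank intermediate lines.

    Positions whose English intermediate is blank are returned as "" instead
    of being sent to the second stage, so the caller can log or retry them.
    """
    english = indic_to_en(sentences)

    # Indices of intermediates that contain something other than whitespace.
    keep = [i for i, line in enumerate(english) if line.strip()]

    results = [""] * len(sentences)
    if keep:
        translated = en_to_indic([english[i] for i in keep])
        for i, out in zip(keep, translated):
            results[i] = out
    return results
```

This works around the crash only. Any position that comes back as "" still marks an input for which the first stage produced nothing.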