AI4Bharat / IndicLID

Language Identification for Indian languages

Incorrect prediction/detection of the given input #4

Open shrimad-mishra-cognoai opened 7 months ago

shrimad-mishra-cognoai commented 7 months ago

Hi, I have been exploring the model but ran into some issues:

If you provide "Hi kya hal hai aapka", it is sometimes predicted as pan_lat, snd_lat, or eng_lat.

If you provide "Hi how are you", it is sometimes predicted as o, snd_lat, eng_lat, ori_Latn, and other.

Can you please provide a solution for this? I think the model is a bit biased towards the Latin languages.
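
To make the "sometimes predicts" behaviour easier to report, a small harness along these lines could count how often each label comes back for the same input. This is only a sketch: `tally_predictions` and `is_stable` are hypothetical helper names, and `predict` stands for whatever wrapper you write around the model (it is assumed to take one string and return one label string, and is not the IndicLID API itself).

```python
from collections import Counter
from typing import Callable
import itertools


def tally_predictions(predict: Callable[[str], str], text: str, runs: int = 20) -> Counter:
    """Call `predict` on the same text `runs` times and count each returned label."""
    return Counter(predict(text) for _ in range(runs))


def is_stable(counts: Counter) -> bool:
    """Return True when every run produced the same label."""
    return len(counts) == 1


if __name__ == "__main__":
    # Stand-in predictor that cycles through the labels mentioned in this issue.
    # Replace it with a wrapper around the real model (assumption: str -> str).
    labels = itertools.cycle(["pan_lat", "snd_lat", "eng_lat"])

    def fake_predict(_text: str) -> str:
        return next(labels)

    counts = tally_predictions(fake_predict, "Hi kya hal hai aapka", runs=6)
    print(counts, "stable:", is_stable(counts))
```

If the counts show more than one label for an identical input, the variation comes from the inference path rather than from the text, which is useful to state in the report.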

chandan-wiai commented 6 months ago

Yes, I observed the same when I ran the example Colab notebook with different examples of Hindi written in Latin script. @yashmadhani97 Are we doing something incorrectly?