avian2 / unidecode

ASCII transliterations of Unicode text - GitHub mirror
https://pypi.python.org/pypi/Unidecode
GNU General Public License v2.0

Preserve isn't preserving the original words in multilingual content (UTF-8 file). #100

Open mxav1111 opened 2 months ago

mxav1111 commented 2 months ago

Hi there. It seems that when a document contains both English words (with accented characters) and Hindi words (written in actual Hindi script), using the preserve parameter mangles the Hindi words: they are converted into English (Latin) letters. What I expected is that only the accented English words would be converted and everything else would be left as is (preserved). The accents are removed from the English words correctly, but the words written in Hindi are changed as well.

I hope my understanding of the preserve parameter is correct. If not, I apologize; if you have any suggestions for handling this situation, please share them.

Thanks for your help.
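
For context: as far as the Unidecode documentation describes it, `errors='preserve'` only keeps characters that have no entry in the replacement tables, and Hindi script does have entries, so it would still be transliterated. One possible workaround is to strip accents from Latin letters only and pass everything else through untouched. The following is a minimal sketch of that idea using only Python's standard `unicodedata` module. It is not part of Unidecode's API, and the function name `strip_latin_accents` is hypothetical.

```python
import unicodedata


def strip_latin_accents(text: str) -> str:
    """Remove accents from Latin letters; leave all other scripts unchanged.

    Hypothetical helper, not part of Unidecode. Latin letters without a
    canonical decomposition (for example 'ø') are left as they are.
    """
    out = []
    prev_base_is_latin = False
    for ch in text:
        if unicodedata.combining(ch):
            # A combining mark: drop it only when it follows a Latin letter
            # (covers input that is already decomposed, e.g. 'e' + U+0301).
            if not prev_base_is_latin:
                out.append(ch)
            continue
        if unicodedata.name(ch, "").startswith("LATIN"):
            # Decompose this single character and keep only the base letter(s).
            decomposed = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
            prev_base_is_latin = True
        else:
            # Hindi and every other non-Latin character passes through as is.
            out.append(ch)
            prev_base_is_latin = False
    return "".join(out)


if __name__ == "__main__":
    print(strip_latin_accents("café résumé हिंदी नमस्ते"))
```

If full transliteration of the Latin part is needed (for example turning `ø` into `o`), the same per-character check could decide which characters are handed to `unidecode()` and which are kept; that variant is left out here because it depends on the installed Unidecode version.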