avian2 / unidecode

ASCII transliterations of Unicode text - GitHub mirror
https://pypi.python.org/pypi/Unidecode
GNU General Public License v2.0

Support for Latin Extended-D ? #49

Closed PonteIneptique closed 5 years ago

PonteIneptique commented 5 years ago

Hi there, I find myself in a particular situation where I deal with manuscript transcriptions. Of course, some characters are not valid Unicode, but from my understanding some of them seem to be:

   unidecode.x0f1
   unidecode.x0e6
   unidecode.x0f0
ꝛ   unidecode.x0a7
   unidecode.x0f0
   unidecode.x0f0
   unidecode.x0e8
   unidecode.x0f0
   unidecode.x0e5
   unidecode.x0e4
ꝓ   unidecode.x0a7
   unidecode.x0f1
   unidecode.x0f0
   unidecode.x0f7
   unidecode.x0e6
   unidecode.x0e7
   unidecode.x0e5
   unidecode.x0ee
ꝰ   unidecode.x0a7
ꝙ   unidecode.x0a7
ꝯ   unidecode.x0a7
   unidecode.x0f1
   unidecode.x0f1
   unidecode.x0f1
   unidecode.x0f1
ꝑ   unidecode.x0a7
   unidecode.x0e8
   unidecode.x0f1

Should I open a pull request for x0a7 and x0f7?

avian2 commented 5 years ago

Sorry, I don't understand what you are trying to say here.

PonteIneptique commented 5 years ago

Sorry. Apparently, the x0a7 and x0f7 blocks are not present in Unidecode, so my question is: would Unidecode be open to supporting them?

avian2 commented 5 years ago

If you have any transliterations to contribute for characters in the U+A7xx block, please send a pull request.

The U+F7xx block is in a private use area. I don't think Unidecode can make any assumptions about what those characters represent, so it can't include transliterations for them.
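
The distinction drawn above can be checked with the standard library alone. The following is a minimal sketch, not Unidecode's own code: it assumes that the `unidecode.xNNN` names listed in the issue are formed from the code point shifted right by 8 bits, and it treats a character as private use when its Unicode general category is `Co`. The helper names `table_module_name` and `is_private_use` are hypothetical and exist only for this illustration.

```python
import unicodedata


def table_module_name(char):
    # Assumption: the table name is "unidecode.x" followed by the
    # code point's high bits (code point >> 8) as three hex digits.
    return "unidecode.x%03x" % (ord(char) >> 8)


def is_private_use(char):
    # "Co" is the Unicode general category for private use characters.
    return unicodedata.category(char) == "Co"


if __name__ == "__main__":
    # U+A75B and U+A753 are Latin Extended-D letters quoted in the issue;
    # U+F700 stands in for any character from the U+F7xx range.
    for ch in ("\ua75b", "\ua753", "\uf700"):
        print("U+%04X" % ord(ch), table_module_name(ch), is_private_use(ch))
```

Under these assumptions, characters that report `True` for `is_private_use` have no standard meaning that a transliteration table could rely on, while the Latin Extended-D characters are ordinary assigned letters and could be covered by a contributed table.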