avian2 / unidecode

ASCII transliterations of Unicode text - GitHub mirror
https://pypi.python.org/pypi/Unidecode
GNU General Public License v2.0

Improve Hebrew conversion #24

Closed alonbl closed 6 years ago

alonbl commented 6 years ago

Convert two-letter transliterations to capital letters, because the repeated letters make the output very hard to interpret. For example:

kh: is it k and h, or kh? tskh: is it t, s, kh; or ts, k, h; or ts, kh; etc.?
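The ambiguity can be reproduced with a minimal sketch. The two tables below are hypothetical and exist only for illustration; they are not Unidecode's actual data, and `transliterate` is a made-up helper.

```python
# Hypothetical per-character tables, for illustration only.
# U+05D7 = Het, U+05DB = Kaf, U+05D4 = He.
LOWER = {"\u05d7": "kh", "\u05db": "k", "\u05d4": "h"}
UPPER = {"\u05d7": "KH", "\u05db": "k", "\u05d4": "h"}


def transliterate(text, table):
    """Replace each character using the table; unknown characters are dropped."""
    return "".join(table.get(ch, "") for ch in text)


het = "\u05d7"           # one letter
kaf_he = "\u05db\u05d4"  # two letters

# With lowercase digraphs, two different inputs collapse to the same output.
print(transliterate(het, LOWER), transliterate(kaf_he, LOWER))

# With capitalized digraphs, the outputs stay distinguishable.
print(transliterate(het, UPPER), transliterate(kaf_he, UPPER))
```

With the capitalized table, a reader can tell that "KH" came from a single letter while "kh" came from two.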

0xa2 Hebrew Bible punctuation mark; should be ignored.

0xc6 Opposite Nun, same as 'n'.

0xba Holam Haser, vowel rendered as 'o'.

0xbf Makaf Raphe, same as Makaf (0xbe).

0xc5 Hebrew Bible punctuation mark; should be ignored.

0xc7 Kamatz Katan, vowel rendered as 'o'.

0xd0 Aleph, sounds like 'AHA'; it must be present to keep the string readable. To distinguish it from '`', use capital 'A', which also distinguishes it from the 'a' vowel.

0xf5 Split Vav, same as 'v'.

0xf6 Opposite Nun, same as 'n'.

0xf7 Small Kuf, same as 'q'.
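The list above can be read as a set of overrides keyed by the low byte of each code point in the U+05xx range. The following is a rough sketch of that idea, not the actual patch: the names `BASE`, `OVERRIDES`, `TABLE`, and `transliterate` are hypothetical, and the "-" value for Makaf (0xbe) is an assumption, since the description does not state it.

```python
# Assumed pre-existing entry for Makaf (0xbe); the "-" value is a guess.
BASE = {0xBE: "-"}

# Overrides taken from the list above, keyed by the low byte of U+05xx.
OVERRIDES = {
    0xA2: "",          # punctuation mark, ignored
    0xC5: "",          # punctuation mark, ignored
    0xC6: "n",         # Nun variant
    0xF6: "n",         # Nun variant
    0xBA: "o",         # vowel
    0xC7: "o",         # vowel
    0xBF: BASE[0xBE],  # same as Makaf (0xbe)
    0xD0: "A",         # Aleph
    0xF5: "v",         # Vav variant
    0xF7: "q",         # Kuf variant
}

TABLE = {**BASE, **OVERRIDES}


def transliterate(text):
    """Map U+05xx characters through TABLE; pass everything else through."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp >> 8 == 0x05:
            # Code points missing from this partial table are dropped.
            out.append(TABLE.get(cp & 0xFF, ""))
        else:
            out.append(ch)
    return "".join(out)


print(transliterate("\u05d0\u05ba"))  # Aleph followed by Holam Haser
```

Because the table here is partial, any U+05xx character not listed is silently dropped; a complete table would cover all 256 positions.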

Signed-off-by: Alon Bar-Lev <alon.barlev@gmail.com>

avian2 commented 6 years ago

Merged. Thanks.