dmort27 / epitran

A tool for transcribing orthographic text as IPA (International Phonetic Alphabet)
MIT License
630 stars, 121 forks

Extraneous unicode character in Kazakh #89

Closed emilyahn closed 2 years ago

emilyahn commented 2 years ago

Hi David and team,

When I use the Kazakh package and am transliterating this character: "ү", the output produces spurious \x08 (twice) before the expected "ʏ". Here's an example:

>>> epi.transliterate(u'түзде')
't\x08\x08ʏzdje'

dmort27 commented 2 years ago

Are you using kaz-Cyrl or kaz-Cyrl-bab?

emilyahn commented 2 years ago

Ah, kaz-Cyrl

dmort27 commented 2 years ago

I was able to replicate the error and correct it (locally, at least). I have entered the second stage of coding.

The updated version is in the GitHub repo. I'll make a new PyPI release in the next few days.
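
For anyone still on the older PyPI release, one possible stopgap is to filter control characters out of the transliteration output. The snippet below is only a sketch and not part of epitran: `strip_control_chars` is a hypothetical helper, and it assumes the only problem is stray control characters such as the \x08 (backspace) reported above. It works on the output string quoted in this issue rather than calling epitran itself.

```python
import unicodedata


def strip_control_chars(text: str) -> str:
    """Hypothetical helper: drop Unicode control characters (category Cc).

    This removes \\x08 (backspace), but also tabs and newlines, so it is
    only suitable for single-word or single-line transliteration output.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cc")


# The output reported in this issue, containing two stray backspaces.
reported = "t\x08\x08ʏzdje"

cleaned = strip_control_chars(reported)
print(repr(cleaned))  # 'tʏzdje'
```

In practice the helper would wrap the result of `epi.transliterate(...)`. Upgrading to the fixed version is the better option once it is released, since the filter hides the symptom rather than correcting the mapping.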