dmort27 / epitran

A tool for transcribing orthographic text as IPA (International Phonetic Alphabet)
MIT License

Update epihan.py #84

Closed trenslow closed 3 years ago

trenslow commented 3 years ago

Hi @dmort27,

I've been coming across an error in Chinese transliteration whenever I use the ligatures=True feature on tokens containing more than one Chinese character:

Exception has occurred: AttributeError
'map' object has no attribute 'append'

This happens because the ligature mapping sits inside the loop over all tokens. After the first token is processed, ipa_tokens is rebound from a list to a map object (in Python 3, map() returns an iterator rather than a list), so the attempt to append the IPA for the next token to that map object fails.
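To illustrate the pattern, here is a minimal standalone sketch that does not use epitran at all. The helpers `to_ipa` and `add_ligature` and the two `transliterate_*` functions are hypothetical stand-ins I made up for illustration, not the actual code in epihan.py; the only thing carried over from the description above is the shape of the bug (a `map()` call inside the token loop) and one way to avoid it (mapping once, after the loop).

```python
def to_ipa(token):
    # Hypothetical stand-in for the per-token IPA lookup.
    return "<" + token + ">"


def add_ligature(ipa):
    # Hypothetical stand-in for the ligature substitution.
    return ipa.replace("ts", "ʦ")


def transliterate_buggy(tokens):
    ipa_tokens = []
    for token in tokens:
        # Fails on the second iteration: ipa_tokens is a map object by then.
        ipa_tokens.append(to_ipa(token))
        # In Python 3, map() returns an iterator, not a list.
        ipa_tokens = map(add_ligature, ipa_tokens)
    return "".join(ipa_tokens)


def transliterate_fixed(tokens):
    ipa_tokens = []
    for token in tokens:
        ipa_tokens.append(to_ipa(token))
    # Apply the mapping once, after every token has been appended.
    ipa_tokens = map(add_ligature, ipa_tokens)
    return "".join(ipa_tokens)
```

With a single token the buggy version still works, which would explain why the error only shows up on input that yields more than one token.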

You can reproduce the error by calling the transliterate function with the ligature feature flag enabled, e.g.

epi.transliterate('高中', ligatures=True)
epi.transliterate('百科全', ligatures=True)

Unless I'm missing something (which is possible because my knowledge of Chinese is very limited), I believe this PR should do the trick to avoid getting the error.