tsroten / dragonmapper

Identification and conversion functions for Chinese text processing
MIT License

Wrong zhuyin to pinyin for syllables ending with ㄨ #16

Closed. mthewissen closed this issue 8 years ago

mthewissen commented 8 years ago

That is, when they have the first tone (which Zhuyin leaves unmarked):

from dragonmapper import transcriptions

# Parenthesized print works on both Python 2 and Python 3.
print(transcriptions.zhuyin_syllable_to_pinyin(u'ㄓㄨˋ'))  # works
print(transcriptions.zhuyin_syllable_to_pinyin(u'ㄔㄨ'))  # does not work

I traced things down to the following:

def _parse_zhuyin_syllable(unparsed_syllable):
    """Return the syllable and tone of a Zhuyin syllable."""
    zhuyin_tone = unparsed_syllable[-1]
    if zhuyin_tone in zhon.zhuyin.characters:
        syllable, tone = unparsed_syllable, '1'
    elif zhuyin_tone in zhon.zhuyin.marks:
        for tone_number, tone_mark in _ZHUYIN_TONES.items():
            if zhuyin_tone == tone_mark:
                syllable, tone = unparsed_syllable[:-1], tone_number
    else:
        raise ValueError("Invalid syllable: %s" % unparsed_syllable)

    return syllable, tone

For some reason, there is no ㄨ in zhon.zhuyin.characters? (also no ㄩ)
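
To make the failure path concrete, here is a minimal standalone sketch that does not import zhon or dragonmapper. The character sets and the tone table are stand-ins I built for illustration (FULL_CHARACTERS, BROKEN_CHARACTERS and TONE_MARKS are hypothetical names, not the libraries' actual data), and the parser is a simplified imitation of the function quoted above. With ㄨ missing from the character set, an unmarked syllable ending in ㄨ matches neither branch and falls through to the ValueError, while the same syllable with a tone mark still parses.

```python
# Stand-in data, NOT copied from zhon: the Bopomofo letters U+3105 (ㄅ) to U+3129 (ㄩ).
FULL_CHARACTERS = "".join(chr(cp) for cp in range(0x3105, 0x312A))

# Hypothetical broken set lacking ㄨ (U+3128) and ㄩ (U+3129), mimicking the symptom.
BROKEN_CHARACTERS = FULL_CHARACTERS.replace("\u3128", "").replace("\u3129", "")

# Assumed tone table: first tone is unmarked, so it has no entry here.
TONE_MARKS = {"2": "\u02ca", "3": "\u02c7", "4": "\u02cb", "5": "\u02d9"}


def parse_zhuyin_syllable(unparsed_syllable, characters):
    """Return (syllable, tone), treating a trailing letter as first tone."""
    last = unparsed_syllable[-1]
    if last in characters:
        # No tone mark present: the whole string is the syllable, tone 1.
        return unparsed_syllable, "1"
    for tone_number, tone_mark in TONE_MARKS.items():
        if last == tone_mark:
            return unparsed_syllable[:-1], tone_number
    # Reached when the last character is neither a known letter nor a mark.
    raise ValueError("Invalid syllable: %s" % unparsed_syllable)


if __name__ == "__main__":
    zhu4 = "\u3113\u3128\u02cb"  # ㄓㄨˋ
    chu1 = "\u3114\u3128"        # ㄔㄨ
    print(parse_zhuyin_syllable(zhu4, BROKEN_CHARACTERS))  # mark branch still works
    print(parse_zhuyin_syllable(chu1, FULL_CHARACTERS))    # parses with the full set
    try:
        parse_zhuyin_syllable(chu1, BROKEN_CHARACTERS)
    except ValueError as exc:
        print("fails with the incomplete set:", exc)
```

This also shows why only first-tone syllables are affected: a tone mark in the last position bypasses the character-set check entirely.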

tsroten commented 8 years ago

@mthewissen Thanks for reporting this! This is actually an issue with zhon (another one of my libraries). I'm going to close the issue here and open one up over there.

tsroten commented 8 years ago

@mthewissen A new Zhon release that addresses this is now on PyPI.

mthewissen commented 8 years ago

I tried it again today and it works. Thanks for fixing it!