jacksonllee / pycantonese

Cantonese Linguistics and NLP
https://pycantonese.org
MIT License

Jyutping2yale #2

Closed mahiuchun closed 9 years ago

mahiuchun commented 9 years ago

I referenced:
http://en.wikipedia.org/wiki/Yale_romanization_of_Cantonese
http://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/initials.php
http://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/finals.php

I know Yale can also represent tones with tone marks (diacritics) instead of tone numbers. That would be more complicated to implement, and I'm not sure whether it is what you want. Also, Jyutping's tone 1 cannot distinguish between high-flat and high-falling.
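To make the tone-mark question concrete, here is a minimal sketch of how a diacritic and the low-register "h" could be attached to a syllable that is already spelled in Yale without a tone. It is not the code in this pull request or in the library: `add_yale_tone`, `TONE_MARKS`, and `VOWELS` are hypothetical names, and tone 1 is assumed to always map to the macron (high-flat), precisely because Jyutping gives no way to pick high-falling.

```python
import unicodedata

# Combining marks used for Yale tone diacritics.
MACRON = "\u0304"  # tone 1, assumed high-flat
ACUTE = "\u0301"   # rising tones 2 and 5
GRAVE = "\u0300"   # low falling tone 4

# Jyutping tone number -> (combining mark, whether to add the low-register "h").
TONE_MARKS = {
    "1": (MACRON, False),
    "2": (ACUTE, False),
    "3": ("", False),
    "4": (GRAVE, True),
    "5": (ACUTE, True),
    "6": ("", True),
}

VOWELS = "aeiou"


def add_yale_tone(syllable, tone):
    """Hypothetical helper: add Yale tone marking to a toneless Yale syllable."""
    mark, low = TONE_MARKS[tone]
    h = "h" if low else ""
    vowel_positions = [i for i, ch in enumerate(syllable) if ch in VOWELS]
    if not vowel_positions:
        # Syllabic nasal ("m", "ng"): mark the first letter, "h" goes at the end.
        marked = syllable[0] + mark + syllable[1:] + h
    else:
        first, last = vowel_positions[0], vowel_positions[-1]
        # Mark the first vowel letter; "h" follows the last vowel letter.
        marked = (
            syllable[: first + 1] + mark
            + syllable[first + 1 : last + 1] + h
            + syllable[last + 1 :]
        )
    # Use precomposed characters where Unicode provides them.
    return unicodedata.normalize("NFC", marked)


print([add_yale_tone(s, t) for s, t in [("yut", "6"), ("yu", "5")]])
```

The sketch only covers tone placement; converting Jyutping initials and finals to their Yale spellings would still have to happen before this step.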

I tested with:

import pycantonese
pycantonese.yale('gwong2zau1waa2')
pycantonese.yale('jyut6jyu5')
pycantonese.yale('nei5hou2')

mahiuchun commented 9 years ago

Author e-mail changed; closing.

jacksonllee commented 9 years ago

Hello, thank you for the code, and sorry for the long silence. I don't have much time to work on this project except during long holidays, when things on the research and teaching fronts are a little quieter. I've just updated the whole library, including your code for Yale conversion. The diacritics for tones and the "h" for low tones are also implemented. Examples:

>>> import pycantonese as pc
>>> pc.yale('m4goi1')
['m̀h', 'gōi']
>>> pc.yale('gwong2dung1waa2')
['gwóng', 'dūng', 'wá']

(It seems that either Chrome or GitHub doesn't correctly render the Unicode combining grave accent on "m" for Jyutping m4.)
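One possible explanation for the rendering glitch, offered as an assumption rather than a confirmed diagnosis: Unicode has no single precomposed character for "m" with a grave accent, so the mark has to stay a separate combining code point and its placement depends on the font. Vowels such as "o" with an acute accent do have precomposed forms. A quick check with Python's standard `unicodedata` module (the variable names are illustrative only):

```python
import unicodedata

m_grave = "m\u0300"  # "m" + COMBINING GRAVE ACCENT, as in the Yale syllable for m4
o_acute = "o\u0301"  # "o" + COMBINING ACUTE ACCENT, as in the Yale syllable for gwong2

for label, text in [("m + grave", m_grave), ("o + acute", o_acute)]:
    # NFC composes a base letter and mark into one character when one exists.
    composed = unicodedata.normalize("NFC", text)
    print(label, [unicodedata.name(ch) for ch in composed])
```

The first line of output lists two code points (the letter and the combining accent), while the second lists a single precomposed letter.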

Thanks again!