cldf-clts / clts-legacy

Cross-Linguistic Transcription Systems
Apache License 2.0

Add unicode names to data #13

Closed by xrotwang 7 years ago

xrotwang commented 7 years ago

It's somewhat difficult to recognize which Unicode characters are used as graphemes in the data files, so adding a description based on the Unicode character names (automatically?) would help.

LinguList commented 7 years ago

The next commit will fix this:

>>> bipa = CLTS()
>>> bipa['t'].uname
'LATIN SMALL LETTER T'
>>> bipa['kw'].uname
'LATIN SMALL LETTER K / MODIFIER LETTER SMALL W'
>>> bipa['K'].uname
'LATIN CAPITAL LETTER K'

If the input is a valid sound, the name is built from its normalized form (hence the second example, where `kw` is reported as k followed by MODIFIER LETTER SMALL W); if not, the name is built from the original characters.
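For readers who want to see how such a name string can be derived, here is a minimal sketch using only Python's standard `unicodedata` module. The function name `unicode_names` and the `UNKNOWN (U+XXXX)` fallback are assumptions made for illustration, not the actual implementation behind `.uname`, and the normalization step (mapping `kw` to its normalized form) is not reproduced here, so the already normalized string is passed in directly.

```python
import unicodedata


def unicode_names(grapheme: str, sep: str = " / ") -> str:
    """Join the Unicode names of all code points in `grapheme`.

    Hypothetical helper: code points without an assigned name
    (e.g. private-use characters) fall back to a placeholder
    instead of raising ValueError.
    """
    return sep.join(
        unicodedata.name(ch, "UNKNOWN (U+{:04X})".format(ord(ch)))
        for ch in grapheme
    )


print(unicode_names("t"))
print(unicode_names("k\u02b7"))  # k + MODIFIER LETTER SMALL W, the normalized form of "kw"
print(unicode_names("K"))
```

Joining the per-character names with ` / ` mirrors the output format shown in the examples above; the fallback keeps the helper usable on data files that contain unnamed code points.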