dmort27 / epitran

A tool for transcribing orthographic text as IPA (International Phonetic Alphabet)
MIT License

Punjabi 2 diacritic symbols separated from associated character #91

Closed emilyahn closed 2 years ago

emilyahn commented 2 years ago

Hello!

I noticed that two Punjabi diacritics are not combined with their associated symbol in the output. Other diacritics work fine, as in example 3:

>>> import epitran
>>> epi = epitran.Epitran('pan-Guru')

# ex 1: nasal should be on the ɑ
>>> epi.trans_list("ਸ਼ਾਂਤ")
['ʃ', 'ɑ', 'ਂ', 't̪']

# ex 2: k should be doubled
>>> epi.trans_list("ਬਲੈਂਕ")
['b', 'ə', 'l', 'ɛ', 'ਂ', 'k']

# ex 3: nasalization works properly
>>> epi.trans_list("ਬਲਵਿੰਦਰ")
['b', 'ə', 'l', 'ʋ', 'ɪ̃', 'd̪', 'ə', 'ɾ']

dmort27 commented 2 years ago

Interesting. I'll try to fix this tomorrow.

dmort27 commented 2 years ago

I believe this is now fixed in the repo. I'll make a release this evening when some fixes for French are complete.
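
For anyone wanting to check their own output for this kind of problem, here is a minimal sketch that flags segments still containing raw Gurmukhi characters. It does not call epitran at all: the helper name `untranscribed_indices` is hypothetical, and the two sample lists are copied from the outputs reported above. It assumes that any code point left in the Unicode Gurmukhi block (U+0A00 to U+0A7F) means a character was passed through without being transcribed.

```python
# Unicode Gurmukhi block boundaries (assumption: any character in this
# range that survives in the output was not transcribed to IPA).
GURMUKHI_START, GURMUKHI_END = 0x0A00, 0x0A7F


def untranscribed_indices(segments):
    """Return the indices of segments that still contain Gurmukhi characters.

    Hypothetical helper for illustration; not part of epitran.
    """
    return [
        i
        for i, seg in enumerate(segments)
        if any(GURMUKHI_START <= ord(ch) <= GURMUKHI_END for ch in seg)
    ]


# Outputs copied from the report above (examples 1 and 3).
reported_ex1 = ['ʃ', 'ɑ', 'ਂ', 't̪']
reported_ex3 = ['b', 'ə', 'l', 'ʋ', 'ɪ̃', 'd̪', 'ə', 'ɾ']

print(untranscribed_indices(reported_ex1))  # [2]
print(untranscribed_indices(reported_ex3))  # []
```

In practice the same check could be run on the result of `epi.trans_list(...)` after upgrading, where an empty list would suggest the diacritic is no longer left stranded.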