dmort27 / panphon

Python package and data files for manipulating phonological segments (phones, phonemes) in terms of universal phonological features.
MIT License
212 stars, 46 forks

Phonological features of "g" don't exist #18

Closed Kitten-Defender closed 4 years ago

Kitten-Defender commented 4 years ago

I spent twenty minutes trying to find the error in my program, and it just turns out I can't parse words with "g" in them. ft.word_fts("g") returns []. I made a program that is literally just print(ft.word_fts(letter)), and every letter I tried but g returns a feature set.

lxkain commented 4 years ago

I can help with this. "g" and "ɡ" are two different characters. They look the same in many fonts, but they are different Unicode code points: the first is the orthographic (ASCII) g, U+0067, and the second is the IPA symbol, U+0261.
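For anyone who wants to check which character they actually have, here is a minimal sketch using only the standard library. The helper name `describe` is made up for illustration and is not part of PanPhon.

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name) for a single character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)

ascii_g = "g"      # orthographic g, as typed on a keyboard
ipa_g = "\u0261"   # IPA symbol for the voiced velar plosive

print(describe(ascii_g))   # ('U+0067', 'LATIN SMALL LETTER G')
print(describe(ipa_g))     # ('U+0261', 'LATIN SMALL LETTER SCRIPT G')
print(ascii_g == ipa_g)    # False
```

If the two look identical in your terminal or editor, comparing code points like this is the quickest way to tell them apart.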

Kitten-Defender commented 4 years ago

thank you!

Kitten-Defender commented 4 years ago

As a suggestion, I would ask that you include support for the loop "g" character in future releases. I've seen it used a lot in phonetics circles as a replacement for the IPA /ɡ/. It's not a huge issue, because it takes only about four lines of code to set up a workaround, but I figured you should add it to keep anyone else from getting confused. If anyone's here for the same issue, I'll save you time:

def g_replacer(letter):
    # Swap the orthographic "g" for the IPA "ɡ" (U+0261)
    if letter == "g":
        letter = "ɡ"
    return letter

dmort27 commented 4 years ago

Thank you for your suggestion, @Kitten-Defender. I cannot accept it for two reasons:

  1. I don't want to go down the de facto standard rabbit hole. My goal with PanPhon is to implement the Unicode IPA standards strictly, rather than worrying about deviations.
  2. Adding "g" to the table will break some downstream software that uses PanPhon to validate IPA.

What I can do is add a warning that fires if an input string contains "g", suggesting that the user may have intended "ɡ".

Also, better code for normalizing "g" away:

def replace_g(s):
    # Replace every orthographic "g" (U+0067) with the IPA "ɡ" (U+0261)
    return s.replace('g', 'ɡ')
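Until such a warning exists in PanPhon itself, something similar can be done on the caller's side. The following is only a sketch of the idea described above, not PanPhon's actual behavior: `warn_on_ascii_g` is a hypothetical helper, and it uses nothing beyond the standard `warnings` module.

```python
import warnings

ASCII_G = "\u0067"  # orthographic "g"
IPA_G = "\u0261"    # IPA "ɡ"

def warn_on_ascii_g(s):
    """Hypothetical helper: warn if s contains ASCII g, then return s unchanged."""
    if ASCII_G in s:
        warnings.warn(
            f'Input {s!r} contains ASCII "g" (U+0067); '
            f'did you mean IPA "{IPA_G}" (U+0261)?',
            UserWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
    return s
```

Because the string is returned unchanged, the check can be wrapped around an input before it is passed on, for example `ft.word_fts(warn_on_ascii_g(word))`, and combined with `replace_g` if silent normalization is preferred over a warning.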