Open lars76 opened 4 weeks ago
Here is a quick workaround based on https://en.wikipedia.org/wiki/Erhua#Standard_rules
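# Assumes pinyin_to_ipa from this library is already imported.
def pinyin_to_ipa_erhua(pinyin):
    # pinyin is plain letters ending in "r" (e.g. "shir"); drop the "r" first.
    ipas = list(pinyin_to_ipa(pinyin[:-1]))
    # Checked in insertion order, first match wins, so longer suffixes
    # must come before shorter ones that they end with.
    suffix_to_ipa = {
        "anr": "ɐʵ",
        "enr": "ɚ", "inr": "ɚ", "unr": "ɚ",
        "angr": "ɑ̃ʵ",
        "engr": "ɤ̃ʵ", "ingr": "ɤ̃ʵ",
        "iongr": "ʊ̃ʵ", "ongr": "ʊ̃ʵ",
        "our": "ou̯˞",
        "iur": "ou̯ʵ",
        "iaor": "ɑu̯ʵ",  # moved before "aor", which would otherwise shadow it
        "aor": "ou̯˞",
        "eir": "ɚ", "uir": "ɚ",
        "air": "ɐʵ",
        "ier": "ɛʵ",
        "uer": "œʵ",
        "er": "ɤʵ",
        "or": "ɔʵ",
        "ar": "ɐʵ",
        "ir": "ɚ",
        "ur": "u˞",
        "vr": "ɚ"
    }
    # Nasal finals: the last two IPA segments are replaced instead of one.
    strip_two = ["anr", "enr", "inr", "unr", "angr", "engr", "ingr", "iongr", "ongr"]
    new_ipas = []
    for ipa in ipas:
        ipa = list(ipa)
        for k, v in suffix_to_ipa.items():
            if pinyin.endswith(k):
                if k in strip_two:
                    ipa = ipa[:-2]
                else:
                    ipa = ipa[:-1]
                # Special case: "u" after j/y stands for ü.
                if pinyin == "jur" or pinyin == "yur":
                    ipa += ["ɥɚ"]
                else:
                    ipa += [v]
                break
        new_ipas.append(ipa)
    return new_ipas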
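Because the loop stops at the first key for which `endswith` is true, the result depends on the order of the dictionary entries. A rough sketch of an order-independent lookup follows. It tries the longest suffixes first; `find_erhua_suffix` and `SUFFIX_TO_IPA` are hypothetical names for illustration, and the table is only a subset of the one above:

```python
# Subset of the suffix table from the workaround, for illustration only.
# The entries are deliberately listed shortest-first to show that the
# lookup below does not rely on dictionary order.
SUFFIX_TO_IPA = {
    "er": "ɤʵ",
    "ier": "ɛʵ",
    "ir": "ɚ",
    "air": "ɐʵ",
    "ur": "u˞",
    "our": "ou̯˞",
}


def find_erhua_suffix(pinyin, table=SUFFIX_TO_IPA):
    """Return (suffix, ipa) for the longest matching suffix, or None."""
    for suffix in sorted(table, key=len, reverse=True):
        if pinyin.endswith(suffix):
            return suffix, table[suffix]
    return None


print(find_erhua_suffix("jier"))  # matches "ier" rather than "er"
print(find_erhua_suffix("ma"))    # no erhua suffix, so None
```

Returning `None` for a non-matching syllable also makes it explicit when no rule applied, instead of silently returning the base pronunciation.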
Hello, thank you for the suggestion and the workaround. Unfortunately, I do not have the capacity to integrate this functionality at the moment.
Hey, could you add support for erhua? Combinations such as 事儿 = shìr are not handled. Even in standard Chinese (news etc.), erhua is often heard.
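For tone-marked input such as shìr, a suffix table keyed on plain letters ("ir", "anr", ...) will not match until the tone mark is separated from the vowel. Below is a minimal sketch of that normalization step using only the standard library. `split_tone` is a hypothetical helper, and the use of 5 for the neutral tone and "v" for ü are assumptions, not something the library is known to require:

```python
import unicodedata

# Combining diacritics used for the four pinyin tones.
TONE_MARKS = {
    "\u0304": 1,  # macron
    "\u0301": 2,  # acute
    "\u030c": 3,  # caron
    "\u0300": 4,  # grave
}


def split_tone(syllable):
    """Return (toneless syllable, tone number) for a tone-marked syllable."""
    tone = 5  # assumption: 5 stands for the neutral tone when no mark is found
    kept = []
    # NFD splits e.g. "ì" into "i" plus a combining grave accent.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)
    # Recompose so that u + diaeresis becomes "ü" again, then write it as "v".
    base = unicodedata.normalize("NFC", "".join(kept)).replace("\u00fc", "v")
    return base, tone


print(split_tone("shìr"))
```

The toneless result ends in a plain-letter suffix, so it can be matched against an ASCII-keyed table, with the tone number carried separately.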