chdzq / ARPAbetAndIPAConvertor

Apache License 2.0
64 stars, 14 forks

Conversion error: ə is converted to AX, should be AH #4

Closed Jackiexiao closed 1 year ago

Jackiexiao commented 1 year ago

Conversion error: ə is converted to AX, but it should be AH. @chdzq

Code:

from arpabetandipaconvertor.phoneticarphabet2arpabet import (
    PhoneticAlphabet2ARPAbetConvertor,
    ConvertPriority,
)

_ipa_convertor = PhoneticAlphabet2ARPAbetConvertor()

for ipa_str in ["həˈloʊ", "əˈbaʊt"]:
    print(ipa_str)
    print(_ipa_convertor.convert(ipa_str, priority=ConvertPriority.American))

Result:

həˈloʊ
HH AX0 L OW1
əˈbaʊt
AX0 B AW1 T

However, looking the same words up in cmudict gives entries that differ from the conversion output:

HELLO 1 HH AH0 L OW1
HELLO 2 HH EH0 L OW1
ABOUT 1 AH0 B AW1 T

Jackiexiao commented 1 year ago

Hmm... it turns out cmudict has no AX phoneme at all: http://www.speech.cs.cmu.edu/cgi-bin/cmudict

cmudict does not have /AX/ or /AXR/, so the fix on my side is to replace AX with AH0 in the converted output.
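
For anyone hitting the same mismatch, the replacement can be sketched as a small post-processing step on the convertor's output string. This is only an illustration, not part of ARPAbetAndIPAConvertor: the function name `to_cmudict_phonemes` and the mapping table are hypothetical, and mapping AXR to ER is my own assumption (cmudict writes the r-colored schwa as ER), since the thread only mentions replacing AX.

```python
import re

# Hypothetical mapping, not taken from the library: cmudict writes the
# schwa as AH (AH0 when unstressed). AXR -> ER is an extra assumption.
_CMUDICT_SUBSTITUTES = {"AX": "AH", "AXR": "ER"}

# An ARPAbet token: uppercase symbol plus an optional stress digit (0-2).
_PHONEME = re.compile(r"^([A-Z]+)([0-2]?)$")


def to_cmudict_phonemes(arpabet: str) -> str:
    """Rewrite AX/AXR in a space-separated ARPAbet string to cmudict symbols."""
    converted = []
    for token in arpabet.split():
        match = _PHONEME.match(token)
        if match is None:
            # Leave anything that does not look like a phoneme untouched.
            converted.append(token)
            continue
        symbol, stress = match.groups()
        if symbol in _CMUDICT_SUBSTITUTES:
            symbol = _CMUDICT_SUBSTITUTES[symbol]
            # Schwa is unstressed, so default to 0 when no digit is given.
            stress = stress or "0"
        converted.append(symbol + stress)
    return " ".join(converted)


print(to_cmudict_phonemes("HH AX0 L OW1"))  # HH AH0 L OW1
print(to_cmudict_phonemes("AX0 B AW1 T"))   # AH0 B AW1 T
```

Applied to the two outputs above, this yields the same phoneme strings as the cmudict entries HELLO 1 and ABOUT 1.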