Open happylittlecat2333 opened 1 year ago
Hi @happylittlecat2333,
I should have used v1.50 of espeak. I used cmn for Mandarin and yue for Cantonese.
Some example outputs
from espeakng import ESpeakNG
esng = ESpeakNG(voice='cmn')
ipa = esng.g2p('妳現在,好漂亮', ipa=1)
print(ipa)
ipa = esng.g2p('宋朝末年年间定居。粉岭围', ipa=1)
print(ipa)
ipa = esng.g2p('宋朝。末年年间定居粉岭围', ipa=1)
print(ipa)
n_ˈi2_ ɕ_ˈiɛ5_n_ ts_ˈai5_x_ˈɑu2_ ph_j_ˈɑu5_ l_ˈiɑ5_ŋ_
s_ˈonɡ5_ ts.h_ˈɑuɜ_ m_ˈo5_ n_ˈiɛɜ_n_ n_ˈiɛɜ_n_ tɕ_ˈiɛ5_n_ t_ˈi5_ŋ_ tɕ_ˈy5_f_ˈəɜ_n_ l_ˈi2_ŋ_ w_ˈeiɜ_
s_ˈonɡ5_ ts.h_ˈɑuɜ_m_ˈo5_ n_ˈiɛɜ_n_ n_ˈiɛɜ_n_ tɕ_ˈiɛ5_n_ t_ˈi5_ŋ_ tɕ_ˈy5_ f_ˈəɜ_n_ l_ˈi2_ŋ_ w_ˈeiɜ_
Thanks a lot @xuqiantong - that's super useful!
Hey Qiantong, @alexeib, @michaelauli,
Thanks for the information. I changed the espeak version to v1.50, but I found a significant bug in how Chinese tones are transcribed. The problem is also described in the espeak-ng issues. The IPA transcription seems wrong for languages with tone changes (e.g., Mandarin Chinese).
Some example outputs
from espeakng import ESpeakNG
esng = ESpeakNG(voice='cmn')
ipa = esng.g2p('镜子', ipa=1) # "jing4 zi5" for pinyin
print(ipa)
ipa = esng.g2p('经过', ipa=1) # "jing1 guo4" for pinyin
print(ipa)
ipa = esng.g2p('妈 麻 马 骂', ipa=1) # "ma1 ma2 ma3 ma4" for pinyin
print(ipa)
ipa = esng.g2p('jing1 jing2 jing3 jing4', ipa=1)
print(ipa)
tɕ_ˈi5_ŋ_ ts_i̪1_
tɕ_ˈi5_ŋ_ k_ˈuo5_
m_ˈɑ5_ m_ˈɑɜ_ m_ˈɑ2_ m_ˈɑ5_
tɕ_ˈi5_ŋ_ tɕ_ˈiɜ_ŋ_ tɕ_ˈi2_ŋ_ tɕ_ˈi5_ŋ_
You can see that the result for the second character '麻' introduces a new vowel, ɜ (Unicode U+025C), and that the tones of the four characters are represented as 5, none, 2, and 5, respectively. For "jing1" and "jing4", espeak produces the same transcription, "tɕ_ˈi5_ŋ_". According to Wikipedia, this doesn't seem right, so I checked other representations.
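The collapse of tone 1 and tone 4 can be made explicit with a small check on the output strings printed above. This is only a minimal sketch: `find_collisions` is a hypothetical helper (not part of espeak or py-espeak-ng), and it assumes the IPA output contains exactly one space-separated syllable per pinyin syllable.

```python
from collections import defaultdict


def find_collisions(pinyin, ipa_output):
    """Return the IPA syllables that more than one pinyin syllable maps to.

    Hypothetical helper: assumes one space-separated IPA syllable per
    pinyin syllable, in the same order.
    """
    ipa_syllables = ipa_output.split()
    if len(ipa_syllables) != len(pinyin):
        raise ValueError("expected one IPA syllable per pinyin syllable")
    groups = defaultdict(list)
    for py, ipa in zip(pinyin, ipa_syllables):
        groups[ipa].append(py)
    # Keep only the IPA syllables shared by two or more pinyin inputs.
    return {ipa: pys for ipa, pys in groups.items() if len(pys) > 1}


# Output strings copied from the espeak v1.50 runs above.
ma_out = "m_ˈɑ5_ m_ˈɑɜ_ m_ˈɑ2_ m_ˈɑ5_"
jing_out = "tɕ_ˈi5_ŋ_ tɕ_ˈiɜ_ŋ_ tɕ_ˈi2_ŋ_ tɕ_ˈi5_ŋ_"

print(find_collisions(["ma1", "ma2", "ma3", "ma4"], ma_out))
print(find_collisions(["jing1", "jing2", "jing3", "jing4"], jing_out))
```

In both cases the only collision reported is between the tone 1 and tone 4 syllables.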
If we do not use IPA, the tones are correct:
ipa = esng.g2p('妈 麻 马 骂')
print(ipa)
ipa = esng.g2p('ma1 ma2 ma3 ma4')
print(ipa)
m'A55_| m'A35_| m'A21_| m'A51_|
m'A55_| m'A35_| m'A21_| m'A51_|
Here, the system correctly identifies the same vowel [A] for all characters and accurately distinguishes the tones. So I think the problem lies in the conversion to IPA. I also tested espeak v1.51 and found that the bug remains. Do you have any suggestions or advice for this bug?
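Since the non-IPA output keeps the tones apart, one possible stopgap is to read the tone digits from that output instead. The sketch below assumes that the digits before each `_` are tone contours on the 1 to 5 scale, which is my reading of the output above rather than something confirmed by the espeak documentation. `tone_contours` and `to_chao` are hypothetical helper names; the mapping to Chao tone letters uses the standard Unicode characters U+02E5 to U+02E9.

```python
import re

# Chao tone letters: 1 = extra-low (U+02E9) ... 5 = extra-high (U+02E5).
CHAO = {
    "1": "\u02e9",
    "2": "\u02e8",
    "3": "\u02e7",
    "4": "\u02e6",
    "5": "\u02e5",
}


def tone_contours(mnemonic_output):
    """Extract the digit runs that precede '_' in espeak's non-IPA output.

    Assumption: these digit runs are tone contours on a 1-5 scale.
    """
    return re.findall(r"([1-5]+)_", mnemonic_output)


def to_chao(contour):
    """Convert a contour such as '51' into Chao tone letters."""
    return "".join(CHAO[digit] for digit in contour)


# Output string copied from the non-IPA run above.
out = "m'A55_| m'A35_| m'A21_| m'A51_|"
contours = tone_contours(out)
print(contours)  # ['55', '35', '21', '51']
print([to_chao(c) for c in contours])
```

This does not fix the IPA conversion itself; it only shows that the tone information is recoverable from the non-IPA output.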
Thanks a lot!
❓ Questions
Hey Qiantong, @alexeib, @michaelauli,
Thanks a lot for open-sourcing the model weights of your recent paper Simple and Effective Zero-shot Cross-lingual Phoneme Recognition!
Espeak-ng has split Chinese support from cmn into cmn and cmn-latn-pinyin to support pinyin input. However, in the latest version, 1.51, cmn has a bug in predicting the tone (it treats the tone as a number), as in the case of "你好" ("ni3 hao3" in pinyin) in version 1.51:
and the result is below:
Furthermore, I tested the XLSR-53 model on predicting IPA for Chinese speech. The test case, in Chinese text, is 宝马配跛骡鞍,貂蝉怨枕董翁榻。
and the result is:
and when I use Espeak-ng v1.51 to convert the text back to IPA, I get this (after cleaning the text of _ and ˈ):
which is not consistent with the prediction. So I believe the Espeak version I used is not the same as the one in your implementation. Could you tell me which Espeak version you used, and the command or script you used for Chinese?
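For reference, the text cleaning mentioned above (removing `_` and `ˈ`) can be sketched as follows. `clean_espeak_ipa` is a hypothetical helper name, and this is only my approximation of the cleaning step, not necessarily what the wav2vec preprocessing does.

```python
def clean_espeak_ipa(text):
    """Strip phoneme separators ('_') and primary-stress marks (U+02C8).

    Hypothetical helper; whitespace between syllables is normalised
    to single spaces.
    """
    for mark in ("_", "\u02c8"):
        text = text.replace(mark, "")
    return " ".join(text.split())


# Example on an espeak output string of the same shape as those in this thread.
print(clean_espeak_ipa("tɕ_ˈi5_ŋ_ k_ˈuo5_"))
```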
Thanks a lot!