bootphon / phonemizer

Simple text to phones converter for multiple languages
https://bootphon.github.io/phonemizer/
GNU General Public License v3.0

windows and linux have different result on chinese #140

Closed TheHonestBob closed 1 year ago

TheHonestBob commented 1 year ago

Describe the bug

When I use the same code to convert Chinese to IPA, Windows and Linux give different results. English is not affected.

Phonemizer version (output of phonemize --version):

phonemizer-3.2.1
available backends: espeak-ng-1.49.2, segments-2.2.1
uninstalled backends: espeak-mbrola, festival

System

Linux version 5.4.0-122-generic (buildd@lcy02-amd64-095) (gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)) #138-Ubuntu SMP Wed Jun 22 15:00:31 UTC 2022
python 3.7

To reproduce

from phonemizer import phonemize

phone = phonemize('卡尔普陪外孙玩儿滑梯', language='cmn', backend='espeak', strip=True, preserve_punctuation=True, with_stress=True, language_switch='remove-flags')

Windows result (looks right):
kˈiə ˈər pˈuː pˈeɪ wˈai sˈʌn wˈan ˈər hjˈuːa tˈaɪ

Linux result:
khˈɑsa5n ˈərsa5n phˈusa5n phˈeiər5 wˈaisi̪5 sˈunji5 wˈɑnər5 ˈərər5 xwˈɑər5 thˈiji5

mmmaat commented 1 year ago

Hi, this is an issue with espeak itself, not phonemizer. Are you sure you have the same version of espeak installed on both systems?

Actually I have (on Linux):

$ phonemize --version
phonemizer-3.2.1
available backends: espeak-ng-1.50, espeak-mbrola, festival-2.5.0, segments-2.1.3

And your sample gives another result:

$ echo '卡尔普陪外孙玩儿滑梯' | phonemize -l cmn -b espeak --strip --preserve-punctuation --with-stress
[WARNING] words count mismatch on 100.0% of the lines (1/1)
khˈɑ2 ˈər2 phˈu2 phˈeiɜ wˈai5 sˈuə5n wˈɑɜn ˈərɜ xwˈɑɜ thˈi5

TheHonestBob commented 1 year ago

Thanks for your reply. Windows has 1.52 and Linux has 1.49. I can't install espeak >= 1.5 on Linux. Can you help me? Thank you again.
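The version mismatch described here can be confirmed by parsing the "available backends:" line that phonemize --version prints on each machine. Below is a minimal sketch, assuming the line format shown in this thread. The helper espeak_version is hypothetical and not part of phonemizer, and the two sample strings are the backend lines posted above.

```python
import re


def espeak_version(version_output):
    """Return the espeak-ng version found in `phonemize --version` output.

    Hypothetical helper (not part of phonemizer). Returns a tuple of ints,
    or None when no espeak-ng entry is present.
    """
    match = re.search(r"espeak-ng-(\d+(?:\.\d+)*)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


# Backend lines copied from the two reports in this thread.
reporter = "available backends: espeak-ng-1.49.2, segments-2.2.1"
maintainer = ("available backends: espeak-ng-1.50, espeak-mbrola, "
              "festival-2.5.0, segments-2.1.3")

print(espeak_version(reporter))    # (1, 49, 2)
print(espeak_version(maintainer))  # (1, 50)
print(espeak_version(reporter) == espeak_version(maintainer))  # False
```

If the two tuples differ, differences in the phonemized output are to be expected, since the transcription is produced by the backend rather than by phonemizer.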

mmmaat commented 1 year ago

You need espeak-ng, not espeak: https://bootphon.github.io/phonemizer/install.html#on-debian-unbuntu

$ sudo apt-get install espeak-ng

That should be enough.
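After installing, one way to check which espeak-ng the system provides is to call the executable directly. This is a rough sketch using only the Python standard library. The function name installed_espeak_ng_version is hypothetical, and it assumes the espeak-ng command is on PATH. phonemizer itself loads the espeak-ng shared library, so this confirms only what the package manager installed, not what phonemizer picked up.

```python
import shutil
import subprocess


def installed_espeak_ng_version():
    """Return the text printed by `espeak-ng --version`, or None if unavailable.

    Hypothetical helper: it inspects the command-line executable only.
    """
    exe = shutil.which("espeak-ng")
    if exe is None:
        # espeak-ng is not on PATH (not installed, or installed elsewhere).
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


version_text = installed_espeak_ng_version()
if version_text is None:
    print("espeak-ng executable not found on PATH")
else:
    print(version_text)
```

Running phonemize --version afterwards shows which backend version phonemizer actually detects.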

TheHonestBob commented 1 year ago

Thanks a lot, I get it now.