coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0
34.5k stars 4.18k forks

A better plan for Chinese polyphonic characters #3455

Closed alin995 closed 7 months ago

alin995 commented 9 months ago

G2PW

There are many polyphonic characters in Chinese, and 'pypinyin' does not handle them well.

For example, the pinyin for '盛了一碗米饭' should be

cheng2 le5 yi1 wan3 mi3 fan4

but pypinyin.pinyin() outputs

sheng4 le5 yi1 wan3 mi3 fan4

'盛' should be 'cheng2' here, but it is incorrectly transcribed as 'sheng4'.

The g2pW package handles this case well.

This is my code:

TTS/tts/layers/xtts/tokenizer.py

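from g2pw import G2PWConverter

# Build the converter once at import time; this loads the g2pW model.
# The keyword 'enable_non_tradional_chinese' is kept exactly as posted.
conv = G2PWConverter(style='pinyin', enable_non_tradional_chinese=True)

def chinese_transliterate(text):
    # conv() returns one list of readings per sentence; join the first one.
    return "".join(conv(text)[0])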
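
The snippet above joins the converter output directly. A slightly more defensive variant is sketched below, under the assumption (not confirmed in this thread) that the converter can return None for characters it has no reading for, such as Latin letters or digits, and that it returns one entry per input character. The name chinese_transliterate_safe and the Converter alias are hypothetical, and the converter is passed in as a parameter so the function can be tried without loading the model.

```python
from typing import Callable, Optional, Sequence

# Hypothetical alias: anything callable like the converter above, taking a
# sentence and returning one list of readings per sentence.
Converter = Callable[[str], Sequence[Sequence[Optional[str]]]]


def chinese_transliterate_safe(text: str, conv: Converter) -> str:
    """Join readings, keeping the original character where a reading is None."""
    readings = conv(text)[0]
    if len(readings) != len(text):
        # The one-entry-per-character assumption does not hold, so avoid
        # guessing an alignment and just drop the missing readings.
        return "".join(r for r in readings if r is not None)
    # Fall back to the input character when no reading is available.
    return "".join(r if r is not None else ch for r, ch in zip(readings, text))
```

With the converter from the snippet above, this would be called as chinese_transliterate_safe(text, conv).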

Yaodada12 commented 8 months ago

This works. Nice one!

stormcenter commented 8 months ago

Following this to track progress on the feature.

stale[bot] commented 7 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

OswaldoBornemann commented 5 months ago

But that would cause a multiprocessing problem.