PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

https://paddlespeech.readthedocs.io

Apache License 2.0

🔍 TTS Text Frontend Issue Roundup (Text Frontend Bugs) #2196

Open yt605155624 opened 2 years ago

yt605155624 commented 2 years ago

Please report TTS text frontend bugs here, for example: text normalization, polyphones, tone sandhi, etc.

We encourage developers to solve these problems.

polyphone: 能说多长(zhang3 ❎)的语音呢？是否可以长(zhang3 ❎)语音合成呢？长(chang2 ✅)语音，长(zhang3 ❎)文本 -> fixed

LiuChiachi commented 2 years ago

教教(jiao1)(jiao)我好不好！ was read as (jiao4) -> fixed
哈哈哈(❌)-> fixed

yt605155624 commented 2 years ago

干嘛(❎) -> fixed
你像夏至的分界线，是我终身里最长(❎)的那个白昼！夸夸你！-> fixed
我今天写了两行(❎)代码 -> fixed
媳妇儿 (erhua) -> fixed
小数(❎)点 -> fixed
哈哈哈哈哈哈哈-> fixed
学子 (❎ no neutral-tone sandhi needed) -> fixed
向阳中学是一所有比较长(❎)的历史的中学校 -> pronunciation of "长" fixed; the word segmentation around "所有" is still wrong
咕呱(gu1 ❎)（gua? ✅）-> fixed

yt605155624 commented 2 years ago

https://github.com/PaddlePaddle/PaddleSpeech/issues/2206

BarryKCL commented 2 years ago

You can try g2pW.

睡得着觉？
G2pM: ['shui4', 'de2', 'zhe5', 'jue2', '？']
lazy_pinyin: ['shui4', 'de2', 'zhe', 'jue2', '？']
G2pW: [['shui4', 'de5', 'zhao2', 'jiao4', None]]

小数点
G2pM: ['xiao3', 'shu4', 'dian3']
lazy_pinyin: ['xiao3', 'shu3', 'dian3']
G2pW: [['xiao3', 'shu4', 'dian3']]

干嘛？
G2pM: ['gan1', 'ma5', '？']
lazy_pinyin: ['gan4', 'ma', '？']
G2pW: [['gan4', 'ma2', None]]

我今天写了两行代码
G2pM: ['wo3', 'jin1', 'tian1', 'xie3', 'le5', 'liang3', 'xing2', 'dai4', 'ma3']
lazy_pinyin: ['wo3', 'jin1', 'tian1', 'xie3', 'le', 'liang3', 'xing2', 'dai4', 'ma3']
G2pW: [['wo3', 'jin1', 'tian1', 'xie3', 'le5', 'liang3', 'hang2', 'dai4', 'ma3']]

教教我好不好！
G2pM: ['jiao4', 'jiao4', 'wo3', 'hao3', 'bu4', 'hao3', '！']
lazy_pinyin: ['jiao4', 'jiao4', 'wo3', 'hao3', 'bu4', 'hao3', '！']
G2pW: [['jiao1', 'jiao1', 'wo3', 'hao3', 'bu4', 'hao3', None]]

能说多长的语音呢？是否可以长语音合成呢
G2pM: ['neng2', 'shuo1', 'duo1', 'zhang3', 'de5', 'yu3', 'yin1', 'ne5', '？', 'shi4', 'fou3', 'ke3', 'yi3', 'chang2', 'yu3', 'yin1', 'he2', 'cheng2', 'ne5']
lazy_pinyin: ['neng2', 'shuo1', 'duo1', 'zhang3', 'de', 'yu3', 'yin1', 'ne', '？', 'shi4', 'fou3', 'ke3', 'yi3', 'zhang3', 'yu3', 'yin1', 'he2', 'cheng2', 'ne']
G2pW: [['neng2', 'shuo1', 'duo1', 'chang2', 'de5', 'yu3', 'yin1', 'ne5', None, 'shi4', 'fou3', 'ke3', 'yi3', 'chang2', 'yu3', 'yin1', 'he2', 'cheng2', 'ne5']]
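
To make a comparison like this easier to scan, here is a minimal sketch that lists the characters on which the three tools disagree. It hard-codes the outputs quoted in this comment for 睡得着觉？ rather than calling the libraries. The helper names `normalize` and `disagreements` are hypothetical, and two things are assumptions drawn only from the outputs shown here: a syllable without a tone digit is treated as neutral tone 5, and each character maps to exactly one syllable.

```python
from typing import Dict, List, Optional, Tuple


def normalize(syllable: Optional[str], char: str) -> str:
    """Map the three output styles onto one comparable form (assumed conventions)."""
    if syllable is None:
        # G2pW prints None for punctuation; fall back to the character itself.
        return char
    if syllable.isascii() and syllable.isalpha():
        # No tone digit (as in the lazy_pinyin output above): assume neutral tone 5.
        return syllable + "5"
    return syllable


def disagreements(
    text: str, outputs: Dict[str, List[Optional[str]]]
) -> List[Tuple[str, Dict[str, str]]]:
    """Return (character, per-tool readings) for every position where tools differ."""
    result = []
    for i, char in enumerate(text):
        readings = {tool: normalize(syls[i], char) for tool, syls in outputs.items()}
        if len(set(readings.values())) > 1:
            result.append((char, readings))
    return result


text = "睡得着觉？"
outputs = {
    "G2pM": ["shui4", "de2", "zhe5", "jue2", "？"],
    "lazy_pinyin": ["shui4", "de2", "zhe", "jue2", "？"],
    "G2pW": ["shui4", "de5", "zhao2", "jiao4", None],
}

for char, readings in disagreements(text, outputs):
    print(char, readings)
```

The sketch only reports where the tools differ; deciding which reading is correct still needs a native speaker or a reference dictionary.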

yt605155624 commented 2 years ago

We are very much looking forward to you adding g2pW to PaddleSpeech TTS through this PR: https://github.com/PaddlePaddle/PaddleSpeech/pull/2221

yt605155624 commented 2 years ago

踩一踩一踩 -> fixed

pengzhendong commented 2 years ago

TN (text normalization): 一共有1兆320万5000人 => 一共有一兆三百二十万五零零零人 (5000 is read digit by digit as 五零零零 instead of 五千)

yt605155624 commented 2 years ago

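Currently, for a number, the frontend checks whether it is followed by one of the specified units to decide whether it is a quantity or a serial number, so adding "人" to the list here should solve the problem. Developers are welcome to submit a PR to fix it~ https://github.com/PaddlePaddle/PaddleSpeech/blob/7cc1d66863a48b50c2430059c8b84060d84b11a3/paddlespeech/t2s/frontend/zh_normalization/num.py#L31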
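
The unit check described above can be illustrated with a simplified sketch. This is not the code in `num.py`: the functions `read_digits`, `read_quantity` and `normalize_numbers` and the tiny unit sets are hypothetical, only integers below 10000 are handled, and the 兆 and 万 parts of the original sentence are left out. It only shows why a number followed by a character missing from the unit list gets read digit by digit.

```python
import re
from typing import FrozenSet

DIGITS = "零一二三四五六七八九"
PLACES = ["", "十", "百", "千"]


def read_digits(num: str) -> str:
    """Read a number digit by digit, the way a serial number is spoken."""
    return "".join(DIGITS[int(d)] for d in num)


def read_quantity(num: str) -> str:
    """Read an integer below 10000 as a cardinal number."""
    n = int(num)
    if n >= 10000:
        raise ValueError("this sketch only handles values below 10000")
    if n == 0:
        return "零"
    digits = str(n)
    out = ""
    pending_zero = False
    for pos, d in enumerate(digits):
        if d == "0":
            pending_zero = True
            continue
        if pending_zero and out:
            out += "零"  # a skipped place inside the number, as in 一百零五
        pending_zero = False
        out += DIGITS[int(d)] + PLACES[len(digits) - 1 - pos]
    return out


def normalize_numbers(text: str, units: FrozenSet[str]) -> str:
    """Read a number as a quantity only if the next character is a known unit."""

    def repl(match: "re.Match[str]") -> str:
        following = text[match.end():match.end() + 1]
        if following in units:
            return read_quantity(match.group())
        return read_digits(match.group())

    return re.sub(r"[0-9]+", repl, text)


print(normalize_numbers("5000人", frozenset("所")))    # "人" is not a known unit
print(normalize_numbers("5000人", frozenset("所人")))  # "人" added to the unit set
```

With "人" absent from the unit set the first call reads the number as 五零零零, which matches the reported symptom; once "人" is added, the second call reads it as 五千.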

yt605155624 commented 2 years ago

fixed by https://github.com/PaddlePaddle/PaddleSpeech/pull/2308

yt605155624 commented 2 years ago

https://github.com/PaddlePaddle/PaddleSpeech/issues/2566

yt605155624 commented 2 years ago

https://github.com/PaddlePaddle/PaddleSpeech/issues/2571

HandsLing commented 2 years ago

lazy_pinyin returns an empty result for the character "嗯"

HandsLing commented 2 years ago

@yt605155624

yt605155624 commented 2 years ago

https://github.com/PaddlePaddle/PaddleSpeech/issues/2601 -> fixed by https://github.com/PaddlePaddle/PaddleSpeech/pull/2629

yt605155624 commented 2 years ago

@HandsLing Please search the pypinyin issues for this problem. It is version-related: they made a backward-incompatible upgrade. For our pypinyin dependency, see https://github.com/PaddlePaddle/PaddleSpeech/blob/8ea289a2517aa842ff4c7797f382832cfe13b187/setup.py#L55

yt605155624 commented 2 years ago

https://github.com/PaddlePaddle/PaddleSpeech/issues/2603 -> fixed by https://github.com/PaddlePaddle/PaddleSpeech/pull/2606

yt605155624 commented 2 years ago

"种点薄荷" is pronounced incorrectly

mogosmart commented 2 years ago

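model_alias = {
    # acoustic model
    "fastspeech2": "paddlespeech.t2s.models.fastspeech2:**FastSpeech2**",
    "fastspeech2_inference": "paddlespeech.t2s.models.fastspeech2:**StyleFastSpeech2Inference**",
    # voc
    "pwgan": "paddlespeech.t2s.models.parallel_wavegan:**PWGGenerator**",
    "pwgan_inference": "paddlespeech.t2s.models.parallel_wavegan:**PWGInference**",
}

I am using fastspeech2_mix and pwgan_aishell3 with my custom-trained voice. How should the bolded parts above be changed? I cannot find any documentation on this. If I load my custom-trained voice without changing the code above, the synthesized audio sounds abnormal. Is that related to the fields above? It feels like they should be changed to the corresponding fields.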

yt605155624 commented 2 years ago

@mogosmart That is unrelated. To use a model you trained yourself, see https://github.com/PaddlePaddle/PaddleSpeech/issues/2225
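
One way to see why the alias table would be unrelated to the trained voice: each value looks like a `module:ClassName` import path, which selects a Python class rather than a checkpoint. The sketch below shows how such a string could be resolved. It assumes the part before the colon is imported as a module and the part after it is looked up as an attribute, which has not been checked against the PaddleSpeech source. `resolve_alias` is a hypothetical helper, and a standard-library class stands in for the model classes so the example runs anywhere.

```python
import importlib


def resolve_alias(path: str):
    """Resolve a 'module:ClassName' string to the class object it names."""
    module_name, _, attr = path.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Stand-in alias table; the real entries point at PaddleSpeech model classes.
demo_alias = {"ordered": "collections:OrderedDict"}

cls = resolve_alias(demo_alias["ordered"])
print(cls.__name__)
```

Under that assumption, the same class is used whichever voice was trained, and the trained weights and config are supplied separately.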

mogosmart commented 2 years ago

OK, I will take a look.

yt605155624 commented 1 year ago

https://github.com/PaddlePaddle/PaddleSpeech/issues/2720

yt605155624 commented 1 year ago

"噢" is pronounced incorrectly: it is a polyphone in Taiwanese usage and was mispredicted -> fixed by https://github.com/PaddlePaddle/PaddleSpeech/pull/2831

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

QinlongHuang commented 1 year ago

a small bug?

from paddlespeech.t2s.frontend.zh_frontend import Frontend as zhFrontend

text = '全国一共有112所211大学..'
fe = zhFrontend()
# sum(..., []) flattens the per-sentence phoneme lists into one list
print(sum(fe.get_phonemes(text), []))

# Outputs:
[全国一共有一百一十二所二幺幺大学..] not in g2pW dict,use g2pM
['j', 'ie2', 'k', 'e4', 'sp', 'n', 'i3', 'zh', 'iii1', 'd', 'ao4', 'm', 'a5', 'sp', 'q', 'van2', 'g', 'uo2', 'i2', 'g', 'ong4', 'iou3', 'i1', 'b', 'ai3',
 'i1', 'sh', 'iii2', 'er4', 's', 'uo3', 'er4', 'iao1', 'iao1', 'd', 'a4', 'x', 've2', '..', '..']

There are two '..' in the results.
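
Until the frontend itself handles this, a caller could collapse the repeated token after the fact. The following is a workaround sketch, not the fix applied in PaddleSpeech. `collapse_repeated_punct` is a hypothetical helper, and it assumes that punctuation tokens contain no letters or digits while phone tokens such as 'sp' or 'iao1' always do.

```python
from typing import List


def collapse_repeated_punct(phones: List[str]) -> List[str]:
    """Drop a punctuation token when it immediately repeats the previous token."""
    out: List[str] = []
    for token in phones:
        is_punct = not any(ch.isalnum() for ch in token)
        if is_punct and out and out[-1] == token:
            continue  # same punctuation twice in a row: keep only the first
        out.append(token)
    return out


tail = ['er4', 'iao1', 'iao1', 'd', 'a4', 'x', 've2', '..', '..']
print(collapse_repeated_punct(tail))
```

Repeated phones such as the two 'iao1' tokens are left alone, since only tokens without letters or digits are treated as punctuation.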

zhuqn commented 7 months ago

How can this fix be implemented?