coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0

[Bug] Does not support mixed pronunciation of Chinese and English #3825

Open coderczp opened 1 month ago

coderczp commented 1 month ago

Describe the bug

Hello everyone, and thank you for open-sourcing Coqui TTS. It is a great TTS toolkit that performs much better than similar products in both performance and output quality. I ran into a problem while using it: when I input the following text, Coqui pronounces only the Chinese characters and ignores the English words and letters in it. How can I make Coqui support mixed Chinese and English pronunciation? Looking forward to your reply. Thank you again.

"欢迎来到某某公司,for english,please select 1" ===> The English in the text was ignored; no error was reported.

To Reproduce

python3 TTS/server/server.py --model_name tts_models/zh-CN/baker/tacotron2-DDC-GST

Expected behavior

No response

Logs

INFO:werkzeug - - [14/Jul/2024 13:22:12] "GET /favicon.ico HTTP/1.1" 404 -
 > Model input: 欢迎来到某某公司,for english,please select 1。
 > Speaker Idx:
 > Language Idx:
 > Text splitted to sentences.
['欢迎来到某某公司,for english,please select 1。']
xuan1ɨŋ2 lai2daʌ4 mou3mou3 goŋ1sɪ1 , for   english , please   select   i1 。
 [!] Character 'g' not found in the vocabulary. Discarding it.
 > Processing time: 2.1070477962493896
 > Real-time factor: 0.44043307207738364
INFO:werkzeug - - [14/Jul/2024 13:35:23] "GET /api/tts?text=欢迎来到某某公司,for%20english,please%20select%201。&speaker_id=&style_wav={"0":%200.1}&language_id= HTTP/1.1" 200 -

Environment

Here is the version information for my setup:
 version: ghcr.io/coqui-ai/tts-cpu
 platform: linux
 model: tts_models--zh-CN--baker--tacotron2-DDC-GST

Additional context

No response

Airgods commented 1 month ago

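Yes. I found that using the V2 model solves the mixed Chinese and English problem, but single English letters cannot be read out; the model mistakes them for Chinese pinyin.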

eginhard commented 1 month ago

python3 TTS/server/server.py --model_name tts_models/zh-CN/baker/tacotron2-DDC-GST

You're using a model that only supports Chinese. You'd have to synthesise the English parts with a different model.

Airgods commented 1 month ago

python3 TTS/server/server.py --model_name tts_models/zh-CN/baker/tacotron2-DDC-GST

You're using a model that only supports Chinese. You'd have to synthesise the English parts with a different model.

How should I go about doing that?

Airgods commented 1 month ago

python3 TTS/server/server.py --model_name tts_models/zh-CN/baker/tacotron2-DDC-GST

You're using a model that only supports Chinese. You'd have to synthesise the English parts with a different model.

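How should I go about doing that?

I have looked around a lot and could not find how to select this mode. I chose the v2 model and set the language to Chinese, and it can speak English, but it fails to recognise single English letters such as a b c d.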

ksyyk commented 1 month ago

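Do the segmentation yourself: work out where the Chinese and English parts of the text are, and call the corresponding model for each.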
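The segmentation step suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions, not part of Coqui TTS: the names `split_by_language` and `synthesize_mixed` are hypothetical, Chinese is detected only via the basic CJK Unified Ideographs range, and digits, punctuation, and whitespace are simply attached to whichever language run they follow.

```python
from typing import Callable, Dict, List, Tuple


def split_by_language(text: str, default: str = "zh") -> List[Tuple[str, str]]:
    """Split text into consecutive ("zh" | "en", segment) runs.

    Hypothetical helper: Han characters count as "zh", ASCII letters as "en".
    Anything else (digits, punctuation, spaces) is neutral and stays with the
    run it appears in.
    """
    segments: List[Tuple[str, str]] = []
    current_lang = None
    buffer: List[str] = []
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff":
            ch_lang = "zh"
        elif ch.isascii() and ch.isalpha():
            ch_lang = "en"
        else:
            ch_lang = None  # neutral character
        # A language switch closes the current run and starts a new one.
        if ch_lang is not None and current_lang is not None and ch_lang != current_lang:
            segments.append((current_lang, "".join(buffer)))
            buffer = []
        if ch_lang is not None:
            current_lang = ch_lang
        buffer.append(ch)
    if buffer:
        segments.append((current_lang or default, "".join(buffer)))
    return segments


def synthesize_mixed(text: str, synthesizers: Dict[str, Callable[[str], object]]) -> list:
    """Call the per-language synthesizer for each run and collect the results.

    `synthesizers` maps "zh" / "en" to any callable taking a text segment;
    what it returns (waveform, file path, ...) is up to the caller.
    """
    return [synthesizers[lang](segment) for lang, segment in split_by_language(text)]


if __name__ == "__main__":
    sample = "欢迎来到某某公司,for english,please select 1。"
    for lang, segment in split_by_language(sample):
        print(lang, repr(segment))
```

Each callable passed to `synthesize_mixed` could wrap a separate model, for example the Chinese model from this issue for `"zh"` and an English model of your choice for `"en"`. The resulting audio pieces would still need to be concatenated, and the two models may use different sample rates or voices, so some resampling may be required.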

Airgods commented 1 month ago

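Do the segmentation yourself: work out where the Chinese and English parts of the text are, and call the corresponding model for each.

OK, thanks. I have tried switching to a different model instead; doing it this way feels a bit cumbersome.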

coderczp commented 1 month ago

python3 TTS/server/server.py --model_name tts_models/zh-CN/baker/tacotron2-DDC-GST

You're using a model that only supports Chinese. You'd have to synthesise the English parts with a different model.

Okay, thank you for your reply

coderczp commented 1 month ago

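Do the segmentation yourself: work out where the Chinese and English parts of the text are, and call the corresponding model for each.

OK, thanks. I have tried switching to a different model instead; doing it this way feels a bit cumbersome.

Which model did you switch to? Did it solve the problem?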

Airgods commented 1 month ago

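Do the segmentation yourself: work out where the Chinese and English parts of the text are, and call the corresponding model for each.

OK, thanks. I have tried switching to a different model instead; doing it this way feels a bit cumbersome.

Which model did you switch to? Did it solve the problem?

Yes, it is solved. Alibaba recently open-sourced a model called CosyVoice.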