01098996 closed this 9 months ago
Does it behave like this in Edge Browser as well?
In the Edge browser, the same Chinese text is synthesized fluently and naturally, without any word-by-word choppiness.
I'm sorry, this was my mistake. My code was inserting a space between every character before passing the string to edge-tts.
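For anyone who runs into the same symptom, here is a minimal sketch of how this kind of bug can arise and one way to guard against it. The reporter's actual code was not posted, so `spaced_out` and `collapse_cjk_spaces` are hypothetical helper names, and the character range covers only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), which is an assumption that may be too narrow for some text.

```python
import re

# Assumption: only the basic CJK Unified Ideographs block is treated as "Chinese".
_CJK = r"[\u4e00-\u9fff]"
# Whitespace that sits directly between two CJK characters.
_GAP = re.compile(rf"(?<={_CJK})\s+(?={_CJK})")


def spaced_out(text: str) -> str:
    """Reproduce the bug: join every character with a space."""
    return " ".join(text)


def collapse_cjk_spaces(text: str) -> str:
    """Remove whitespace between adjacent CJK characters; leave other spaces alone."""
    return _GAP.sub("", text)


broken = spaced_out("文字再测一下")   # each character separated by a space
fixed = collapse_cjk_spaces(broken)  # spaces between CJK characters removed
```

Printing `repr(text)` just before it is handed to edge-tts is a quick way to check whether stray spaces are present.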
When using edge-tts for Chinese text-to-speech conversion, I found that the output speech is pieced together word by word instead of flowing naturally and continuously.
This severely affects the fluency of the speech and makes it sound very unnatural. I don't know where the problem is.
Steps to reproduce:
import subprocess

# Synthesize a short Chinese sentence with the edge-tts CLI.
subprocess.call([
    "edge-tts",
    "--voice", "zh-CN-XiaoxiaoNeural",
    "--text", "文字再测一下",
    "--write-media", "/tmp/output.wav",
])
Play the resulting audio file.
The speech is clearly audible as word-by-word output rather than continuous, natural speech.
Expected behavior:
The speech conversion result should be a continuous natural output of the whole sentence, rather than pieced together word by word.
Environment:
OS: Ubuntu 20.04
edge-tts version: 6.1.8
Test text: 文字再测一下
Please let me know if any other details are needed to reproduce this issue. I look forward to more natural and fluent output. Thanks!
Incorrect audio file: https://drive.google.com/file/d/1Wod4IWhD8oicEdHL8hkpPUCTT7hcG6ca/view?usp=drive_link