k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
https://k2-fsa.github.io/sherpa/onnx/index.html
Apache License 2.0

don't make connections #435

Closed: MXC48 closed this issue 12 months ago

MXC48 commented 1 year ago

The TTS is great, but when it speaks French it doesn't make the connections between words.

csukuangfj commented 1 year ago

Which model are you using?

Could you explain what "connections" means? It would be great if you could post an example audio file with the corresponding text.

MXC48 commented 1 year ago

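I use the UPMC medium model, but the problem occurs with all of them. In French, a liaison (connection) is when the normally silent last letter of a word is pronounced because the next word begins with a vowel sound. For example, in "les oranges" the final "s" of "les" is pronounced like a "z".

A video that explains it: https://www.youtube.com/watch?v=yRCD8vgohZo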
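
To make the rule concrete, here is a toy sketch of when a liaison is possible. It is a deliberate simplification and is not how sherpa-onnx or espeak-ng decide pronunciation: the helper name `liaison_sound`, the consonant table, and the vowel check are all assumptions made for illustration, and real French has many exceptions (for example, no liaison before an aspirated "h" or after "et").

```python
# Toy illustration of French liaison.
# Hypothetical helper for explanation only; not part of any TTS library.

# Normally silent final consonants and the sound they take in a liaison.
LIAISON_SOUNDS = {"s": "z", "x": "z", "z": "z", "t": "t", "d": "t", "n": "n"}

# Letters treated as starting a vowel sound. Every "h" is treated as mute
# here, which is a simplification (aspirated "h" blocks liaison).
VOWEL_STARTS = set("aeiouyhàâéèêëîïôùûœ")


def liaison_sound(first_word, next_word):
    """Return the linking consonant sound, or None if no liaison is possible."""
    if not first_word or not next_word:
        return None
    last = first_word[-1].lower()
    start = next_word[0].lower()
    if last in LIAISON_SOUNDS and start in VOWEL_STARTS:
        return LIAISON_SOUNDS[last]
    return None
```

With this sketch, `liaison_sound("les", "oranges")` gives `"z"`, while `liaison_sound("les", "pommes")` gives `None` because "pommes" starts with a consonant.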

csukuangfj commented 1 year ago

Thanks for the link. The video is very informative.

I think we have fixed the issue in #453.

Below is a sample wav generated by the following command:

python3 ./python-api-examples/offline-tts.py \
  --vits-model=./vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx \
  --vits-tokens=./vits-piper-fr_FR-siwis-low/tokens.txt \
  --vits-data-dir=./vits-piper-fr_FR-siwis-low/espeak-ng-data \
  --debug=1 \
  --output-filename=./siwis-fr.wav \
  "les oranges"

https://github.com/k2-fsa/sherpa-onnx/assets/5284924/cbe511be-3296-46ac-9c2d-24bb51a4488b


You can find the models at https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models
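
If you want to check several liaison phrases in one go, a small wrapper around the command above may help. This is only a sketch: the helper names `build_tts_command` and `run_phrases` and the output file names are hypothetical, the flags are copied from the command above, and it assumes you run it from the sherpa-onnx repository root with the model directory already unpacked there.

```python
import subprocess


def build_tts_command(model_dir, model_name, text, output):
    """Build the argument list for python-api-examples/offline-tts.py.

    Hypothetical helper; it only reuses flags shown in the command above.
    """
    return [
        "python3",
        "./python-api-examples/offline-tts.py",
        f"--vits-model={model_dir}/{model_name}.onnx",
        f"--vits-tokens={model_dir}/tokens.txt",
        f"--vits-data-dir={model_dir}/espeak-ng-data",
        f"--output-filename={output}",
        text,
    ]


def run_phrases(model_dir, model_name, phrases):
    """Generate one wav per phrase; stops at the first failing command."""
    for i, phrase in enumerate(phrases):
        cmd = build_tts_command(model_dir, model_name, phrase, f"./liaison-{i}.wav")
        subprocess.run(cmd, check=True)
```

For example, `run_phrases("./vits-piper-fr_FR-siwis-low", "fr_FR-siwis-low", ["les oranges", "les amis"])` would write `liaison-0.wav` and `liaison-1.wav`, which you can then listen to and compare.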

csukuangfj commented 1 year ago

@MXC48

Please try it using our huggingface space. https://huggingface.co/spaces/k2-fsa/text-to-speech

MXC48 commented 1 year ago

Yes, that's much better. Thank you very much!

csukuangfj commented 1 year ago

Shall I mark it as fixed now?

MXC48 commented 11 months ago

Yes, you can!