Open csukuangfj opened 6 months ago
@csukuangfj Currently, which model sounds closest to human quality on sherpa-onnx: Coqui or Piper TTS models? And are these two the only ones sherpa-onnx supports?
Please visit https://huggingface.co/spaces/k2-fsa/text-to-speech to try all supported tts models.
There are more than 100 TTS models, and the best way to find out which one sounds best to you is to try them yourself. You don't need to install anything to try them.
And are these two the only ones sherpa-onnx supports?
No.
sherpa-onnx currently supports VITS TTS models, and it is not limited to Coqui or Piper.
I've actually tried a couple of them in the past. I was hoping you'd have a "top 3" model list. What I noticed with sherpa-onnx is that there's a trade-off between quality and on-device processing compared to cloud solutions. For example, standard Coqui TTS models sound okay, but once converted to sherpa-onnx the quality and intonation go down. Are there any tips or tricks for getting good quality on sherpa-onnx?
For example, standard Coqui TTS models sound okay, but once converted to sherpa-onnx the quality and intonation go down
Could you describe which model you are using? @nanaghartey
I'm using my own fine-tuned Coqui and Piper TTS VITS models. Both sound good before converting to sherpa-onnx, but this is also the case for the various other English models I tried out.
FYI: We now support Piper models in https://github.com/k2-fsa/sherpa-onnx
Note that it does not depend on https://github.com/rhasspy/piper-phonemize
sherpa-onnx supports a variety of platforms.
It also provides APIs for various programming languages, e.g., C/C++/Python/Kotlin/Swift/C#/Go. We also have Android APKs for TTS.
You can find the installation doc at https://k2-fsa.github.io/sherpa/onnx/install/index.html
You can find the usage of piper models with sherpa-onnx at https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/vits.html#lessac-blizzard2013-medium-english-single-speaker
We also have a huggingface space for you to try piper models with sherpa-onnx. Please visit https://huggingface.co/spaces/k2-fsa/text-to-speech
You can find the PR supporting piper in sherpa-onnx at https://github.com/k2-fsa/sherpa-onnx/pull/390
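For anyone who wants to script the Piper usage described in the docs linked above, here is a minimal sketch that only assembles a command line for the `sherpa-onnx-offline-tts` binary. It assumes a model directory laid out like the pre-trained Piper packages (an `.onnx` model, `tokens.txt`, and an `espeak-ng-data` folder). The directory, file names, and the helper `build_tts_command` are hypothetical placeholders, so adjust them to whatever you actually downloaded.

```python
import shlex
from pathlib import Path


def build_tts_command(model_dir, model_name, text, output_wav,
                      binary="sherpa-onnx-offline-tts"):
    """Assemble an argument list for the sherpa-onnx offline TTS binary.

    model_dir and model_name are placeholders (assumptions); the layout
    (tokens.txt and espeak-ng-data next to the model) is assumed to match
    the pre-trained Piper packages.
    """
    model_dir = Path(model_dir)
    return [
        binary,
        f"--vits-model={model_dir / model_name}",
        f"--vits-tokens={model_dir / 'tokens.txt'}",
        f"--vits-data-dir={model_dir / 'espeak-ng-data'}",
        f"--output-filename={output_wav}",
        text,  # the text to synthesize is the final positional argument
    ]


if __name__ == "__main__":
    cmd = build_tts_command(
        "./vits-piper-en_US-lessac-medium",  # hypothetical local path
        "en_US-lessac-medium.onnx",          # hypothetical file name
        "Hello from sherpa-onnx.",
        "./hello.wav",
    )
    # Print a shell-safe version of the command; pass `cmd` to
    # subprocess.run(cmd, check=True) once the binary and model exist.
    print(shlex.join(cmd))
```

Building the arguments as a list rather than a single string keeps text containing quotes or spaces intact when it is later handed to `subprocess.run`.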