rhasspy / larynx

End to end text to speech system using gruut and onnx
MIT License

Question about quality of voice #68

Closed thevickypedia closed 1 year ago

thevickypedia commented 1 year ago

Hello,

I'm looking for clarification on how to get text converted to speech at low quality. I run larynx in a Docker container and use Python's requests module to make a POST call to the endpoint 0.0.0.0:5002/api/tts to get the audio output.

import requests

requests.post(
    url="http://0.0.0.0:5002/api/tts", params={"voice": "en-us_northern_english_male-glow_tts"},
    data='Welcome to the world of speech synthesis!', headers={"Content-Type": "text/plain"}, verify=False
)

I tried adding 'quality': 'low' to the params, but the vocoder quality keeps defaulting to VocoderQuality.HIGH.

Your readme says,

You can specify the vocoder quality by adding ;<QUALITY> to the MaryTTS voice where QUALITY is "high", "medium", or "low".

I naively tried using params={"voice": "en-us_northern_english_male-glow_tts;low"}, but that produced the following error because the suffix ends up in the voice download URL and malforms it:

ERROR:larynx:Failed to download voice en-us_northern_english_male-glow_tts;low from http://github.com/rhasspy/larynx/releases/download/v1.0/en-us_northern_english_male-glow_tts;low.tar.gz: HTTP Error 404: Not Found

Any clarity on this would be very helpful. Thank you.

thevickypedia commented 1 year ago

After looking into the Docker logs, I figured out that it is simply the vocoder param:

requests.post(
    url="http://0.0.0.0:5002/api/tts", params={"voice": "en-us_northern_english_male-glow_tts", "vocoder": "low"},
    data='Welcome to the world of speech synthesis!', headers={"Content-Type": "text/plain"}, verify=False
)
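For anyone who wants to avoid repeating the same mistake, here is a minimal sketch of a helper that keeps the quality out of the voice name and passes it as the separate vocoder query parameter. The function name build_tts_request and the validation step are hypothetical and not part of larynx. It also assumes the vocoder parameter accepts all three quality names from the README ("high", "medium", "low"); only "low" was tried above. It uses the standard library's urllib instead of requests so it runs without extra dependencies, and it only builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Quality names quoted from the README; assumed to be valid "vocoder" values.
VALID_QUALITIES = ("high", "medium", "low")


def build_tts_request(text, voice, quality="low",
                      base_url="http://0.0.0.0:5002/api/tts"):
    """Build (but do not send) a POST request for the /api/tts endpoint.

    Hypothetical helper for illustration; not part of larynx itself.
    """
    if quality not in VALID_QUALITIES:
        raise ValueError(
            f"quality must be one of {VALID_QUALITIES}, got {quality!r}"
        )
    # The quality goes in its own "vocoder" parameter, never in the voice
    # name, so the voice download URL stays intact.
    query = urlencode({"voice": voice, "vocoder": quality})
    return Request(
        url=f"{base_url}?{query}",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


req = build_tts_request(
    "Welcome to the world of speech synthesis!",
    "en-us_northern_english_male-glow_tts",
)
print(req.full_url)
```

With the container running, passing the result to urllib.request.urlopen(req) and calling .read() on the response should return the audio bytes, the same as the requests.post call above.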

Hence, closing the issue.