collabora / WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.
https://collabora.github.io/WhisperSpeech/
MIT License

[Q] How to use 7lang model? #148

Open oleid opened 3 months ago

oleid commented 3 months ago

Dear WhisperSpeech maintainers,

I found multi-language models such as s2a-v1.95-medium-7lang.model on Hugging Face. When I try to use them with example/text_to_audio_playback.py by setting model_ref = "collabora/whisperspeech:s2a-v1.95-medium-7lang.model", I only get strange-sounding voice output. The default models sound fine.

Is it supposed to work, or what am I missing?

Thanks a lot,

My system setup:

PeterMesihaDev commented 3 days ago

There are s2a and t2s models. The 7lang in the name indicates that the model works with 7 languages. When using the generate_to method, you can pass a parameter called lang that defines the language of the output. If no lang is passed, it defaults to en (English).
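
To make that concrete, here is a minimal sketch of passing lang explicitly. It assumes the method meant above is Pipeline.generate_to_file from whisperspeech.pipeline and that it accepts lang as a keyword argument. The synthesize helper, the output file name, and the "de" language code are only illustrative, so check which language codes the 7lang model actually supports.

```python
# Model reference taken from the question above.
S2A_REF = "collabora/whisperspeech:s2a-v1.95-medium-7lang.model"


def synthesize(text, lang="en", out_path="output.wav", s2a_ref=S2A_REF):
    """Hypothetical helper: write `text` to `out_path`, spoken in `lang`.

    Assumption: Pipeline.generate_to_file(fname, text, lang=...) exists
    with this shape in the installed WhisperSpeech version.
    """
    # Imported lazily so that defining the helper does not load any model.
    from whisperspeech.pipeline import Pipeline

    pipe = Pipeline(s2a_ref=s2a_ref)
    # Without lang, the output language falls back to "en".
    pipe.generate_to_file(out_path, text, lang=lang)
    return out_path


# Example call (downloads the model on first use):
# synthesize("Guten Tag, wie geht es Ihnen?", lang="de")
```

If the output still sounds wrong with lang set, it may be worth checking that the t2s model in use is a multilingual one as well, since both stages take part in generation.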