[Q] How to use 7lang model?

collabora / WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

MIT License

Dear WhisperSpeech maintainers,

I found multi-language models such as s2a-v1.95-medium-7lang.model on Hugging Face. When I try to use one with example/text_to_audio_playback.py by setting model_ref = "collabora/whisperspeech:s2a-v1.95-medium-7lang.model", I only get strange-sounding voice output. The default models sound fine.

Is this supposed to work, or am I missing something?
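For reference, here is a minimal sketch of how I read that model reference string. It assumes the value follows a "repo_id:filename" convention, with the Hugging Face repository before the colon and the checkpoint file after it. The helper name split_model_ref is hypothetical and not part of WhisperSpeech; it only illustrates the string format and does not load any model.

```python
def split_model_ref(ref: str) -> tuple[str, str]:
    """Split a 'repo_id:filename' reference into its two parts.

    Assumption: the reference uses a single colon to separate the
    Hugging Face repository id from the checkpoint file name.
    """
    repo_id, sep, filename = ref.partition(":")
    if not sep or not repo_id or not filename:
        raise ValueError(f"expected 'repo_id:filename', got {ref!r}")
    return repo_id, filename


# The reference string used in this report.
model_ref = "collabora/whisperspeech:s2a-v1.95-medium-7lang.model"
repo_id, filename = split_model_ref(model_ref)
print("repository:", repo_id)
print("checkpoint:", filename)
```

So the reference points at the file s2a-v1.95-medium-7lang.model in the collabora/whisperspeech repository, which is the file I found there.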

Thanks a lot,

My system setup:

Gentoo Linux
ROCm 6.1 with Radeon 7900 XTX
PyTorch 2.4 with ROCm 6.1 support
Python 3.12

[Q] How to use 7lang model? #148