collabora / WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.
https://collabora.github.io/WhisperSpeech/
MIT License
4.01k stars, 218 forks

Minimize and stabilize the inference dependencies #38

Closed jpc closed 10 months ago

jpc commented 10 months ago

It would be great to minimize the dependencies that need to be installed for the inference demo.

Training needs many more packages, but the inference code should be largely self-contained: we should be able to drop WhisperX, Whisper, and even the SpeechBrain model by default. This would speed up the Colab demo a lot.
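One common way to keep heavy packages out of the default inference path is to import them lazily, only when a feature that needs them is actually used. The following is a minimal sketch of that idea, not the project's actual code; `optional_import` and `load_speaker_encoder_backend` are hypothetical names introduced only for illustration.

```python
import importlib


def optional_import(module_name, feature):
    """Import a module only at the point where a feature needs it.

    Hypothetical helper: returns the imported module, or raises an
    ImportError that names the missing optional package.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional package '{module_name}'. "
            "Install it to enable this feature."
        ) from exc


def load_speaker_encoder_backend():
    # Hypothetical optional code path: the import happens here, so plain
    # inference never has to install or load this package.
    return optional_import("speechbrain", "Speaker embedding extraction")
```

With this pattern, importing the inference module itself does not touch the optional package; the cost (and the install requirement) only applies to users who call the optional feature.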

It would also be good to freeze (pin) the dependency versions so we avoid surprise warnings like these: https://github.com/collabora/WhisperSpeech/discussions/35
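As a rough illustration of what frozen versions buy us, a small check like the one below can compare the installed packages against a pinned set and report any drift before it turns into a runtime warning. This is only a sketch under stated assumptions: `find_pin_mismatches` is a hypothetical helper, and the package names and version numbers in the usage example are placeholders, not the project's real pins.

```python
from importlib import metadata


def find_pin_mismatches(pins, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch.

    `pins` maps package names to the exact version we expect.
    `get_version` is injectable so the check can be tested without
    installing anything; it defaults to importlib.metadata.version.
    """
    mismatches = {}
    for package, expected in pins.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    # Placeholder pins for illustration only.
    example_pins = {"torch": "2.1.0", "numpy": "1.26.0"}
    for name, (want, have) in find_pin_mismatches(example_pins).items():
        print(f"{name}: expected {want}, found {have}")
```

An empty result means the environment matches the pins exactly; anything else lists the expected version next to what is actually installed (or `None` if the package is missing).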

jpc commented 10 months ago

This is done.