It would be great to minimize the dependencies that need to be installed for the inference demo.
Training needs many more packages, but the inference code should be fairly self-contained (we should be able to drop WhisperX, Whisper, and even the SpeechBrain model by default). This would speed up the Colab demo a lot, since most of its startup time goes to installing packages.
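One way to keep the heavy packages out of the default install is to import them lazily, only when a feature that needs them is actually used. The snippet below is just a rough sketch of that idea; `optional_import` is a hypothetical helper name, not something that exists in the repo today.

```python
import importlib


def optional_import(module_name, feature):
    """Import `module_name` on demand (hypothetical helper, not in the repo).

    Raises an ImportError that names the feature needing the package,
    so inference-only users are never forced to install it up front.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional package '{module_name}'. "
            "Install it only if you need this feature."
        ) from exc
```

Training-only code paths could then call something like `optional_import("whisperx", "training")` inside the function that needs it, instead of importing at module top level.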
It would also be good to freeze (pin) the dependency versions so we avoid surprise warnings like these: https://github.com/collabora/WhisperSpeech/discussions/35
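For the version freeze, the demo notebook could also check at startup that the installed packages match the pins and print a clear message if they don't. A minimal sketch, assuming the pins are kept in a plain dict; `check_pins` is a hypothetical name and the actual package list and versions would still need to be decided:

```python
from importlib.metadata import PackageNotFoundError, version


def check_pins(pins):
    """Compare installed distribution versions against exact pins.

    `pins` maps distribution name -> expected version string.
    Returns a list of human-readable mismatch messages (empty if all match).
    """
    problems = []
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name}: not installed (wanted {wanted})")
            continue
        if installed != wanted:
            problems.append(f"{name}: installed {installed}, wanted {wanted}")
    return problems
```

The same pins would go into the install requirements as `name==version` entries, so this check only catches cases where the environment drifted after install.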