Open Khaztaroth opened 11 months ago
Extra extra testing:
Naively, I had updated to the latest version of PyTorch through pip rather than conda. I'm not sure what the difference is under the hood, since it doesn't throw any errors or warnings when running Whisper. However, it causes diarization to take several hours longer and use 3x the memory.
Creating the environment from scratch, making sure to use conda for PyTorch, yielded the expected results.
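For anyone comparing a pip install against a conda install, a minimal sketch like the one below prints which PyTorch build the current interpreter actually loads. It is only a diagnostic aid, not something from the WhisperX repo. The helper name `describe_torch_build` is made up for illustration. Whether a CPU-only or mismatched CUDA build explains the slowdown described above is an assumption that this thread does not confirm.

```python
def describe_torch_build():
    """Return basic facts about the PyTorch build visible to this interpreter.

    Hypothetical helper for illustration; not part of WhisperX or PyTorch.
    """
    info = {
        "installed": False,
        "version": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    try:
        import torch
    except (ImportError, OSError):
        # PyTorch is missing or failed to load in this environment.
        return info

    info["installed"] = True
    info["version"] = str(torch.__version__)
    # torch.version.cuda is None when the installed build has no CUDA support.
    info["cuda_build"] = torch.version.cuda
    # True only if a CUDA build is installed and a usable GPU/driver is found.
    info["cuda_available"] = bool(torch.cuda.is_available())
    return info


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

Running this once inside the conda environment and once from a default Windows prompt should show whether the two are picking up different PyTorch builds.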
A side effect of this seems to be that WhisperX can't be used outside of a conda environment, which prevents it from being comfortably integrated into tools like Subtitle Edit, which can now use Whisper and its variants to automatically create subtitles.
Every way of installing and using WhisperX that I tried has the same problem when run from a default Windows prompt: it does not correctly use the GPU for transcription or diarization. In Subtitle Edit, it returns a single period character instead of a proper transcription like the ones vanilla Whisper or even Faster-Whisper produce.
I know this repo is more of a proof of concept than a tool intended for mass use. However, it consistently yields results that I'm happier with than those of other Whisper forks. It would be useful for it to work outside of a conda environment.
Diarization runs very slowly, uses almost 12 GB of memory, and is seemingly not happening on the GPU (GPU-Z and Windows Task Manager show conflicting info).
On interrupting the diarization step, the last call shows the following segment of code. It points to something happening on the CPU, but I'm not sure if it's the main process. Admittedly, I don't understand Python code very well.
Extra testing:
It seems that, at least in my particular setup, the diarization model couldn't access the dedicated GPU over the integrated one. Setting my system to use only the dedicated GPU for everything ensured that it ran on it.
Memory usage is still high, and it takes much longer than previously. However, those could very well be issues with the diarization model and not WhisperX's implementation.
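To see which GPUs PyTorch itself can reach on a machine with both an integrated and a dedicated GPU, a rough sketch along these lines may help. The function name `pick_device` and its `prefer_index` parameter are hypothetical. It only reports the CUDA devices that PyTorch enumerates (an integrated GPU usually will not appear there). It does not prove that the diarization model was actually placed on the chosen device.

```python
def pick_device(prefer_index=0):
    """Return (device_string, visible_cuda_device_names).

    Hypothetical helper for illustration. Falls back to "cpu" when PyTorch
    is not installed or reports no usable CUDA device.
    """
    try:
        import torch
    except (ImportError, OSError):
        return "cpu", []

    if not torch.cuda.is_available():
        return "cpu", []

    # Names of every CUDA device PyTorch can see, in index order.
    names = [
        torch.cuda.get_device_name(i)
        for i in range(torch.cuda.device_count())
    ]
    # Use the requested index if it exists, otherwise fall back to device 0.
    index = prefer_index if 0 <= prefer_index < len(names) else 0
    return f"cuda:{index}", names


if __name__ == "__main__":
    device, names = pick_device()
    print("selected device:", device)
    for i, name in enumerate(names):
        print(f"  cuda:{i} -> {name}")
```

If this prints `cpu` from a plain Windows prompt but a `cuda:` device inside the conda environment, that would at least be consistent with the behavior reported here.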