miguelvalente / whisperer

Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.

Diarization not using GPU #47

Open Kenji776 opened 4 months ago

Kenji776 commented 4 months ago

Hello, and first off, thank you for your awesome work on this project. It's very cool and I'm looking forward to getting it working correctly. The install is done and it seems to be technically working, but the diarization does not seem to be using my GPU. I have an RTX 4090, and CUDA is installed (both v12.0 and v12.4). Processing a 23-minute file took approximately 10 minutes, during which there was no reported load on the GPU while the CPU was almost fully utilized. Is this expected, or am I correct in assuming that the GPU was not in use? Is there anything I should check or adjust?

This is the start of the output of the command: [image]

Here is the output from nvidia-smi while the process is running: [image]

Task manager showing minimal GPU use during the process: [image]

Thanks again!

Kenji776 commented 4 months ago

I figured it out. I only recently upgraded to a CUDA-capable card but had not replaced my PyTorch install with a CUDA-enabled build. Using the install command provided by https://pytorch.org/get-started/locally/, I was able to update my install of torch and successfully run the diarization on my GPU.
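For anyone hitting the same symptom, a quick way to tell a CPU-only PyTorch build from a CUDA-enabled one is to ask torch directly. The following is a minimal sketch, not part of whisperer; the helper name `describe_torch_device` is made up for illustration, and it only relies on `torch.version.cuda`, `torch.cuda.is_available()`, and `torch.cuda.get_device_name()`.

```python
def describe_torch_device():
    """Summarize whether the installed PyTorch build can use a CUDA GPU.

    Hypothetical helper for illustration; it is not part of whisperer.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment at all.
        return {
            "torch_installed": False,
            "cuda_build": None,
            "cuda_available": False,
            "device": "cpu",
        }

    cuda_available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        # torch.version.cuda is None when the wheel was built without CUDA.
        "cuda_build": torch.version.cuda,
        "cuda_available": cuda_available,
        "device": torch.cuda.get_device_name(0) if cuda_available else "cpu",
    }


if __name__ == "__main__":
    for key, value in describe_torch_device().items():
        print(f"{key}: {value}")
```

If `cuda_build` comes back as `None`, the installed wheel is most likely a CPU-only build, which would match the situation described above even with a working driver and `nvidia-smi`.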

miguelvalente commented 3 months ago

Good to hear you managed it. There's still an open issue: out of laziness on my part, the Hugging Face token is hardcoded. It would be nice if the token could instead be set or provided by the user.
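One possible way to avoid the hardcoded token is to accept it from the user and fall back to an environment variable. This is only a sketch under assumptions: the function name `resolve_hf_token` and the `cli_token` parameter are hypothetical and do not exist in whisperer, and `HF_TOKEN` is used here because it is the variable name the Hugging Face Hub client reads; the project may prefer a different name.

```python
import os


def resolve_hf_token(cli_token=None, env=None):
    """Return a Hugging Face token from an explicit value or the environment.

    Hypothetical helper: an explicitly passed token wins, otherwise the
    HF_TOKEN environment variable is used (assumed name, see lead-in).
    """
    env = os.environ if env is None else env
    for candidate in (cli_token, env.get("HF_TOKEN")):
        # Skip missing or whitespace-only values.
        if candidate and candidate.strip():
            return candidate.strip()
    raise RuntimeError(
        "No Hugging Face token found. Pass one explicitly or set HF_TOKEN."
    )


if __name__ == "__main__":
    try:
        token = resolve_hf_token()
        print(f"Token found ({len(token)} characters).")
    except RuntimeError as err:
        print(err)
```

The returned value could then be passed wherever the token is currently hardcoded, so nothing secret needs to live in the repository.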