m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Diarization process in Whisperx does not utilize GPU #542

Open · SerebryanskiySergei opened this issue 11 months ago

SerebryanskiySergei commented 11 months ago

I have updated the package to the latest version, which includes the merged pyannote.audio 3.0.1. However, I am still experiencing slow diarization processing times.

After checking Task Manager, I noticed that GPU usage sits at only 1-2% during diarization (I am using a 4080 GPU). At the same time, the transcription step pushes the GPU to almost 98%, so I'm confident the setup itself is correct.

Do you have any ideas on how to resolve this issue?
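For reference, the call pattern is essentially the README's diarization example (the alignment step is omitted for brevity; HF_TOKEN and the audio path below are placeholders):

import whisperx

device = "cuda"
HF_TOKEN = "..."  # placeholder: Hugging Face access token with pyannote model access

audio = whisperx.load_audio("audio.wav")  # placeholder path

# transcription: this step does push the GPU to ~98%
model = whisperx.load_model("large-v2", device, compute_type="float16")
result = model.transcribe(audio, batch_size=16)

# diarization: device is passed explicitly, yet GPU usage stays at 1-2%
diarize_model = whisperx.DiarizationPipeline(use_auth_token=HF_TOKEN, device=device)
diarize_segments = diarize_model(audio)
result = whisperx.assign_word_speakers(diarize_segments, result)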

sam1am commented 11 months ago

See here: https://github.com/m-bain/whisperX/issues/499

SerebryanskiySergei commented 11 months ago

> See here: #499

yeah, I've checked that before making an issue, thank you for pointing this out.

Maybe I'm missing something, but the author there is talking about version 3.0.0, which doesn't work on the GPU. With the fix merged the previous week, however, pyannote.audio was updated to 3.0.1, and that version should be fine with GPU usage.

But the problem still exists.
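For what it's worth, my understanding of the pyannote.audio 3.x API is that a pipeline only uses the GPU if it is explicitly moved there. A minimal sketch, assuming the pyannote/speaker-diarization-3.0 checkpoint and a valid Hugging Face token:

import torch
from pyannote.audio import Pipeline

# loading alone keeps the model on the CPU
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.0",
    use_auth_token="HF_TOKEN",  # placeholder
)

# pyannote 3.x does not select a device automatically; it must be moved explicitly
pipeline.to(torch.device("cuda"))

diarization = pipeline("audio.wav")  # placeholder path

So the question is whether the whisperX wrapper performs this move when device="cuda" is passed to DiarizationPipeline.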

iamianM commented 11 months ago

Hi, thanks for bringing this up. I'm having the same issue. I was hoping the update would resolve it, but diarization is still stuck on the CPU.

manjunath7472 commented 11 months ago

Assuming CUDA and cuDNN are installed on your PC:

Before installing whisperX, install PyTorch matching the CUDA version installed on your PC. For CUDA 11.8:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Then install whisperX

After installation, confirm CUDA is available by running the code below.

import torch
print(torch.cuda.is_available())
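If is_available() returns True but the GPU still sits idle during diarization, a slightly fuller check with standard torch introspection can help rule out a mismatch between the wheel, CUDA, and cuDNN:

import torch

print(torch.__version__)           # PyTorch build, e.g. a +cu118 wheel
print(torch.version.cuda)          # CUDA version the wheel was built against
print(torch.cuda.is_available())   # True if a CUDA device is usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))      # the 4080 should show up here
print(torch.backends.cudnn.is_available())    # cuDNN usable by PyTorch
print(torch.backends.cudnn.version())         # cuDNN version number

If any of these look inconsistent, reinstalling the matching +cu118 wheel is usually the quickest fix.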

SerebryanskiySergei commented 11 months ago

> Assuming CUDA and cuDNN are installed on your PC:
>
> Before installing whisperX, install PyTorch matching the CUDA version installed on your PC. For CUDA 11.8:
>
> pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
>
> Then install whisperX
>
> After installation, confirm CUDA is available by running the code below.
>
> import torch
> print(torch.cuda.is_available())

I thought about that too. As a sanity check, I built and ran the official CUDA sample application (deviceQuery), and it worked, so CUDA itself is set up correctly on this machine.
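Since deviceQuery rules out the system-level CUDA install, a rough way to see whether the diarization weights ever land on the GPU is to watch torch's memory counters around the call (assuming the whisperx.DiarizationPipeline wrapper; HF_TOKEN and the audio path are placeholders):

import torch
import whisperx

device = "cuda"
HF_TOKEN = "..."  # placeholder: Hugging Face access token

diarize_model = whisperx.DiarizationPipeline(use_auth_token=HF_TOKEN, device=device)

# if the pyannote pipeline was really moved to CUDA, its weights already occupy
# GPU memory at this point; a value near zero suggests they stayed on the CPU
print(torch.cuda.memory_allocated() / 1e6, "MB allocated after loading the pipeline")

audio = whisperx.load_audio("audio.wav")  # placeholder path
diarize_segments = diarize_model(audio)
print(torch.cuda.max_memory_allocated() / 1e6, "MB peak during diarization")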