MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License

Kernel crash in Jupyter notebook while processing lengthy audio. #106

Closed manjunath7472 closed 2 months ago

manjunath7472 commented 9 months ago

Hi there. My specs: RTX 4000 (8 GB), CUDA 11.8, cuDNN 8.9.1, 32 GB RAM. Model used: large-v2, float16. Input audio length: 45 min.

When the input audio is longer than 30 min, the kernel crashes. I tried medium.en with int8 compute and still got the same issue. The process also stops in a normal run.

MahmoudAshraf97 commented 9 months ago

In which step does it crash?

manjunath7472 commented 9 months ago

In the transcribe cell, we need to remove the lines below to avoid the kernel crash while transcribing. Removing them also solves the problem where all dialogue ends up under one speaker.

del whisper_model
torch.cuda.empty_cache()
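For anyone who wants to try this without deleting the lines outright, the cleanup could be put behind a switch that defaults to off. This is only a minimal sketch: `maybe_release` and its `enabled` flag are hypothetical names, not part of whisper-diarization, and it assumes the model is held in a variable named `whisper_model` as in the lines above.

```python
import gc


def maybe_release(namespace, name, enabled=False):
    """Optionally drop a model reference and clear the CUDA cache.

    Hypothetical helper, not part of whisper-diarization.
    With enabled=False (the default) nothing is touched, which
    matches the workaround of removing the two cleanup lines.
    """
    if not enabled:
        return False

    # Equivalent of `del whisper_model` for a dict-like namespace.
    namespace.pop(name, None)
    gc.collect()

    try:
        import torch  # only needed for the CUDA cache step
    except ImportError:
        return True

    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    return True
```

In the notebook this would be called as `maybe_release(globals(), "whisper_model")` in place of the two lines; passing `enabled=True` restores the original cleanup for comparison.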