MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License
3.75k stars, 329 forks

Time taken seems longer #256

Closed by deepakitkar 1 month ago

deepakitkar commented 1 month ago

I am using an 8 GB GPU and a 12th Gen Intel i5 CPU with 32 GB of RAM. A 485 KB audio file containing a 4-minute conversation takes 105 s to process. Is this normal?

MahmoudAshraf97 commented 1 month ago

Processing a 30-minute file takes around 2 to 3 minutes for me on an RTX 3070 Ti. You should investigate where the slowdown on your setup comes from so I can assist with it.
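One way to narrow down where the time goes is to wrap each stage of the pipeline in a timer and compare the results. The following is only a minimal sketch: the `timed` helper and the stage names are hypothetical, and the `time.sleep` calls are placeholders standing in for the real model-loading and transcription calls in your own script.

```python
import time
from contextlib import contextmanager

# Collected wall-clock durations, keyed by stage name (names are illustrative).
timings = {}

@contextmanager
def timed(stage):
    """Record how long the wrapped block takes, in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start

# Placeholders: replace the sleeps with the actual calls you want to measure.
with timed("load_model"):
    time.sleep(0.05)
with timed("transcribe"):
    time.sleep(0.01)

# Report stages from slowest to fastest.
slowest = max(timings, key=timings.get)
for stage, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{stage}: {seconds:.3f} s")
```

If one stage dominates (for example model loading), that is the part worth reporting in the issue.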

deepakitkar commented 1 month ago

I see that it spends the most time at the step that logs the following warning. I have internet disabled.

[NeMo W 2024-10-22 12:25:31 nemo_logging:349] /root/anaconda3/envs/whisper_diar2/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
      warnings.warn(

MahmoudAshraf97 commented 1 month ago

This is an issue with loading the model from disk. You need to use a faster disk or find a faster way to load the model without internet access.
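Since the warning above comes from `huggingface_hub`, one thing that may be worth trying on a machine without internet is the library's offline mode, which makes it use the local cache instead of attempting network requests. This is a sketch under assumptions, not a confirmed fix for this issue: `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` are real environment variables, but whether they remove the delay here has not been verified, and the `offline_env` helper is a hypothetical name.

```python
import os

def offline_env(base=None):
    """Return a copy of the environment with Hugging Face offline mode enabled.

    Assumption: the models are already in the local cache, so no download
    is needed. The variables must be set before huggingface_hub is imported.
    """
    env = dict(os.environ if base is None else base)
    env["HF_HUB_OFFLINE"] = "1"        # huggingface_hub: never hit the network
    env["TRANSFORMERS_OFFLINE"] = "1"  # transformers: same, for its own loaders
    return env

env = offline_env()
print(env["HF_HUB_OFFLINE"], env["TRANSFORMERS_OFFLINE"])
```

The returned mapping can be passed as `env=` to `subprocess.run` when launching the diarization script, or the same two variables can be exported in the shell before starting Python.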