MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License

Are there hardware dependencies? #37

Closed by J-Curwell 1 year ago

J-Curwell commented 1 year ago

I followed the instructions in the README and successfully installed all specified dependencies. When I try to run the package from the command line on a Windows Surface 3 laptop running Windows 10 Pro, I get this error:

❯ python diarize.py -a MY_FILE.mp3

[NeMo W 2023-05-05 12:07:02 optimizers:54] Apex was not found. Using the lamb or fused_adam optimizer will error out.
[NeMo W 2023-05-05 12:07:05 experimental:27] Module <class 'nemo.collections.asr.modules.audio_modules.SpectrogramToMultichannelFeatures'> is experimental, not ready for production and is not fully supported. Use at your own risk.
Python was not found; run without arguments to install from the Microsoft Store, or disable this shortcut from Settings > Manage App Execution Aliases.
Source splitting failed, using original audio file. Use --no-stem argument to disable it.
Traceback (most recent call last):
  File "<MY LOCAL PATH>/diarize.py", line 56, in <module>
    whisper_model = WhisperModel(args.model_name, device="cuda", compute_type="float16")
  File "<MY VENVS FOLDER>\whisper-diarization\lib\site-packages\faster_whisper\transcribe.py", line 120, in __init__
    self.model = ctranslate2.models.Whisper(
RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version

Is this a hardware issue or am I missing something? Thank you!

MahmoudAshraf97 commented 1 year ago

The code is configured for NVIDIA GPUs by default, since NVIDIA NeMo requires one. You can change this by setting every device argument to "cpu", for example the device="cuda" passed to WhisperModel in diarize.py.
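
As a rough illustration of that change, here is a minimal sketch that picks the device at runtime instead of hard-coding "cuda". The helper name pick_device_and_compute_type is hypothetical and not part of the repository. It assumes PyTorch is installed (it is a NeMo dependency) and that "int8" is an acceptable compute type on CPU; float16 is generally not usable on CPU with CTranslate2, so the compute type probably needs to change along with the device.

```python
def pick_device_and_compute_type(prefer_cuda=True):
    """Return a (device, compute_type) pair for faster-whisper.

    Hypothetical helper, not part of whisper-diarization.
    Falls back to CPU when CUDA is not requested or not usable.
    """
    has_cuda = False
    if prefer_cuda:
        try:
            import torch  # assumed to be installed alongside NeMo

            # Reports False when no usable NVIDIA driver/GPU is present.
            has_cuda = torch.cuda.is_available()
        except ImportError:
            has_cuda = False

    if has_cuda:
        return "cuda", "float16"
    # Assumption: int8 on CPU; "float32" is another option.
    return "cpu", "int8"


device, compute_type = pick_device_and_compute_type()

# In diarize.py, the call from the traceback would then become:
# whisper_model = WhisperModel(
#     args.model_name, device=device, compute_type=compute_type
# )
```

Other places in the script that pass a device (for example to the NeMo components) would need the same treatment, and running on CPU will likely be much slower than on a GPU.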