Open kc01-8 opened 2 months ago
PS F:\whisperX-main> whisperx audio.mp4 --model large-v2 --diarize --highlight_words True --min_speakers 5 --max_speakers 5 --hf_token hf_x
C:\Users\kc01\AppData\Roaming\Python\Python310\site-packages\pyannote\audio\core\io.py:43: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
  torchaudio.set_audio_backend("soundfile")
Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\kc01\AppData\Roaming\Python\Python310\Scripts\whisperx.exe\__main__.py", line 7, in <module>
  File "C:\Users\kc01\AppData\Roaming\Python\Python310\site-packages\whisperx\transcribe.py", line 170, in cli
    model = load_model(model_name, device=device, device_index=device_index, download_root=model_dir, compute_type=compute_type, language=args['language'], asr_options=asr_options, vad_options={"vad_onset": vad_onset, "vad_offset": vad_offset}, task=task, threads=faster_whisper_threads)
  File "C:\Users\kc01\AppData\Roaming\Python\Python310\site-packages\whisperx\asr.py", line 288, in load_model
    model = model or WhisperModel(whisper_arch,
  File "C:\Users\kc01\AppData\Roaming\Python\Python310\site-packages\faster_whisper\transcribe.py", line 133, in __init__
    self.model = ctranslate2.models.Whisper(
ValueError: Requested float16 compute type, but the target device or backend do not support efficient float16 computation.
Happens using a 3080ti, which works flawlessly with NVidia NeMo. Completely fresh install of whisperx.
What device are you passing? Are you sure it's the GPU and not the CPU? If I recall correctly, this was a CPU-only problem, and not one in whisperx itself but in faster-whisper under the hood. See, for example, this issue: https://github.com/SYSTRAN/faster-whisper/issues/65
If you are indeed passing the correct parameters for GPU use, then I recommend running faster-whisper directly first to narrow down the problem. Create a .py file containing the starter code from faster-whisper's GitHub page and run it with the verbose flag set:
CT2_VERBOSE=1 time python3 main.py
This should give more console output for debugging.
We can proceed from there to see what's wrong.
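For reference, here is a minimal sketch of what such a main.py could look like. It assumes faster-whisper and ctranslate2 are installed, reuses the large-v2 model and audio.mp4 from the original command, and asks CTranslate2 which compute types the device reports before loading the model. The helper name pick_compute_type and the preference order are illustrative choices, not part of either library.

```python
import sys

# Assumed preference order for illustration: GPU-friendly types first.
PREFERRED = ("float16", "int8_float16", "int8", "float32")


def pick_compute_type(supported, preferred=PREFERRED):
    """Return the first preferred compute type found in `supported`.

    Hypothetical helper; falls back to "default" so CTranslate2 decides.
    """
    for compute_type in preferred:
        if compute_type in supported:
            return compute_type
    return "default"


def main(audio_path):
    # Imported here so the helper above works without the libraries installed.
    import ctranslate2
    from faster_whisper import WhisperModel

    # If this prints 0, CTranslate2 cannot see the GPU at all.
    cuda_count = ctranslate2.get_cuda_device_count()
    print("CUDA devices visible to CTranslate2:", cuda_count)

    device = "cuda" if cuda_count > 0 else "cpu"
    supported = ctranslate2.get_supported_compute_types(device)
    print(f"Compute types supported on {device}:", sorted(supported))

    compute_type = pick_compute_type(supported)
    print("Loading model with compute_type =", compute_type)
    model = WhisperModel("large-v2", device=device, compute_type=compute_type)

    # transcribe() returns a lazy generator of segments plus metadata.
    segments, info = model.transcribe(audio_path, beam_size=5)
    print("Detected language:", info.language)
    for segment in segments:
        print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Run it as python main.py audio.mp4. If the device count is 0 or float16 is missing from the supported list, the problem is likely in the CUDA/CTranslate2 setup rather than in whisperx. Note that the CT2_VERBOSE=1 prefix above is shell syntax for Linux/macOS; in PowerShell the variable would be set first with $env:CT2_VERBOSE = "1".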