vosen / ZLUDA

CUDA on non-NVIDIA GPUs
https://vosen.github.io/ZLUDA/
Apache License 2.0

ZLUDA with faster-whisper: Could not load library libcudnn_ops_infer.so.8 #215

Open Jarauvi opened 6 months ago

Jarauvi commented 6 months ago

It was very promising that faster-whisper actually started, but when audio is being processed, I get this:

INFO:main:Ready
INFO:faster_whisper:Processing audio with duration 00:01.530
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
Traceback (most recent call last):
  File "/home/jarno/wyoming-faster-whisper/script/run", line 15, in
    subprocess.check_call([context.env_exe, "-m", "wyoming_faster_whisper"] + sys.argv[1:])
  File "/home/jarno/miniforge3/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/jarno/wyoming-faster-whisper/.venv/bin/python3', '-m', 'wyoming_faster_whisper', '--model', 'whisper-large-finnish-v3', '--device', 'cuda', '--beam-size', '5', '--compute-type', 'int8', '--language', 'fi', '--uri', 'tcp://0.0.0.0:10300', '--data-dir', '/data', '--download-dir', '/data']' died with <Signals.SIGABRT: 6>.

Is there any workaround for this, or is the missing library something that could be added to ZLUDA?
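
To narrow down an error like the one above, it can help to check whether the dynamic loader can find the library at all from the same Python environment. The following is a minimal sketch using only the standard `ctypes` module. `can_load` is a hypothetical helper name, not part of faster-whisper or ZLUDA, and the check only reports whether the file is on the loader's search path (for example via `LD_LIBRARY_PATH`). It does not show whether the functions inside the library are actually implemented.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the file cannot be found or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    # Library name copied from the error message in this report.
    name = "libcudnn_ops_infer.so.8"
    print(f"{name}: {'found' if can_load(name) else 'not found'}")
```

If this reports "not found" when run with the same interpreter and environment variables as the failing command, the problem is most likely the library search path rather than the application itself.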

mkiol commented 4 months ago

I also tried faster-whisper together with ZLUDA. It looks like ctranslate2 (the library behind faster-whisper) needs something from cuDNN that is not yet implemented.

Here are the details:

thread '<unnamed>' panicked at 'not implemented', zluda_dnn/src/cudnn_v8.rs:24:5
stack backtrace:
   0: rust_begin_unwind
             at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/core/src/panicking.rs:65:14
   2: core::panicking::panic
             at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/core/src/panicking.rs:115:5
   3: cudnnGetErrorString
   4: _ZNK11ctranslate23ops6Conv1D7computeILNS_6DeviceE1EfEEvRKNS_11StorageViewES6_PS5_RS4_S7_
             at /run/build/ctranslate2-zluda/src/ops/conv1d_gpu.cu:37:428
   5: _ZNK11ctranslate23ops6Conv1DclERKNS_11StorageViewES4_PS3_RS2_S5_
             at /run/build/ctranslate2-zluda/src/ops/conv1d.cc:45:7
   6: _ZNK11ctranslate23ops6Conv1DclERKNS_11StorageViewES4_S4_RS2_PS3_
             at /run/build/ctranslate2-zluda/src/ops/conv1d.cc:20:17
   7: _ZNK11ctranslate26layers6Conv1DclERKNS_11StorageViewERS2_
             at /run/build/ctranslate2-zluda/src/layers/common.cc:456:17
   8: _ZN11ctranslate26layers14WhisperEncoderclERKNS_11StorageViewERS2_
             at /run/build/ctranslate2-zluda/src/layers/whisper.cc:46:13
   9: _ZN11ctranslate26models14WhisperReplica6encodeENS_11StorageViewEb
             at /run/build/ctranslate2-zluda/src/models/whisper.cc:90:18
  10: operator()
             at /run/build/ctranslate2-zluda/src/models/whisper.cc:657:60
  11: operator()
             at /run/build/ctranslate2-zluda/include/ctranslate2/replica_pool.h:65:29
  12: operator()
             at /run/build/ctranslate2-zluda/include/ctranslate2/replica_pool.h:93:41
  13: run
             at /run/build/ctranslate2-zluda/include/ctranslate2/replica_pool.h:282:19
  14: _ZN11ctranslate26Worker3runERNS_8JobQueueE
             at /run/build/ctranslate2-zluda/src/thread_pool.cc:119:15
  15: _ZSt13__invoke_implIvMN11ctranslate26WorkerEFvRNS0_8JobQueueEEPS1_JSt17reference_wrapperIS2_EEET_St21__invoke_memfun_derefOT0_OT1_DpOT2_
             at /usr/include/c++/13.2.0/bits/invoke.h:74:46
  16: _ZSt8__invokeIMN11ctranslate26WorkerEFvRNS0_8JobQueueEEJPS1_St17reference_wrapperIS2_EEENSt15__invoke_resultIT_JDpT0_EE4typeEOSA_DpOSB_
             at /usr/include/c++/13.2.0/bits/invoke.h:96:40
  17: _ZNSt6thread8_InvokerISt5tupleIJMN11ctranslate26WorkerEFvRNS2_8JobQueueEEPS3_St17reference_wrapperIS4_EEEE9_M_invokeIJLm0ELm1ELm2EEEEvSt12_Index_tupleIJXspT_EEE
             at /usr/include/c++/13.2.0/bits/std_thread.h:292:26
  18: _ZNSt6thread8_InvokerISt5tupleIJMN11ctranslate26WorkerEFvRNS2_8JobQueueEEPS3_St17reference_wrapperIS4_EEEEclEv
             at /usr/include/c++/13.2.0/bits/std_thread.h:299:20
  19: _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJMN11ctranslate26WorkerEFvRNS3_8JobQueueEEPS4_St17reference_wrapperIS5_EEEEEE6_M_runEv
             at /usr/include/c++/13.2.0/bits/std_thread.h:244:20

The following ctranslate2 line triggers the error: https://github.com/OpenNMT/CTranslate2/blob/39f48f2e843df52245e6c857326e1115bca12b03/src/ops/conv1d_gpu.cu#L37
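
In the backtrace, frame 3 is cudnnGetErrorString, called directly from Conv1D::compute. One possible reading, which is an assumption and not confirmed in this thread, is that the cuDNN call on that line is wrapped in a status check that asks for an error string only when the call fails. If so, an earlier cuDNN call returned a non-success status, and the unimplemented error-string function then panicked and hid that status. The sketch below models the pattern in plain Python. `check` and `get_error_string_stub` are hypothetical names for illustration and do not come from ctranslate2 or ZLUDA.

```python
CUDNN_STATUS_SUCCESS = 0  # cuDNN's success status value


def get_error_string_stub(status: int) -> str:
    """Stand-in for an error-string function that is not implemented yet."""
    raise NotImplementedError("not implemented")


def check(status: int, get_error_string) -> None:
    """Raise if status signals failure, describing it via get_error_string."""
    if status != CUDNN_STATUS_SUCCESS:
        # The error string is requested only on the failure path.
        raise RuntimeError("cuDNN failed with status " + get_error_string(status))


if __name__ == "__main__":
    check(CUDNN_STATUS_SUCCESS, get_error_string_stub)  # success: stub is never called
    try:
        check(1, get_error_string_stub)  # failure: the stub raises first
    except NotImplementedError as exc:
        print(f"original status hidden by: {exc}")
```

Under this assumption, implementing the error-string function alone would not make transcription work. It would only replace the panic with a readable message naming the cuDNN call that actually failed.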