vosen / ZLUDA

CUDA on AMD GPUs
Apache License 2.0

Does Not Work with Whisper - CUDA 11.8 and CUDA 12.2 #111

Open AndrewJacksonZA opened 4 months ago

AndrewJacksonZA commented 4 months ago

Hello

ZLUDA fails to run Whisper.cpp v1.5.4 ( https://github.com/ggerganov/whisper.cpp/releases/tag/v1.5.4 ). I tried the "whisper-cublas-11.8.0-bin-x64.zip" download.

Error message:

CUDA error: invalid argument
  current device: 0, in function ggml_init_cublas at D:\a\whisper.cpp\whisper.cpp\ggml-cuda.cu:6843
  cuDeviceGetAttribute(&device_vmm, CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED, device)
GGML_ASSERT: D:\a\whisper.cpp\whisper.cpp\ggml-cuda.cu:226: !"CUDA error"

Steps to replicate:

  1. Download whisper-cublas-11.8.0-bin-x64.zip from https://github.com/ggerganov/whisper.cpp/releases/tag/v1.5.4
  2. Download the "Samples" and "Models" directories from https://github.com/ggerganov/whisper.cpp
  3. Download a model by running \models\download-ggml-model.cmd e.g. download-ggml-model.cmd tiny.en (you can store the models and samples anywhere, as long as the paths are then specified.)
  4. Call the EXE using ZLUDA e.g. zluda.exe -- T:\whisper-cublas-11.8.0-bin-x64\main.exe -pp -t 8 -m T:\WhisperModels\ggml-medium.en.bin T:\WhisperSamples\jfk.wav
    • "-pp" = print progress as a percentage,
    • "-t 8" = use 8 threads,
    • "-m" is the model to use,
    • and then finally the wav file.
  5. I also tried without the progress and thread options, with the same result: zluda.exe -- T:\whisper-cublas-11.8.0-bin-x64\main.exe -m T:\WhisperModels\ggml-medium.en.bin T:\WhisperSamples\jfk.wav
  6. In case it matters:

Full output:

T:\zluda>zluda.exe -- T:\whisper-cublas-11.8.0-bin-x64\main.exe -pp -t 8 -m T:\WhisperModels\ggml-medium.en.bin T:\WhisperSamples\jfk.wav
whisper_init_from_file_with_params_no_state: loading model from 'T:\WhisperModels\ggml-medium.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head  = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1024
whisper_model_load: n_text_head   = 16
whisper_model_load: n_text_layer  = 24
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 4 (medium)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs       = 99
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
CUDA error: invalid argument
  current device: 0, in function ggml_init_cublas at D:\a\whisper.cpp\whisper.cpp\ggml-cuda.cu:6843
  cuDeviceGetAttribute(&device_vmm, CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED, device)
GGML_ASSERT: D:\a\whisper.cpp\whisper.cpp\ggml-cuda.cu:226: !"CUDA error"

Hardware specs:

I also tried with the CUDA 12.2 build whisper-cublas-12.2.0-bin-x64.zip and it also failed:

ggml_init_cublas: found 1 CUDA devices:
CUDA error: invalid argument
  current device: 0, in function ggml_init_cublas at D:\a\whisper.cpp\whisper.cpp\ggml-cuda.cu:6843
  cuDeviceGetAttribute(&device_vmm, CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED, device)
GGML_ASSERT: D:\a\whisper.cpp\whisper.cpp\ggml-cuda.cu:226: !"CUDA error"

Please let me know if you need anything more from me to help resolve this issue.

Thank you,
Andrew Jackson

vricosti commented 4 months ago

Same issue here. The only difference from the description above is that I have installed the HIP SDK for Windows (since ZLUDA uses it to access the AMD GPU): https://www.amd.com/fr/developer/resources/rocm-hub/hip-sdk.html But maybe ZLUDA doesn't find it?

patientx commented 2 months ago

Rename or archive these two files inside the whisper folder: cublas64_11.dll and cusparse64_11.dll.

Then copy the following files from your ZLUDA folder into the whisper folder, renaming them as shown so they take the place of the originals:

  • cublas.dll -> cublas64_11.dll
  • cusparse.dll -> cusparse64_11.dll

This way it works.
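For anyone who prefers to script the file swap, here is a minimal Python sketch of the workaround described in this comment. The function name `swap_in_zluda_dlls` and the `.orig` backup suffix are made up for illustration and are not part of ZLUDA or whisper.cpp. The DLL names are the ones given in this comment for the CUDA 11.8 build; the CUDA 12.2 build presumably ships differently named libraries, which this thread does not confirm.

```python
import shutil
from pathlib import Path

# Mapping taken from the comment above:
# ZLUDA's DLL name -> name of the bundled DLL in the whisper folder (CUDA 11.8 build).
DLL_MAP = {
    "cublas.dll": "cublas64_11.dll",
    "cusparse.dll": "cusparse64_11.dll",
}


def swap_in_zluda_dlls(zluda_dir, whisper_dir, backup_suffix=".orig"):
    """Archive the bundled CUDA DLLs and put ZLUDA's copies in their place.

    Hypothetical helper: returns the list of files written into whisper_dir.
    """
    zluda_dir, whisper_dir = Path(zluda_dir), Path(whisper_dir)

    # Check every source file first so a missing DLL leaves the folder untouched.
    for src_name in DLL_MAP:
        src = zluda_dir / src_name
        if not src.is_file():
            raise FileNotFoundError(f"expected ZLUDA library not found: {src}")

    written = []
    for src_name, dst_name in DLL_MAP.items():
        dst = whisper_dir / dst_name
        backup = dst.with_name(dst.name + backup_suffix)
        # Archive the original once; an existing backup is kept on re-runs.
        if dst.exists() and not backup.exists():
            dst.rename(backup)
        # Copy ZLUDA's library under the name the whisper binaries look for.
        shutil.copyfile(zluda_dir / src_name, dst)
        written.append(dst)
    return written
```

With the paths used earlier in this issue, the call would look like `swap_in_zluda_dlls(r"T:\zluda", r"T:\whisper-cublas-11.8.0-bin-x64")`. Keeping a backup rather than deleting the originals makes it easy to restore the NVIDIA libraries later.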
