vosen / ZLUDA

CUDA on non-NVIDIA GPUs
https://vosen.github.io/ZLUDA/
Apache License 2.0

Glaze support: CUBLAS_STATUS_NOT_SUPPORTED when calling cublasGemmEx #150

Open analytik opened 7 months ago

analytik commented 7 months ago

Thank you for this exciting project!

Using Glaze 1.1.1: https://glaze.cs.uchicago.edu/download.html

Using a Radeon 6900 XT, Windows 10, the latest Radeon Pro drivers (23.Q4), ZLUDA v3, and Glaze 1.1.1: image

Command line: zluda.exe -- Glaze.exe

Selected a 4k resolution jpg, Intensity: Default, Render Quality: Slower

Transcript of the error, since I cannot copy-paste it: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling 'cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)' (I could not tell from the dialog whether "lda" starts with a lowercase L or an uppercase I)

lshqqytiger commented 7 months ago
  1. Replace path\to\Glaze\torch\lib\cublas64_11.dll with the cublas.dll from this release of my fork.
  2. Open path\to\Glaze\transformers\__init__.py in a text editor.
  3. Add these lines:
    import torch
    torch.backends.cudnn.enabled = False
  4. Run Glaze.exe with zluda. image
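
Steps 2 and 3 can also be scripted. The following is only a rough sketch: the helper names `apply_patch` and `patch_file` are made up for illustration, and putting the two lines at the very top of the file is an assumption, since the steps above do not say where they should go. It keeps a `.bak` copy of the original file before writing.

```python
from pathlib import Path

# The two lines from step 3.
PATCH = "import torch\ntorch.backends.cudnn.enabled = False\n"


def apply_patch(source: str) -> str:
    """Return `source` with the two lines prepended (assumed placement: top of file).

    If the cuDNN line is already present, the text is returned unchanged,
    so applying the patch twice is harmless.
    """
    if "torch.backends.cudnn.enabled = False" in source:
        return source
    return PATCH + source


def patch_file(init_py: Path) -> None:
    """Patch a file in place, keeping a backup next to it (hypothetical helper)."""
    original = init_py.read_text(encoding="utf-8")
    patched = apply_patch(original)
    if patched != original:
        backup = init_py.with_name(init_py.name + ".bak")
        backup.write_text(original, encoding="utf-8")
        init_py.write_text(patched, encoding="utf-8")
```

To use it, call `patch_file(Path(r"path\to\Glaze\transformers\__init__.py"))` with the real Glaze location filled in.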

analytik commented 7 months ago

Hi, thank you for your reply. Sadly, I did not manage to get it to run. I did the steps you mentioned.

At first it exited, complaining that hiprtc0507.dll was missing. After I installed the AMD HIP SDK and copied the DLL over, it now says that cublas "or one of its dependencies" is missing.

I ran Process Monitor and tried to guess what's going on (I'm not a Windows/C++ developer), and it seems to be missing rocsolver.dll, rocblas.dll and other roc*.dll files, so I copied them over from c:\Program Files\AMD\ROCm\5.7\bin\. Then Glaze starts correctly, but a few seconds after it starts glazing, it exits unceremoniously without an error.
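
A quick way to see which of these DLLs are absent from a given folder, without Process Monitor, might look like the sketch below. The function name `missing_dlls` is made up, and which folder the loader actually searches is not established in this thread, so the directory is left as a parameter. The DLL names in the usage note are only the ones mentioned above, not a complete dependency list.

```python
from pathlib import Path


def missing_dlls(directory, names):
    """Return the entries of `names` that are not files in `directory`.

    The comparison ignores case, since Windows file names are
    case-insensitive. Only the given folder is checked; this does not
    emulate the full Windows DLL search order.
    """
    present = {p.name.lower() for p in Path(directory).iterdir() if p.is_file()}
    return [name for name in names if name.lower() not in present]
```

For example, `missing_dlls(r"path\to\Glaze\torch\lib", ["hiprtc0507.dll", "rocblas.dll", "rocsolver.dll"])` would list whichever of the three are not in that folder.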

I can upload the Process Monitor dump somewhere, if it helps, but I couldn't find anything immediately obviously wrong. Can't tell if it's ZLUDA or Glaze related. It seems to write a bunch of data into C:\Users\username\AppData\Local\ZLUDA\ComputeCache\zluda.db and then it gets RANGE_NOT_LOCKED, and then it tries to access D:\Tensile\library (which has never existed on my computer, D: being the drive I run Glaze from) and then Glaze\library which also doesn't exist. WerFault.exe is spawned 250ms after that, so I'm guessing it dies from one of those things.

image

If I can help troubleshoot, I will gladly help, but Glaze itself is not that important for me right now, so if it's a rare bug, it might not be worth spending too much time on.

(I should mention that I think Glaze 1.1.1 managed to silently crash on me before completing its work even with the CPU-only run, before modifying transformers/__init__.py or changing any of the DLLs.)

lshqqytiger commented 7 months ago

Make sure that you have AMD HIP SDK 5.7 installed. Make sure that both %HIP_PATH%bin and the path to ZLUDA are in Path, like this. From environment variables: image. From Path: image
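
To double-check the Path setup, something like the following sketch could be used. The function name `on_path` is hypothetical. It uses `ntpath` so that Windows-style paths are compared the same way regardless of letter case or a trailing backslash.

```python
import ntpath


def on_path(directory, path_value, sep=";"):
    """Return True if `directory` is one of the entries in `path_value`.

    `path_value` is the raw contents of the Path variable. Entries are
    compared after Windows-style normalization (case-insensitive,
    trailing separators and surrounding quotes ignored). Variables such
    as %HIP_PATH% inside `path_value` are NOT expanded here.
    """
    target = ntpath.normcase(ntpath.normpath(directory))
    for entry in path_value.split(sep):
        entry = entry.strip().strip('"')
        if entry and ntpath.normcase(ntpath.normpath(entry)) == target:
            return True
    return False
```

On the affected machine this could be called as `on_path(os.environ["HIP_PATH"] + "bin", os.environ["PATH"])` (after `import os`), and again with the ZLUDA folder as the first argument. Inside a running process, `os.environ["PATH"]` should already have `%...%` references expanded.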