kurukurukuru closed this issue 9 months ago
Any ideas?
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.
My M40 24GB fails the same way with ExLlama, while my 4060 Ti 16GB works fine under CUDA 12.4. It seems the author has not updated the kernels to be compatible with the M40. I also asked for help from the ExLlama2 author yesterday; I don't know whether he will fix this compatibility problem. The M40 has the same architecture as the 980 Ti (compute capability 5.2), which is still supported by CUDA 12.4.
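For reference, the cards mentioned in this thread map to the following compute capabilities (values are from NVIDIA's published specs; the `sm_tag` helper below is purely illustrative, not part of any library):

```python
# (major, minor) compute capability of the GPUs in this thread,
# per NVIDIA's published specifications.
CC = {
    "Tesla K80": (3, 7),      # Kepler
    "Tesla M40": (5, 2),      # Maxwell, same SM version as the GTX 980 Ti
    "GTX 980 Ti": (5, 2),
    "RTX 4060 Ti": (8, 9),    # Ada
}

def sm_tag(name):
    """Return the 'sm_XY' architecture tag a build must target for this GPU."""
    major, minor = CC[name]
    return f"sm_{major}{minor}"

for gpu in CC:
    print(gpu, "->", sm_tag(gpu))
```

A build that ships `sm_52` (or newer-minor-compatible Maxwell) kernels covers the M40 and 980 Ti, but nothing in a CUDA 12.x wheel covers `sm_37`, which is why the K80 hits the "no kernel image" error.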
Describe the bug
-edit- This is using ExLlama. Did some more testing: loading a model via llama.cpp and offloading to GPU works as expected.
I am trying to use 2x Tesla K80s, however when trying to load the model I get the error 'CUDA error: no kernel image is available for execution on the device'. The model does load in VRAM, but it seems it is failing at the stage after loading. I am currently using the repo drivers on Ubuntu 20.04 (sudo apt install nvidia-driver-470). The most recent CUDA version this GPU can run is 11.4. Tried driver 450, 460, each with their respective CUDA versions. No luck, same error. As per NVIDIA documentation (https://docs.nvidia.com/deploy/cuda-compatibility/), things should be compatible between minor versions?
Tried both the manual installation and the one-click installer. No dice. I can see that for GPUs with CC <= 3.5, a different Torch package is needed. See: https://blog.nelsonliu.me/2020/10/13/newer-pytorch-binaries-for-older-gpus/ From what I've read, CC 3.7 (K80) support is retained in current Torch versions. I tried the package anyway; it doesn't seem like Torch 1.13 is compatible with the latest release.
I'm kinda at a loss here. Not sure what to do next.
Is there an existing issue for this?
Reproduction
- Install NVIDIA repo drivers with `sudo apt install nvidia-driver-470`
- Install oobabooga w/ CUDA 11.8 for Kepler GPUs
- Try to launch a model
Screenshot
N/A
Logs
System Info