Closed: niansa closed this issue 3 weeks ago.
This is very odd. The prebuilt wheels compiled fine for ROCm 5.6?
Yes. I only have 5.6 in Docker right now though, so there may actually be other environmental factors.
How do I check whether it is using ROCm or not? Please help.
Create a question in the discussions and don't hijack issues, thanks
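(For reference: a ROCm build of PyTorch reports a HIP version instead of a CUDA one, so the backend can be read off `torch.version`. A minimal sketch, assuming a standard PyTorch install; the version strings in the comments are illustrative:)

```python
def detect_backend(hip_version, cuda_version):
    """Classify a torch build from torch.version.hip / torch.version.cuda."""
    if hip_version:        # set on ROCm wheels, e.g. "5.7.31921"
        return "rocm"
    if cuda_version:       # set on CUDA wheels, e.g. "12.1"
        return "cuda"
    return "cpu"           # CPU-only wheels leave both unset

if __name__ == "__main__":
    try:
        import torch
        print(detect_backend(torch.version.hip, torch.version.cuda))
    except ImportError:
        print("torch is not installed in this environment")
```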
So I've been tinkering a bit with an AMD GPU now, and I can confirm the prebuilt wheel doesn't work with Torch 2.2. The basic requirements for building from source are:

* Install the ROCm HIP SDK package (`rocm-hip-sdk` in Arch, I think the same name in Ubuntu)
* Install the ROCm version of Torch: `pip3 install torch --index-url https://download.pytorch.org/whl/rocm5.7`
* `git clone` the repo
* `pip install .` in the exllamav2 directory

I'll bump the Torch requirement to 2.2.x for the next release, as soon as I've had a little time to try it out.
@turboderp Tell me what changes you want to make and I will make them for you in a PR. Please tell me the file name and the content you want to change. Thanks.
@hemangjoshi37a Huh?
I'll bump the Torch requirement to 2.2.x for the next release, as soon as I've had a little time to try it out.
this
Well, it's not changing the requirement that's the issue; that takes ten seconds. It's testing that everything actually works with Torch 2.2 first, since there are obviously some breaking changes in at least the C++ extension features.
So I've been tinkering a bit with an AMD GPU now, and I can confirm the prebuilt wheel doesn't work with Torch 2.2. The basic requirements for building from source are:
* Install the ROCm HIP SDK package (`rocm-hip-sdk` in Arch, I think the same name in Ubuntu)
* Install the ROCm version of Torch: `pip3 install torch --index-url https://download.pytorch.org/whl/rocm5.7`
* `git clone` the repo
* `pip install .` in the exllamav2 directory
I'll bump the Torch requirement to 2.2.x for the next release, as soon as I've had a little time to try it out.
Doing so on my Fedora install resulted in `CUDA_HOME environment variable is not set`. Have you done something specific related to HIP to make it compile, @turboderp?
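(That error usually means the extension builder did not detect a ROCm build of torch and fell back to looking for a CUDA toolkit. A rough model of the decision; `required_toolchain` is a hypothetical helper for illustration, not torch's actual API:)

```python
def required_toolchain(torch_hip_version, cuda_home):
    """Rough model of how a torch C++ extension build picks its compiler:
    a ROCm torch build routes through hipcc, anything else needs CUDA_HOME."""
    if torch_hip_version:      # ROCm wheel detected -> HIP/hipcc path
        return "hip"
    if cuda_home:              # CUDA wheel with a toolkit installed
        return "cuda"
    raise RuntimeError("CUDA_HOME environment variable is not set")
```

So if a CUDA (or CPU-only) torch wheel is installed instead of the `rocm5.7` one, the build never reaches the HIP path and asks for `CUDA_HOME` instead.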
I literally only did those four steps, on a brand new install of Manjaro. Is it possible you have a CUDA version of Torch installed? Perhaps you could try in a fresh venv?
In fact I went a little bit further (and no, I got the `2.2.2+rocm5.7` torch version).
But I got inspired by your GitHub CI, so I have exposed the following env vars:
HSA_OVERRIDE_GFX_VERSION=10.3.0 # 6800xt
USE_ROCM=1
ROCM_VERSION=5.7
ROCM_PATH=/usr/bin
Though I'm not sure about the last one, because the HIP package provided by Fedora differs a little bit from the one provided by Arch; still, I can see a `hipcc` there, so...
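(Since the Fedora and Arch HIP packages lay files out differently, it may help to confirm where `hipcc` actually lives before pointing `ROCM_PATH` at it. A small sketch; `hipcc_location` is a hypothetical helper, and the layouts it checks are assumptions:)

```python
import os
import shutil

def hipcc_location(rocm_path=None):
    """Return the first plausible hipcc path, or None if none is found."""
    candidates = []
    if rocm_path:
        # ROCM_PATH may point at a bin dir (Fedora-style /usr/bin) or at a
        # prefix with its own bin/ (Arch-style /opt/rocm) -- check both.
        candidates.append(os.path.join(rocm_path, "hipcc"))
        candidates.append(os.path.join(rocm_path, "bin", "hipcc"))
    for path in candidates:
        if os.path.isfile(path):
            return path
    return shutil.which("hipcc")  # fall back to a PATH lookup
```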
But now I'm facing the following issue: `hip/hip_runtime_api.h: No such file or directory`
I did search for it with `find / -iname '*hip_runtime_api.h*' 2>/dev/null` and got this result:
/var/mnt/data/Projects/Misc/AI/tools/exllamav2/.venv/lib/python3.11/site-packages/triton/third_party/hip/include/hip/nvcc_detail/nvidia_hip_runtime_api.h
/var/mnt/data/Projects/Misc/AI/tools/exllamav2/.venv/lib/python3.11/site-packages/triton/third_party/hip/include/hip/hip_runtime_api.h
/var/mnt/data/Projects/Misc/AI/tools/exllamav2/.venv/lib/python3.11/site-packages/triton/third_party/hip/include/thrust/system/hip/detail/guarded_hip_runtime_api.h
So to me it should be ok..
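(The `find` output above only shows copies vendored inside triton's package, not a system ROCm include directory, which could explain why the compiler can't see the header. A quick sketch for walking candidate include roots and listing which copies exist; the roots in the usage example are illustrative:)

```python
import os

def find_header(name, roots):
    """Walk each existing root and collect every path ending in `name`."""
    hits = []
    for root in roots:
        if not os.path.isdir(root):
            continue  # skip roots that don't exist on this machine
        for dirpath, _dirs, files in os.walk(root):
            if name in files:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Usage might look like `find_header("hip_runtime_api.h", ["/opt/rocm/include", "/usr/include"])`; an empty result from the system roots would point at a missing HIP development package rather than a build-flag problem.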
Edit: I've ended up using distrobox/toolbox (I tried both just to compare, but that's another story) with an Arch Linux image like you, and the process was effectively flawless. So I guess it's a skill issue on my side, not knowing how to do it under Fedora (if anyone ever makes it compile there, I'd be curious to know how).
Anyway loading models with this project is a real pleasure. Thanks a lot for the work!
I'm going to close this as completed now, I guess. Please don't hesitate to report any other issues (or this one if it pops up again) related to ROCm, since I can still only do limited testing here.
Hi!
Either something has recently changed here or ROCm broke something, but exllamav2 breaks on ROCm 6.0 for me:
log.txt
Thanks!