turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Fails to compile with ROCm #298

Closed: niansa closed this issue 3 weeks ago

niansa commented 5 months ago

Hi!

Either something has recently changed here or ROCm broke something, but exllamav2 fails to compile with ROCm 6.0 for me:

log.txt

Thanks!

niansa commented 5 months ago

Same errors with ROCm 5.6: log.txt

turboderp commented 5 months ago

This is very odd. The prebuilt wheels compiled fine for ROCm 5.6?

niansa commented 5 months ago

Yes. I only have 5.6 in Docker right now though, so there may actually be other environmental factors.

hemangjoshi37a commented 5 months ago

How can I check whether it is using ROCm or not? Please help.

niansa commented 5 months ago

> How can I check whether it is using ROCm or not? Please help.

Please create a question in Discussions and don't hijack issues, thanks.

turboderp commented 5 months ago

So I've been tinkering a bit with an AMD GPU now, and I can confirm the prebuilt wheel doesn't work with Torch 2.2. The basic requirements for building from source are:

* Install the ROCm HIP SDK package (`rocm-hip-sdk` in Arch, I think the same name in Ubuntu)
* Install the ROCm version of Torch: `pip3 install torch --index-url https://download.pytorch.org/whl/rocm5.7`
* `git clone` the repo
* `pip install .` in the exllamav2 directory

I'll bump the Torch requirement to 2.2.x for the next release, as soon as I've had a little time to try it out.
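As a rough shell sketch of those four steps (the package manager line is for Arch; adjust for your distro):

```bash
# Build exllamav2 from source against the ROCm build of Torch.
sudo pacman -S rocm-hip-sdk                    # Arch; Ubuntu has a similarly named package
pip3 install torch --index-url https://download.pytorch.org/whl/rocm5.7
git clone https://github.com/turboderp/exllamav2
cd exllamav2
pip install .                                  # compiles the C++/HIP extension here
```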

hemangjoshi37a commented 5 months ago

@turboderp Tell me what changes you want to make and I will make them for you in a PR. Please tell me the file name and the content you want to change. Thanks

turboderp commented 5 months ago

@hemangjoshi37a Huh?

hemangjoshi37a commented 5 months ago

> I'll bump the Torch requirement to 2.2.x for the next release, as soon as I've had a little time to try it out.

This.

turboderp commented 5 months ago

Well, it's not changing the requirement that's the issue; that takes ten seconds. It's testing that everything actually works with Torch 2.2 first, since there are obviously some breaking changes in at least the C++ extension features.

RichardFevrier commented 3 months ago

> So I've been tinkering a bit with an AMD GPU now, and I can confirm the prebuilt wheel doesn't work with Torch 2.2. The basic requirements for building from source are:
>
> * Install the ROCm HIP SDK package (`rocm-hip-sdk` in Arch, I think the same name in Ubuntu)
> * Install the ROCm version of Torch: `pip3 install torch --index-url https://download.pytorch.org/whl/rocm5.7`
> * `git clone` the repo
> * `pip install .` in the exllamav2 directory
>
> I'll bump the Torch requirement to 2.2.x for the next release, as soon as I've had a little time to try it out.

Doing so on my Fedora resulted in `CUDA_HOME environment variable is not set`.

Have you done something specific related to HIP to make it compile, @turboderp?

turboderp commented 3 months ago

I literally only did those four steps, on a brand new install of Manjaro. Is it possible you have a CUDA version of Torch installed? Perhaps you could try in a fresh venv?
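One quick way to check which backend the installed Torch wheel targets (a minimal sketch, not from the thread):

```bash
# On a ROCm build, torch.version.hip is a version string and torch.version.cuda is None;
# on a CUDA build it's the other way around.
python -c "import torch; print('hip:', torch.version.hip, '| cuda:', torch.version.cuda)"
```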

RichardFevrier commented 3 months ago

In fact I went a little bit further (and no, I have the 2.2.2+rocm5.7 Torch version).

But I took inspiration from your GitHub CI, so I exposed the following env vars:

HSA_OVERRIDE_GFX_VERSION=10.3.0 # 6800xt
USE_ROCM=1
ROCM_VERSION=5.7
ROCM_PATH=/usr/bin

Though I'm not sure about the last one, since the HIP package provided by Fedora differs a little from the one provided by Arch; I can see a hipcc there, so..
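Spelled out, that is roughly (a sketch; the GFX override value is specific to the GPU mentioned above):

```bash
# Build invocation with the env vars above (values taken from this thread).
export HSA_OVERRIDE_GFX_VERSION=10.3.0   # RX 6800 XT (RDNA2) reports gfx1030
export USE_ROCM=1
export ROCM_VERSION=5.7
export ROCM_PATH=/usr/bin                # uncertain, as noted; often the ROCm root instead
pip install .
```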

But now I'm facing the following issue: `hip/hip_runtime_api.h: No such file or directory`. I searched for it with `find / -iname "*hip_runtime_api.h*" 2>/dev/null` and got this result:

/var/mnt/data/Projects/Misc/AI/tools/exllamav2/.venv/lib/python3.11/site-packages/triton/third_party/hip/include/hip/nvcc_detail/nvidia_hip_runtime_api.h
/var/mnt/data/Projects/Misc/AI/tools/exllamav2/.venv/lib/python3.11/site-packages/triton/third_party/hip/include/hip/hip_runtime_api.h
/var/mnt/data/Projects/Misc/AI/tools/exllamav2/.venv/lib/python3.11/site-packages/triton/third_party/hip/include/thrust/system/hip/detail/guarded_hip_runtime_api.h

So to me it should be ok..
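For what it's worth, those hits appear to be triton's vendored copies inside the venv rather than a system HIP SDK; a sketch to check for a system-level install (assuming `hipconfig` ships with the distro's HIP package):

```bash
# The matches above live under site-packages/triton/third_party, i.e. inside the
# venv, not a system HIP SDK. Check where (and whether) a system HIP install exists:
hipconfig --path                                        # prints the HIP installation root
ls "$(hipconfig --path)/include/hip/hip_runtime_api.h"  # header the build is looking for
```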

Edit: I ended up using distrobox/toolbox (I tried both, just to compare, but that's another story) with an Arch Linux image like yours, and there the process was indeed flawless. So I guess it's a skill issue on my side, not knowing how to do it under Fedora (if anyone ever gets it to compile there, I'd be curious to know how).

Anyway, loading models with this project is a real pleasure. Thanks a lot for the work!

turboderp commented 3 weeks ago

I'm going to close this as completed now, I guess. Please don't hesitate to report any other ROCm-related issues (or reopen this one if it pops up again), since I can still only do limited testing here.