ardfork closed this 10 months ago
This makes sense, though I have no way to test it. Is there a performance impact?
Is there a performance impact?
Not that I could notice; benchmark results are identical on this PR and on master.
Is it backwards-compatible? I.e. do I just change the build environment to ROCM_VERSION=5.5 and it should work for people still running 5.6?
Huh, I think you should have left it at 5.6; that's what most distros use and what PyTorch uses. 5.5 is just when HIPBLAS_USE_HIP_HALF was introduced.
Ooh.. I thought it was about something being deprecated in 5.6. I'll just revert then.
No, originally that code came from exllama v1, where the latest ROCm was 5.4 at the time, I believe. Then came exllama v2, and I copied the hipblas compatibility code without knowing that HIPBLAS_USE_HIP_HALF had been introduced.
Also, I wanted to modify the README, maybe to add some ROCm instructions (adding --extra-index-url https://download.pytorch.org/whl/rocm5.6 to the pip command), but I don't think the instructions even work if you don't already have PyTorch installed. python setup.py install --user just fails, since it tries to import stuff from torch.
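One way to make that failure easier to understand would be to check for the module before importing it. This is only a sketch: `module_available` and `require_module` are hypothetical helpers, not something exllamav2's setup.py contains, and the hint text simply reuses the index URL mentioned above.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def require_module(name: str, hint: str) -> None:
    """Exit with a readable message instead of a raw ImportError traceback."""
    if not module_available(name):
        sys.exit(f"{name} must be installed before building. {hint}")


# Hypothetical hint for ROCm users; the URL is the one quoted in this thread.
TORCH_HINT = (
    "For ROCm, install PyTorch first, e.g. with "
    "--extra-index-url https://download.pytorch.org/whl/rocm5.6"
)

if __name__ == "__main__":
    require_module("torch", TORCH_HINT)
    print("torch found, the build can continue")
```

Called at the top of a setup script, this would stop early with an install hint rather than failing inside the first `import torch`.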
I guess PyTorch ran into the same issue I did, namely that the main PyPI index doesn't support multiple variants of one package. I'm not sure what the best approach is, though. Should there be one requirements.txt for every CUDA or ROCm version, split into Windows and Linux..? I guess that only works out to six files at the moment.
But as long as you have a version of PyTorch >=2.1.0, the requirement is satisfied, at least.
#22 didn't seem to fix using exllamav2 on ROCm < 5.6, as reported in #46, so I think it's better to use HIPBLAS_USE_HIP_HALF, which requires ROCm 5.5.0; that reduces the amount of code that exists only for HIP compatibility.
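As a rough illustration of the idea, the build could pass the macro as a compile definition on HIP builds only. The sketch below makes several assumptions: `parse_rocm_version` and `hip_extra_cflags` are hypothetical names, the version string is expected to look like what `torch.version.hip` reports (for example `5.6.31061-8c743ae5d`, or `None` on CUDA builds), and where the returned flags get wired into the extension build is left open.

```python
from typing import List, Optional, Tuple


def parse_rocm_version(hip_version: Optional[str]) -> Optional[Tuple[int, int]]:
    """Return (major, minor) from a string like '5.6.31061-8c743ae5d', else None."""
    if not hip_version:
        return None
    parts = hip_version.split(".")
    try:
        return int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        return None


def hip_extra_cflags(hip_version: Optional[str]) -> List[str]:
    """Extra compiler flags for a HIP build; empty when not building for ROCm."""
    version = parse_rocm_version(hip_version)
    if version is None:
        return []  # CUDA build, or an unrecognized version string
    if version < (5, 5):
        # Per this PR, HIPBLAS_USE_HIP_HALF needs ROCm 5.5.0 or newer.
        raise RuntimeError("HIPBLAS_USE_HIP_HALF requires ROCm >= 5.5")
    return ["-DHIPBLAS_USE_HIP_HALF"]


if __name__ == "__main__":
    print(hip_extra_cflags("5.6.31061-8c743ae5d"))
    print(hip_extra_cflags(None))
```

With the macro defined, hipBLAS half-precision arguments can use the HIP `__half` type directly, which is what allows the separate compatibility code to go away.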