ROCm / flash-attention

Fast and memory-efficient exact attention
BSD 3-Clause "New" or "Revised" License

undefined symbol: hipGetDevicePropertiesR0600 #31

Open alain40 opened 11 months ago

alain40 commented 11 months ago

Getting this error after successful compilation on 7840U.

rocm 6.0 and PyTorch installed and tested.

Here is the error when running exllamav2, which runs fine without flash attention:

```
Traceback (most recent call last):
  File "/home/alain/exllamav2/examples/chat.py", line 5, in <module>
    from exllamav2 import (
  File "/home/alain/.local/lib/python3.10/site-packages/exllamav2/__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "/home/alain/.local/lib/python3.10/site-packages/exllamav2/model.py", line 21, in <module>
    from exllamav2.attn import ExLlamaV2Attention
  File "/home/alain/.local/lib/python3.10/site-packages/exllamav2/attn.py", line 19, in <module>
    import flash_attn
  File "/home/alain/.local/lib/python3.10/site-packages/flash_attn/__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import flash_attn_func
  File "/home/alain/.local/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py", line 4, in <module>
    import flash_attn_2_cuda as flash_attn_cuda
ImportError: /home/alain/.local/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: hipGetDevicePropertiesR0600
```

Otherwise HIP runtime looks fine: hipconfig.txt

rocm also looks OK (using gfx 11.0.0 override): rocminfo.txt
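For context: the `R0600` suffix in the missing symbol denotes the ROCm 6.0-versioned HIP ABI, so this error usually means the extension was compiled against ROCm 6.0 headers while a pre-6.0 `libamdhip64` is resolved at runtime (for example, one bundled with an older PyTorch wheel). Below is a minimal diagnostic sketch, assuming a Linux host; the helper name `hip_runtime_has_symbol` is ours, not part of any of these packages:

```python
import ctypes
import ctypes.util


def hip_runtime_has_symbol(symbol="hipGetDevicePropertiesR0600"):
    """Check whether the HIP runtime the dynamic loader resolves exports
    `symbol`. Returns True/False, or None if libamdhip64 cannot be found
    or loaded at all (e.g. no ROCm install on this machine)."""
    path = ctypes.util.find_library("amdhip64")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    # CDLL attribute lookup fails for symbols the library does not export.
    return hasattr(lib, symbol)
```

If this returns False, the runtime being loaded predates ROCm 6.0, and either upgrading the runtime or rebuilding flash-attention against the loaded runtime's headers should make the versions agree.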

TNT3530 commented 10 months ago

Also having this issue when attempting to build from source + benchmark on gfx908 (Instinct MI100) and ROCm 6.0. Running any version I built (including torch 2.0.1, 2.1.1, and 2.2.0) gives the above error.

Running/building the docker images works fine, so I assume it's an issue with ROCm 6.0.

nayn99 commented 9 months ago

Same issue on 4650G.

The compilation is successful, but when loading the library in Python, I get: `~/coco/llm/lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so: undefined symbol: hipGetDevicePropertiesR0600`

Otherwise I am able to install xformers, and pytorch works fine. This seems to be the last ingredient needed to get vllm running.

tcgu-amd commented 3 days ago

Hi @nayn99 @alain40, sorry for the lack of responses. Do your devices have a supported AMD graphics card based on the CDNA2/CDNA3 architectures? Flash attention is currently only supported on those devices; see this issue.

@TNT3530, if running/building in the docker image works fine for you, I suspect something is wrong with how ROCm is configured on your host system. Would you be able to try uninstalling the current version of ROCm and re-installing the newest version following this guide? Thanks!
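A complementary check from the other direction is to list which version-suffixed HIP symbols the compiled extension itself expects, which tells you whether the `.so` was built against the ROCm 6.0 ABI. This is a sketch under stated assumptions: it requires binutils' `nm` on `PATH`, and the helper name `undefined_hip_symbols` is ours:

```python
import shutil
import subprocess


def undefined_hip_symbols(so_path, suffix="R0600"):
    """List undefined dynamic symbols in the shared object at `so_path`
    whose names carry the given ROCm ABI version suffix. Returns None if
    `nm` is unavailable or the file cannot be read."""
    if shutil.which("nm") is None:
        return None
    proc = subprocess.run(
        ["nm", "-D", "--undefined-only", so_path],
        capture_output=True, text=True,
    )
    if proc.returncode != 0:
        return None
    return sorted(
        line.split()[-1]
        for line in proc.stdout.splitlines()
        if line.strip() and suffix in line.split()[-1]
    )
```

Running this on the `flash_attn_2_cuda.cpython-*.so` from the tracebacks above: a non-empty result combined with a pre-6.0 `libamdhip64` at runtime reproduces exactly the `undefined symbol: hipGetDevicePropertiesR0600` failure, and aligning the build-time and runtime ROCm versions is the fix.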