Open Bhagyashreet20 opened 5 months ago
I think your CUDA lib directory is not on the library path.
Try export LD_LIBRARY_PATH=/usr/lib/cuda/lib:$LD_LIBRARY_PATH
or something like this.
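A quick way to check whether the dynamic loader can actually find the runtime library it complains about (a minimal sketch; libcudart.so.11.0 is the name from the error below, and the path in the export above is just an example):

```python
# Sketch: verify that the dynamic loader can resolve the CUDA runtime the
# flash-attn extension was linked against (name taken from the ImportError).
import ctypes
import os

print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", ""))
try:
    ctypes.CDLL("libcudart.so.11.0")  # same dlopen lookup the failing import uses
    print("libcudart.so.11.0 is resolvable")
except OSError as err:
    print("loader cannot find libcudart.so.11.0:", err)
```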
Despite using the NVIDIA containers with CUDA 12.4 and compiling from source, I still run into the error below:
```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/scratch.btaleka_gpu_1/code/flash-attention/flash_attn/__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "/home/scratch.btaleka_gpu_1/code/flash-attention/flash_attn/flash_attn_interface.py", line 10, in <module>
    import flash_attn_2_cuda as flash_attn_cuda
ImportError: libcudart.so.11.0: cannot open shared object file: No such file or directory
```
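For reference, a quick diagnostic (a sketch; it assumes a Linux container with ldd available and that the flash-attn package is installed, even though importing it fails) to see which CUDA runtime PyTorch was built with and which libcudart the compiled extension actually links against:

```python
# Sketch: compare the CUDA runtime PyTorch was built with against the
# libcudart the compiled flash-attn extension links to.
import importlib.util
import subprocess

import torch

print("torch:", torch.__version__, "built with CUDA:", torch.version.cuda)

spec = importlib.util.find_spec("flash_attn_2_cuda")
if spec and spec.origin:
    # ldd shows which libcudart.so.* the shared object was linked against
    out = subprocess.run(["ldd", spec.origin], capture_output=True, text=True)
    print(out.stdout)
else:
    print("flash_attn_2_cuda extension not found")
```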
Flash attention should consider upgrading to the latest container stack without an explicit dependency on a particular CUDA runtime version. Such dependencies are fragile and often break the pipeline once someone tries to upgrade.
Fixes such as those reported in #208 or #728 are not correct solutions, especially when compilation from source fails.
Hi, did you find a solution?
Check https://github.com/Dao-AILab/flash-attention/releases for a CUDA 12.3 release.
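The prebuilt wheels there are published per CUDA / torch / Python / C++ ABI combination, so printing your environment's details first makes it easier to pick a matching one (a sketch; the ABI flag is an internal torch attribute, so it is read defensively):

```python
# Sketch: print the details needed to pick a matching prebuilt wheel
# (CUDA version torch was built with, torch version, Python tag, C++ ABI).
import sys

import torch

print("torch version:     ", torch.__version__)
print("torch CUDA version:", torch.version.cuda)
print("python tag:        ", f"cp{sys.version_info.major}{sys.version_info.minor}")
# Internal flag; assumed available in recent PyTorch builds.
print("cxx11 abi:         ", getattr(torch._C, "_GLIBCXX_USE_CXX11_ABI", "unknown"))
```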