Theoretically, the CUDA runtime can have a larger minor version than the CUDA driver version. I am not sure whether this works for flash-attention, but you can install the CUDA runtime library into a local folder (the runfile installer's --toolkit and --toolkitpath options install into a user-writable prefix, no root needed) and then try to reinstall flash-attention and PyTorch.
sh cuda_11.x_linux.run --silent --toolkit --toolkitpath=/your/local/cuda
export PATH=/your/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/your/local/cuda/lib64:$LD_LIBRARY_PATH
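To confirm the local toolkit is the one being picked up, something like this should print the expected versions (the second command just shows which CUDA version the installed torch wheel was built against):
nvcc --version
python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"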
If 11.6 does not work, try 11.4 and flash-attn==2.0.4 as well.
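Roughly, the reinstall could look like the sketch below (assuming the local toolkit above is first on PATH; the index URL points at the cu116 PyTorch wheels, and --no-build-isolation just lets the flash-attn build see the already-installed torch):
pip uninstall -y flash-attn
pip install torch --index-url https://download.pytorch.org/whl/cu116
pip install flash-attn==2.0.4 --no-build-isolation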
Please also update here if this works so that other users can benefit, thanks.
Thank you! I've since switched to a server with CUDA 11.8.
Hello everyone,
I am currently working on a deep learning project on a server that only supports CUDA 11.0. The project relies on flash-attn, which requires CUDA >= 11.6, and unfortunately I do not have permission to upgrade CUDA on the server.
I am seeking advice on how to modify the project to remove the dependency on flash-attn. Here are some specific questions I have:
Alternative Libraries: Are there any alternative libraries compatible with CUDA 11.0 that can replace flash-attn without significantly impacting the project's performance?
Code Modifications: If I remove flash-attn, what are the key areas in the code that would need modification? I am particularly concerned about how this might affect the model's training and inference performance. (A rough sketch of the kind of replacement I have in mind is included after this list.)
Impact Assessment: What potential impacts should I anticipate in terms of model accuracy, training time, and resource utilization after removing flash-attn?
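For context, the kind of drop-in replacement I am imagining is a plain PyTorch attention like the sketch below. This is only an illustration under my own assumptions: the function name plain_attention and the (batch, heads, seq_len, head_dim) layout are made up and not tied to flash-attn's actual interface.

import math
import torch

def plain_attention(q, k, v, causal=False):
    # q, k, v: (batch, heads, seq_len, head_dim)
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(q.size(-1))
    if causal:
        # mask out future positions for autoregressive decoding
        seq_len = q.size(-2)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool, device=q.device), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return torch.matmul(weights, v)

My understanding is that flash-attn computes the same exact attention, just with a fused, memory-efficient kernel, so I would mainly expect slower training/inference and higher memory use rather than accuracy changes, but please correct me if that is wrong.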
Thank you in advance for any guidance or suggestions you can provide.