My CUDA version is 11.2, so I can't install Flash Attention on my machine. I tried setting use_flash_attn to False when running fine-tune.py, but I still get this error:
Traceback (most recent call last):
  File "/mnt1/dataln1/xxx/repo/LongLoRA/fine-tune.py", line 26, in <module>
    from llama_attn_replace import replace_llama_attn
  File "/mnt1/dataln1/xxx/repo/LongLoRA/llama_attn_replace.py", line 10, in <module>
    from flash_attn import __version__ as flash_attn_version
ModuleNotFoundError: No module named 'flash_attn'
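So the import fails at module load time, before use_flash_attn is ever checked. For now I'm working around it by guarding the import at the top of llama_attn_replace.py. This is just a sketch of what I tried, not the repo's intended fix, and HAS_FLASH_ATTN is a name I made up:

    # Guard the module-level flash_attn import so llama_attn_replace.py
    # can still be imported when Flash Attention isn't installed.
    try:
        from flash_attn import __version__ as flash_attn_version
        HAS_FLASH_ATTN = True
    except ImportError:
        flash_attn_version = None
        HAS_FLASH_ATTN = False  # my own flag; fall back to the non-flash attention path

With this, running fine-tune.py with use_flash_attn set to False at least gets past the import, but I'm not sure it's the correct fix.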
Looking forward to your answer:(