kishida opened this issue 5 months ago
Why is the prebuilt wheel URL not found?
I'm building on Windows with CUDA 11.8 or 12.1 in the Developer Command Prompt for VS 2022.
Sorry, but when I tried to reproduce the error by installing again in a new venv with CUDA 11.8, the installation succeeded, although it still does not work correctly. Is a cache left somewhere?
Loading Phi-3 Vision raises the error below. (With CUDA 12.1 it works fine.)
File "D:\dev\llm\cu118\Lib\site-packages\transformers\modeling_utils.py", line 1571, in _check_and_enable_flash_attn_2
raise ImportError(f"{preface} the package flash_attn seems to be not installed. {install_message}")
ImportError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: the package flash_attn seems to be not installed. Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2.
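For context, a minimal sketch of the kind of load that triggers this check; I'm assuming the model is loaded with attn_implementation="flash_attention_2", the actual call in my script may differ slightly:

```python
from transformers import AutoModelForCausalLM

# Hypothetical repro: requesting FlashAttention2 at load time makes transformers
# run _check_and_enable_flash_attn_2, which raises the ImportError above when
# the flash_attn package cannot be imported.
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-vision-128k-instruct",
    trust_remote_code=True,
    attn_implementation="flash_attention_2",
)
```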
In setup.py, the line
urllib.request.urlretrieve(wheel_url, wheel_filename)
tries to download a prebuilt wheel, but the URL (which starts with https://github.com/Dao-AILab/flash-attention/releases/download/) is not found. This is causing the other error. https://github.com/Dao-AILab/flash-attention/blob/main/setup.py#L271
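A quick way to see the failure outside of pip is to fetch the wheel URL directly. This is only a sketch: the wheel filename below is an example of the naming scheme setup.py generates, not the exact name produced on my machine:

```python
import urllib.request
import urllib.error

# Example filename only: the real wheel_filename is built by setup.py from the
# flash-attn version, CUDA version, torch version, ABI flag and platform.
base = "https://github.com/Dao-AILab/flash-attention/releases/download/"
wheel_url = base + "v2.5.8/flash_attn-2.5.8+cu118torch2.3cxx11abiFALSE-cp311-cp311-win_amd64.whl"

try:
    urllib.request.urlretrieve(wheel_url, "flash_attn.whl")
    print("wheel downloaded")
except urllib.error.HTTPError as e:
    # On Windows this returns 404 Not Found, and setup.py then falls back
    # to building flash-attn from source instead of using a cached wheel.
    print(f"download failed: {e.code} {e.reason}")
```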