kijai / ComfyUI-Florence2

Inference Microsoft Florence2 VLM
MIT License

FlashAttention2 has been toggled on, but it cannot be used due to the following error: the package flash_attn seems to be not installed. Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2. #17


alenhrp commented 1 week ago

[screenshot of the FlashAttention2 warning in the ComfyUI console]

What is causing this, and how do I fix it? Thank you.

kijai commented 1 week ago

It's not installed. I wouldn't worry about it for Florence2, though; I don't notice any performance difference compared to "sdpa" myself.
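
For reference, here is a minimal sketch (not the node's actual code; the model id and names are assumptions for illustration) of how the attention backend is typically chosen when loading Florence-2 through transformers, falling back to "sdpa" whenever flash_attn isn't importable:

```python
# Minimal sketch, not the node's actual code: pick the attention backend for
# Florence-2 depending on whether flash_attn is importable, otherwise use
# PyTorch's built-in scaled-dot-product attention ("sdpa").
import importlib.util

import torch
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-large"  # assumed model id, for illustration only

attn_impl = "flash_attention_2" if importlib.util.find_spec("flash_attn") else "sdpa"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    attn_implementation=attn_impl,
    trust_remote_code=True,  # Florence-2 ships custom modeling code
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
```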

nero-dv commented 1 week ago

I have compiled a .whl of flash-attn for Windows (Python 3.11.8 + PyTorch 2.3.0+cu121), in case you still want to try it with Flash Attention 2 for some reason.
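
If you go the prebuilt-wheel route, a quick sanity check along these lines (just a sketch, nothing Florence2-specific) confirms that flash_attn imports cleanly and that your interpreter and torch build match what the wheel was compiled against:

```python
# Quick environment sanity check (a sketch): a prebuilt flash-attn wheel only
# works if the Python version, torch version, and CUDA build it was compiled
# for match the ones actually running. The comments reflect the wheel above.
import sys

import torch

print("python    :", sys.version.split()[0])   # wheel targets 3.11.8
print("torch     :", torch.__version__)        # wheel targets 2.3.0+cu121
print("cuda      :", torch.version.cuda)       # expects 12.1

try:
    import flash_attn
    print("flash_attn:", flash_attn.__version__)
except ImportError as exc:
    print("flash_attn not importable:", exc)
```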