Dao-AILab / flash-attention

Fast and memory-efficient exact attention
BSD 3-Clause "New" or "Revised" License

No Module Named 'torch' #246

Open MilesQLi opened 1 year ago

MilesQLi commented 1 year ago

When I run pip install flash-attn, it fails with "No module named 'torch'". But torch is installed, so that is obviously wrong. See screenshot.
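A likely cause is that pip builds flash-attn in an isolated build environment that cannot see the torch already installed in your environment. A commonly reported workaround (sketched here, assuming torch is already installed and a CUDA toolchain is available) is to disable build isolation:

pip install torch                              # torch must be importable at build time
pip install flash-attn --no-build-isolation    # build against the existing torch install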

jackbravo commented 1 month ago

Is anyone else having problems on a MacBook Pro M3?

@ManuelSokolov According to the requirements in the README, this project requires CUDA, so I don't think you can run it on an M3. However, usage.md mentions a fork for Apple silicon: https://github.com/philipturner/metal-flash-attention
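A quick, generic way to check whether a CUDA-capable PyTorch setup is actually present before attempting the build (standard torch/CUDA checks, not specific to this project):

python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"   # prints None / False without CUDA
nvcc --version                                                                   # the CUDA compiler is needed to build the extension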

Calandiel commented 1 month ago

The proposed solutions here are (imho) problematic. Using an NVIDIA container is a huge dependency, and I find it hard to trust a build with no isolation (one that could mess up my existing setup) for a library whose installation process already seems messy.
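One lighter-weight way to get some isolation without a container is a plain virtual environment; a rough sketch, assuming a Linux machine with the CUDA toolkit already installed:

python -m venv flash-attn-env                  # keeps the build away from the system Python
source flash-attn-env/bin/activate
pip install torch                              # torch first, so the flash-attn build can import it
pip install flash-attn --no-build-isolation    # same workaround as mentioned above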

albertotono commented 5 days ago

Dear Professor @tridao and team,

I had the same issue with these errors:

[Screenshot 2024-11-08: error output]

I have torch 2.1 and CUDA 12.2, and I am following https://github.com/kuleshov-group/mdlm

In my case, I solved it by adding the wheels directly in the pyproject.toml (a rough sketch is shown after the commands below), and also by running:

sudo poetry add "mamba-ssm @ https://github.com/state-spaces/mamba/releases/download/v2.2.2/mamba_ssm-2.2.2+cu122torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl"
sudo poetry add "causal-conv1d  @ https://github.com/Dao-AILab/causal-conv1d/releases/download/v1.1.3.post1/causal_conv1d-1.1.3.post1+cu122torch2.1cxx11abiTRUE-cp310-cp310-linux_x86_64.whl"

I hope this helps the community.