Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Has anyone successfully compiled this on ARM Linux (aarch64)? #879

Open · elkay opened this issue 8 months ago

elkay commented 8 months ago

I want to use this on a newer AWS Graviton server, but every attempt I've made to compile it from source has hung the process until I abort, with no error output. I'm used to long compile times and don't believe that's the issue; CPU usage hovers around 1%, so it's not actually doing anything. I'm not even sure how to go about debugging it.
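Before digging into the build itself, it may be worth confirming what the setup script will actually see on the Graviton box; a minimal sanity-check sketch, assuming PyTorch is already installed:

```python
import platform

import torch

# Confirm the interpreter is really running on aarch64.
print(platform.machine())        # expect 'aarch64' on Graviton

# flash-attn builds against the installed PyTorch, which must be CUDA-enabled.
print(torch.__version__)
print(torch.version.cuda)        # None here means a CPU-only PyTorch wheel
print(torch.cuda.is_available())
```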

elkay commented 8 months ago

Actually, I just noticed this PR:

https://github.com/Dao-AILab/flash-attention/pull/757

Seems like exactly what I'm looking for. Hopefully the PR can be merged.
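For reference, pip can build directly from an unmerged PR's git ref, since GitHub exposes every pull request at `refs/pull/<number>/head`; a hedged sketch, using the `--no-build-isolation` flag the project README recommends for source builds:

```python
import subprocess
import sys

# Build flash-attention straight from the PR branch. GitHub exposes every
# pull request at refs/pull/<number>/head, and pip can fetch arbitrary refs.
subprocess.check_call([
    sys.executable, "-m", "pip", "install",
    "--no-build-isolation",  # the README recommends this for source builds
    "git+https://github.com/Dao-AILab/flash-attention.git@refs/pull/757/head",
])
```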

tridao commented 8 months ago

Does either #757 or #724 work for you?

elkay commented 8 months ago

> Does either #757 or #724 work for you?

#757 did end up working for me.
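Once a build from #757 (or a similar branch) succeeds, a quick smoke test is a single forward pass through `flash_attn_func`; a minimal sketch, assuming an Ampere-or-newer CUDA GPU and half-precision inputs:

```python
import torch
from flash_attn import flash_attn_func

# flash_attn_func expects (batch, seqlen, nheads, headdim) layout in fp16/bf16.
q = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)

out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # torch.Size([2, 128, 8, 64])
```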