elkay opened 8 months ago
Actually, I just noticed this PR:
https://github.com/Dao-AILab/flash-attention/pull/757
It seems like exactly what I'm looking for. Hopefully the PR gets approved.
Does either #757 or #724 work for you?
#757 did end up working for me.
I want to use this on a newer AWS Graviton server, but every attempt I've made to compile it from source hangs the process completely until I abort it, with no error output. I'm used to long compile times, so I don't believe that's the issue; CPU usage hovers around 1%, so it isn't actually doing anything. I'm not even sure how to go about debugging it.
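A minimal sketch of how one might get more visibility into a hung from-source build. This assumes the usual `pip`-based install path from the flash-attention README; the `MAX_JOBS` variable and the flags shown are standard pip/flash-attn options rather than anything confirmed for Graviton in this thread:

```bash
# Sketch only, not verified on Graviton: run the install with verbose output
# and limited parallelism so a hang can at least be localized to a step.

# MAX_JOBS limits parallel compile jobs (flash-attn's setup honors it);
# -v makes pip stream the build log instead of buffering it;
# --no-build-isolation matches the project's recommended install command.
MAX_JOBS=1 pip install flash-attn --no-build-isolation -v 2>&1 | tee build.log

# If the log simply stops, check whether any compiler process is actually running:
ps aux | grep -E 'nvcc|cc1plus|ninja'
```

If the log stalls before any compiler is launched and `ps` shows nothing doing work, the hang is likely in dependency resolution or build setup rather than compilation itself, which narrows where to look.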