-
I tried to imitate your educational coding style hehe
Here's a pure PyTorch implementation of Flash Attention, hope you like it @karpathy
```python
def flash_attention(Q, K, V, is_causal=True, BLOCK_S…
```
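The snippet above is cut off, so here is a minimal sketch of what a block-wise, pure-PyTorch flash attention can look like. The function name `flash_attention_sketch` and the `block_size` argument are illustrative, not the poster's actual code; the point is the tiled loop with an online softmax (running max `m`, running normalizer `l`) so the full S×S score matrix is never materialized.

```python
import torch

def flash_attention_sketch(Q, K, V, is_causal=True, block_size=64):
    """Tiled attention with an online softmax, O(block) memory per query tile.

    Q, K, V: (batch, heads, seq, head_dim). Illustrative names/signature.
    """
    B, H, S, D = Q.shape
    scale = D ** -0.5
    O = torch.zeros_like(Q)
    for qs in range(0, S, block_size):
        qe = min(qs + block_size, S)
        q = Q[:, :, qs:qe] * scale
        # Running statistics for the online softmax of this query tile.
        m = torch.full((B, H, qe - qs, 1), float("-inf"),
                       dtype=Q.dtype, device=Q.device)   # running row max
        l = torch.zeros((B, H, qe - qs, 1),
                        dtype=Q.dtype, device=Q.device)  # running normalizer
        acc = torch.zeros((B, H, qe - qs, D),
                          dtype=Q.dtype, device=Q.device)  # unnormalized output
        # Causal: key blocks beyond this query tile contribute nothing.
        for ks in range(0, (qe if is_causal else S), block_size):
            ke = min(ks + block_size, S)
            s = q @ K[:, :, ks:ke].transpose(-1, -2)  # (B, H, bq, bk) scores
            if is_causal:
                qi = torch.arange(qs, qe, device=Q.device)[:, None]
                ki = torch.arange(ks, ke, device=Q.device)[None, :]
                s = s.masked_fill(ki > qi, float("-inf"))
            m_new = torch.maximum(m, s.amax(-1, keepdim=True))
            p = torch.exp(s - m_new)          # rescaled block probabilities
            alpha = torch.exp(m - m_new)      # correction for previous blocks
            l = alpha * l + p.sum(-1, keepdim=True)
            acc = alpha * acc + p @ V[:, :, ks:ke]
            m = m_new
        O[:, :, qs:qe] = acc / l
    return O
```

On recent PyTorch versions the result should match `torch.nn.functional.scaled_dot_product_attention` up to floating-point error, which is a handy way to sanity-check the tiling logic.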
-
# Table of Contents
1. [Intuition and underlying principles](#Intuitionandunderlyingprinciples)
2. [Original Paper](#Paperreading)
3. [Implementation details](#Implementation-details)
4. [Fourth …
-
hi all, I saw this tweet and thought of sharing it. The accuracy degradation doesn't look too good, but maybe the speed makes it worth it?
https://x.com/papers_anon/status/1839131401322639805?s=46
…
-
### Is your feature request related to a problem? Please describe.
The current implementation causes issues when loading old model checkpoints during inference as it is not clear whether flash attent…
-
I installed flash-attention following this link: https://rocm.blogs.amd.com/artificial-intelligence/flash-attention/README.html
My GPU is gfx1100 (7900 XTX).
I installed it in the docker and the d…
-
Any plans on upgrading this repo for v2 of [flash-attention](https://github.com/Dao-AILab/flash-attention)?
-
Flash Attention is still listed in the documentation:
https://opennmt.net/CTranslate2/python/ctranslate2.Generator.html
I'd recommend keeping it since you do not have a history of documentation …
-
While I know it's not currently listed in the build instructions, I'm curious if there's been any success in getting Flash Attention to work. Some time ago I was able to build it successfully, but cou…
-
When I try to train a stripedhyena model I keep getting issues with the stripedhyena modules seemingly trying to import modules from Flash Attention in an outdated way.
example:
AttributeError: mod…
-
What is the difference between flash attention and fused attention?