claCase / Attention-as-RNN

Unofficial implementation of "Attention as an RNN" (https://arxiv.org/pdf/2405.13956). It implements both the efficient associative parallel prefix scan and the recurrent version.
MIT License
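To illustrate how the recurrent version and the prefix scan version relate, here is a minimal pure-Python sketch. It is not taken from this repository: the names `leaf`, `combine`, `readout`, `recurrent_attention`, and `scan_attention` are hypothetical, and it assumes a single query whose attention scores `s_i = q·k_i` have already been computed. Each prefix is summarized by a triple `(m, u, w)`: the running maximum score, the softmax denominator rescaled by that maximum, and the correspondingly rescaled weighted sum of values. Because `combine` is associative, the same operator can be applied one token at a time (RNN style) or in a log-depth scan.

```python
import math

# Hypothetical helper names for illustration; not the repository's API.

def leaf(score, value):
    # State for a single token: max score, denominator, numerator vector.
    return (score, 1.0, list(value))

def combine(a, b):
    # Associative merge of two states, rescaled to the shared maximum
    # so that the exponentials stay numerically stable.
    m_a, u_a, w_a = a
    m_b, u_b, w_b = b
    m = max(m_a, m_b)
    scale_a = math.exp(m_a - m)
    scale_b = math.exp(m_b - m)
    u = u_a * scale_a + u_b * scale_b
    w = [x * scale_a + y * scale_b for x, y in zip(w_a, w_b)]
    return (m, u, w)

def readout(state):
    # Attention output for the prefix summarized by this state.
    _, u, w = state
    return [x / u for x in w]

def recurrent_attention(scores, values):
    # Recurrent form: fold tokens in one at a time with constant-size state.
    state = leaf(scores[0], values[0])
    outputs = [readout(state)]
    for s, v in zip(scores[1:], values[1:]):
        state = combine(state, leaf(s, v))
        outputs.append(readout(state))
    return outputs

def scan_attention(scores, values):
    # Hillis-Steele inclusive scan, simulated sequentially here. Each pass
    # of the while loop corresponds to one parallel step.
    states = [leaf(s, v) for s, v in zip(scores, values)]
    step = 1
    while step < len(states):
        states = [
            combine(states[i - step], states[i]) if i >= step else states[i]
            for i in range(len(states))
        ]
        step *= 2
    return [readout(st) for st in states]
```

Both functions return one output per prefix of the sequence, and they should agree with ordinary softmax attention over the first `k` tokens up to floating-point error. A real implementation would typically run the scan on batched tensors; the list-based version here only shows the structure of the operator.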