claCase / Attention-as-RNN

Unofficial implementation of "Attention as an RNN" (https://arxiv.org/pdf/2405.13956). Implements both an efficient associative parallel prefix scan version and a recurrent version.
MIT License
17 stars, 0 forks

Merge Linear Attention RNN #4

Closed by claCase 3 months ago
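The repository description mentions two formulations, a recurrent one and one based on an associative parallel prefix scan. As a minimal sketch of how they relate (this is not the repository's code; `leaf`, `combine`, `attention_recurrent`, and `attention_reference` are hypothetical names, and the query-key scores are assumed to be precomputed), each prefix can be summarized by a running maximum score, a softmax denominator, and a weighted value sum. Merging two such summaries is associative, so the same operator can drive either a step-by-step recurrence or a parallel scan:

```python
import math

def leaf(score, value):
    """State for a single token: (max score, denominator, weighted value sum)."""
    return (score, 1.0, list(value))

def combine(a, b):
    """Associative merge of two partial softmax-attention states.

    Both sides are rescaled to the shared maximum so exp() never overflows.
    """
    m_a, u_a, w_a = a
    m_b, u_b, w_b = b
    m = max(m_a, m_b)
    scale_a, scale_b = math.exp(m_a - m), math.exp(m_b - m)
    u = u_a * scale_a + u_b * scale_b
    w = [x * scale_a + y * scale_b for x, y in zip(w_a, w_b)]
    return (m, u, w)

def attention_recurrent(scores, values):
    """Sequentially fold tokens in, returning the attention output per prefix."""
    state = leaf(scores[0], values[0])
    outputs = [[x / state[1] for x in state[2]]]
    for s, v in zip(scores[1:], values[1:]):
        state = combine(state, leaf(s, v))
        outputs.append([x / state[1] for x in state[2]])
    return outputs

def attention_reference(scores, values):
    """Plain softmax attention over all tokens, used only as a check."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / z for d in range(dim)]
```

The sketch applies `combine` sequentially. Because the operator is associative, a parallel prefix scan could apply it in a tree-shaped order and produce the same prefix outputs, which is the property the scan-based version relies on.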