lucidrains / taylor-series-linear-attention

Explorations into the recently proposed Taylor Series Linear Attention

[Feature request] Self-attention with Persistent Memory #4


MarcusLoppe commented 4 months ago

I've had great luck using persistent memory (learned memory key / values) in the x-transformers decoder layers, and I think it would be a great addition to linear attention here.
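To make the request concrete, here is a minimal sketch of what persistent memory looks like when bolted onto a linear attention layer, following the `num_mem_kv` pattern from x-transformers. This is an assumption-heavy illustration, not this repo's API: the class name and `num_mem_kv` argument are hypothetical, and the attention math below is plain non-causal linear attention rather than the Taylor series feature map this repo uses. The point is just where the learned memories get concatenated.

```python
import torch
from torch import nn
from einops import rearrange, repeat

class LinearAttnWithPersistentMemory(nn.Module):
    # hypothetical sketch - not the TaylorSeriesLinearAttn implementation
    def __init__(self, dim, heads = 8, dim_head = 64, num_mem_kv = 4):
        super().__init__()
        self.heads = heads
        inner_dim = heads * dim_head
        self.to_qkv = nn.Linear(dim, inner_dim * 3, bias = False)
        self.to_out = nn.Linear(inner_dim, dim, bias = False)

        # learned, input-independent key / value "memories",
        # shared across all positions and prepended to every sequence
        self.mem_kv = nn.Parameter(torch.randn(2, heads, num_mem_kv, dim_head))

    def forward(self, x):
        b = x.shape[0]
        q, k, v = self.to_qkv(x).chunk(3, dim = -1)
        q, k, v = map(lambda t: rearrange(t, 'b n (h d) -> b h n d', h = self.heads), (q, k, v))

        # prepend the persistent memory to keys and values
        mk, mv = repeat(self.mem_kv, 'two h m d -> two b h m d', b = b)
        k = torch.cat((mk, k), dim = -2)
        v = torch.cat((mv, v), dim = -2)

        # plain non-causal linear attention (softmax feature maps),
        # standing in for the Taylor series feature map of this repo
        q = q.softmax(dim = -1)
        k = k.softmax(dim = -2)
        context = torch.einsum('b h n d, b h n e -> b h d e', k, v)
        out = torch.einsum('b h n d, b h d e -> b h n e', q, context)

        out = rearrange(out, 'b h n d -> b n (h d)')
        return self.to_out(out)
```

In the causal case the memories would need slightly different handling (every position should be allowed to attend to them), but the concatenation point would be the same.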

Let me know if I can help!