chengchingwen / Transformers.jl

Julia Implementation of Transformer models
MIT License

Feature Request: Memory-Efficient Attention #79

Open ToucheSir opened 2 years ago

ToucheSir commented 2 years ago

From https://arxiv.org/abs/2112.05682v2 ("Self-attention Does Not Need O(n²) Memory"). I have no immediate use for this, but it looks cool and I didn't want it to go unmentioned in case some aspiring contributor to Transformers.jl is looking for a project :)
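For anyone picking this up: the paper's core trick is to process keys/values in chunks while carrying running softmax statistics (a running max, normalizer, and weighted-value sum), so the full n×n score matrix is never materialized. A minimal, language-agnostic NumPy sketch of that idea (function name and chunk size are hypothetical, not anything from Transformers.jl or NeuralAttentionlib):

```python
import numpy as np

def chunked_attention(q, k, v, chunk=64):
    # Sketch of memory-efficient attention (arXiv:2112.05682):
    # iterate over key/value chunks, keeping running softmax stats
    # so only O(chunk) scores exist at any time.
    n, d = k.shape
    scale = 1.0 / np.sqrt(d)
    m = np.full(q.shape[0], -np.inf)          # running max of scores
    l = np.zeros(q.shape[0])                  # running softmax normalizer
    acc = np.zeros((q.shape[0], v.shape[1]))  # running weighted value sum
    for start in range(0, n, chunk):
        kc, vc = k[start:start + chunk], v[start:start + chunk]
        s = (q @ kc.T) * scale                # scores for this chunk only
        m_new = np.maximum(m, s.max(axis=1))
        corr = np.exp(m - m_new)              # rescale old accumulators
        p = np.exp(s - m_new[:, None])
        l = l * corr + p.sum(axis=1)
        acc = acc * corr[:, None] + p @ vc
        m = m_new
    return acc / l[:, None]
```

With a query-side chunking loop added on top, this gives the O(√n) memory bound the paper describes; the sketch above only shows the key/value side.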

ToucheSir commented 2 years ago

Upon finding NeuralAttentionlib, perhaps this would be better discussed there? Feel free to move the issue :)

chengchingwen commented 2 years ago

I'll keep the issue here for now because we get more attention here (pun intended).