cmsflash / efficient-attention

An implementation of the efficient attention module.
https://arxiv.org/abs/1812.01243
MIT License

Applying efficient-attention in cross-attention #13

Open stanny880913 opened 1 month ago

stanny880913 commented 1 month ago

Hello, I recently implemented cross-attention for multi-modal fusion, but because the image resolution is very large, a CUDA OOM error occurs when computing the attention between q and k. I found your paper and hope to use it to reduce the memory and compute consumption. Can your method be applied to cross-attention? Is it equivalent to computing k and v from input2 in advance, and then multiplying the resulting matrix with q from input1? Thank you.

cmsflash commented 1 month ago

Yes, efficient attention does apply to cross-attention. And yes, your understanding is correct.
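For reference, a minimal sketch of this idea in PyTorch (the function name, tensor shapes, and variable names here are illustrative, not the repo's API). Following the paper's formulation E(Q, K, V) = ρ_q(Q) (ρ_k(K)ᵀ V), the (d_k × d_v) context matrix is computed once from input2's k and v, and then applied to each query from input1; the full n_q × n_kv attention map is never materialized, which is what avoids the OOM:

```python
import torch
import torch.nn.functional as F


def efficient_cross_attention(q, k, v):
    """Efficient attention (arXiv:1812.01243) used as cross-attention.

    q: queries from input1, shape (batch, n_q, d_k)
    k: keys from input2,    shape (batch, n_kv, d_k)
    v: values from input2,  shape (batch, n_kv, d_v)

    Instead of softmax(Q K^T) V, which needs an (n_q, n_kv) score
    matrix, this computes rho_q(Q) @ (rho_k(K)^T @ V), whose
    intermediate is only (d_k, d_v).
    """
    q = F.softmax(q, dim=-1)  # rho_q: softmax over the channel dim
    k = F.softmax(k, dim=1)   # rho_k: softmax over key positions

    # Global context from input2, computed once: (batch, d_k, d_v).
    context = torch.einsum('bnd,bne->bde', k, v)

    # Apply the context to input1's queries: (batch, n_q, d_v).
    return torch.einsum('bnd,bde->bne', q, context)


# Illustrative usage: high-resolution image tokens attend to a
# smaller set of tokens from another modality.
b, n_img, n_other, d = 2, 128 * 128, 64, 32
q = torch.randn(b, n_img, d)    # from input1
k = torch.randn(b, n_other, d)  # from input2
v = torch.randn(b, n_other, d)  # from input2
out = efficient_cross_attention(q, k, v)  # (2, 16384, 32)
```

Note that this sketch omits the multi-head split and the output projection that a full module would include; the point is that the memory cost scales with d_k * d_v rather than with n_q * n_kv.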