apple / ml-cvnets

CVNets: A library for training computer vision networks
https://apple.github.io/ml-cvnets

Cross Attention Computation in LinearSelfAttention() #81

Open goutamyg opened 1 year ago

Hi,

I have a question regarding the computation of cross-attention in https://github.com/apple/ml-cvnets/blob/main/cvnets/layers/linear_attention.py#L163

Here the Query and Key are generated from the input x_prev, and the Value is generated from the input x. In general, however, the Query is generated from one of the inputs, and the other input is used to generate both the Key and the Value, for example: https://vaclavkosar.com/images/cross-attention-in-transformer-architecture.png
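To make the difference concrete, here is a minimal NumPy sketch of the two arrangements as I understand them. It is not the code from the repository: the function names, weight matrices, and shapes are hypothetical, tokens are flattened to 2-D arrays, and the second function assumes a separable (linear) attention form in which the query is a single scalar per token of x_prev.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def conventional_cross_attention(x, x_prev, w_q, w_k, w_v):
    """Usual arrangement: Query from x, Key and Value from x_prev."""
    q = x @ w_q        # [N, d]
    k = x_prev @ w_k   # [M, d]
    v = x_prev @ w_v   # [M, d]
    scores = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # [N, M]
    return scores @ v  # [N, d]


def linear_cross_attention(x, x_prev, w_q, w_k, w_v):
    """Arrangement described above: Query and Key from x_prev, Value from x.

    Assumption: the query projects each x_prev token to one scalar.
    """
    q = x_prev @ w_q   # [M, 1]
    k = x_prev @ w_k   # [M, d]
    v = x @ w_v        # [N, d]
    context_scores = softmax(q, axis=0)  # [M, 1], sums to 1 over the M tokens
    context_vector = (context_scores * k).sum(axis=0, keepdims=True)  # [1, d]
    # The single context vector is broadcast over all N tokens of x.
    return np.maximum(v, 0.0) * context_vector  # [N, d]


# Hypothetical sizes: N tokens in x, M tokens in x_prev, c input channels, d embed dim.
rng = np.random.default_rng(0)
N, M, c, d = 5, 7, 8, 4
x = rng.standard_normal((N, c))
x_prev = rng.standard_normal((M, c))

out_conv = conventional_cross_attention(
    x, x_prev,
    rng.standard_normal((c, d)), rng.standard_normal((c, d)), rng.standard_normal((c, d)),
)
out_lin = linear_cross_attention(
    x, x_prev,
    rng.standard_normal((c, 1)), rng.standard_normal((c, d)), rng.standard_normal((c, d)),
)
print(out_conv.shape, out_lin.shape)
```

In this sketch both variants return one row per token of x. The difference is that the first mixes the Values of x_prev with a separate weight for every (x, x_prev) token pair, while the second collapses x_prev into a single context vector that rescales the Values of x.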

Can you please help me understand the idea behind your implementation of cross-attention?