tusen-ai / MV2D

Code for "Object as Query: Lifting any 2D Object Detector to 3D Detection"

question about sparse cross attention module #14

Open ShengYu724 opened 9 months ago

ShengYu724 commented 9 months ago

Thank you for your great work!

In the article, a sparse cross attention module is proposed, but I don't understand how you implement it. It seems that you don't split the queries into N groups and run N separate attention operations. I think you may instead use a cross-attention mask and a key-padding mask so that each query only attends to its selected object features. Is that true?
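To make the guess concrete: this is not the authors' implementation, just a minimal PyTorch sketch of the masking approach described above, where a boolean `attn_mask` restricts each query to a hypothetical set of selected object features (all sizes and indices here are made up for illustration).

```python
import torch
import torch.nn as nn

# Hypothetical sizes: 4 queries, 6 object features, embed dim 16, batch size 1.
num_q, num_k, dim = 4, 6, 16
queries = torch.randn(num_q, 1, dim)    # (L, N, E)
obj_feats = torch.randn(num_k, 1, dim)  # (S, N, E)

# Boolean attn_mask of shape (L, S): True means the position is NOT attended.
# Each query is restricted to 2 selected object features (indices made up).
selected = torch.tensor([[0, 1], [2, 3], [1, 4], [3, 5]])
attn_mask = torch.ones(num_q, num_k, dtype=torch.bool)
attn_mask.scatter_(1, selected, False)  # unmask only the selected keys

attn = nn.MultiheadAttention(dim, num_heads=4)
out, weights = attn(queries, obj_feats, obj_feats, attn_mask=attn_mask)

# weights: (N, L, S) averaged over heads; masked positions get weight 0.
print(weights.shape)
```

Note that this still materializes the full (L, S) score matrix before the softmax, which is exactly the complexity concern raised below.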

But using an attn-mask still incurs the full computational cost of the attention operation, since each query must still compute scores against all object features before masking. I wonder whether you use a different implementation, or whether my guess is right.
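For contrast, a truly sparse variant would gather only each query's selected features and attend over that small set, so the cost scales with the number of selected keys rather than all object features. Again a hedged sketch, not the paper's code; the gather-based single-head attention below is an assumption about how such sparsity could be realized.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes: 4 queries, 100 object features, 2 selected per query.
num_q, num_k, k_sel, dim = 4, 100, 2, 16
queries = torch.randn(num_q, dim)
obj_feats = torch.randn(num_k, dim)
selected = torch.randint(0, num_k, (num_q, k_sel))  # per-query key indices

# Gather only the selected features: (num_q, k_sel, dim).
keys = obj_feats[selected]

# Single-head scaled dot-product attention over the small selected set,
# so each query touches k_sel keys instead of all num_k.
scores = torch.einsum('qd,qkd->qk', queries, keys) / dim ** 0.5
weights = F.softmax(scores, dim=-1)
out = torch.einsum('qk,qkd->qd', weights, keys)
print(out.shape)  # torch.Size([4, 16])
```

The trade-off is the extra gather and the loss of a single batched matmul over all keys, which is why masked dense attention is often used in practice despite its higher FLOP count.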