yangli18 / VLTVG

Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning, CVPR 2022

Visualize the attention map for a point #4

Closed by fawnliu 2 years ago

fawnliu commented 2 years ago

Thank you for your great work.
Could you tell me how to visualize the attention map for a single point, as in Fig. 4, or share the code for it?

yangli18 commented 2 years ago

@scarleatt Hi! During inference, the context encoder computes a language-guided self-attention map for the input image, with shape HWxHW, where H and W are the height and width of the feature map. You can first cache the entire self-attention map, then take the row that corresponds to the location of the point you are interested in. That row has HW entries; reshape it to HxW and you can visualize it.
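For readers who want something concrete, here is a minimal sketch of the indexing and reshaping step described above. It is not the repository's code. It assumes the self-attention map has already been cached as a 2-D array of shape (H*W, H*W), already averaged over heads, with queries along the rows and with the feature map flattened in row-major order, so that pixel (y, x) maps to index y*W + x. The function name `point_attention_map` and the random toy attention are hypothetical and only there for illustration.

```python
import numpy as np


def point_attention_map(attn, h, w, y, x):
    """Return the (h, w) attention map for the query at pixel (y, x).

    Assumptions (not taken from the VLTVG code):
      * attn has shape (h*w, h*w), with queries along the rows
      * the feature map was flattened in row-major order
    """
    attn = np.asarray(attn)
    if attn.shape != (h * w, h * w):
        raise ValueError(f"expected shape {(h * w, h * w)}, got {attn.shape}")
    if not (0 <= y < h and 0 <= x < w):
        raise IndexError(f"point ({y}, {x}) is outside the {h}x{w} feature map")
    idx = y * w + x  # row-major index of the query point
    return attn[idx].reshape(h, w)


# Toy stand-in for a cached attention map: a row-wise softmax over random logits.
h, w = 3, 4
rng = np.random.default_rng(0)
logits = rng.normal(size=(h * w, h * w))
attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Attention of the query at row 1, column 2 over all feature-map positions.
point_map = point_attention_map(attn, h, w, y=1, x=2)
```

The resulting `point_map` can then be shown with any image viewer, for example `matplotlib.pyplot.imshow(point_map)`, optionally after upsampling it to the input image size and overlaying it on the image. If the cached tensor is laid out differently (for instance with a batch or head dimension, or keys along the rows), the indexing needs to be adjusted accordingly.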

fawnliu commented 2 years ago

Thank you very much for the quick reply. I'll give it a try.