fudan-zvg / SETR

[CVPR 2021] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
MIT License
1.05k stars 149 forks

Question about Figure 8 in paper #31

Closed wqhIris closed 2 years ago

wqhIris commented 3 years ago

Thank you for your nice work! I am wondering how to get the attention map of a picked point. Could you give a simple introduction?

sixiaozheng commented 2 years ago

You can take the corresponding row of the self-attention matrix softmax(QKᵀ), which gives an HW×1 vector, and then reshape it into H×W. For example, if you want the central pixel's attention map, choose the (HW/2)-th row of softmax(QKᵀ).
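A minimal NumPy sketch of this procedure (the function name and the toy attention matrix are illustrative, not from the SETR codebase; a scaled softmax(QKᵀ/√d) works the same way):

```python
import numpy as np

def attention_map_for_pixel(attn, H, W, idx=None):
    """Extract one pixel's attention map from an (HW, HW) self-attention matrix.

    attn: softmax(Q Kᵀ) of shape (H*W, H*W); row i holds pixel i's
          attention weights over all H*W positions.
    idx:  flat index of the picked pixel; defaults to the central one.
    """
    if idx is None:
        idx = (H * W) // 2          # central pixel, as in the answer above
    return attn[idx].reshape(H, W)  # HWx1 row -> HxW map

# Toy example with random attention weights.
H, W = 4, 4
logits = np.random.randn(H * W, H * W)
attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax rows

amap = attention_map_for_pixel(attn, H, W)
print(amap.shape)       # (4, 4)
print(amap.sum())       # ~1.0, since each softmax row sums to 1
```

In a real model you would grab `attn` from the transformer layer of interest (e.g. via a forward hook) and, for multi-head attention, either pick one head or average the heads before reshaping.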

wqhIris commented 2 years ago

Got it, thank you very much!