Closed. MaureenZOU closed this issue 3 years ago.
Hello, have you generated an attention map like the one in Fig. 1? @MaureenZOU
The problem was solved by the explanation in Section 3.4, paragraph "Comparison to DETR". Instead of measuring the similarity with memory + positional encoding, the authors measure the similarity between the positional encodings only.
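For anyone else trying to reproduce this, below is a minimal sketch (my own reconstruction, not the authors' released code) of what "similarity between the position encodings" could look like. Here `q_pos` is assumed to be the positional embedding of one object query and `k_pos` the flattened 2D sine positional encoding of the encoder memory; the names and the scaling are my assumptions.

```python
import torch

def pos_only_attention_map(q_pos, k_pos, h, w):
    """Attention map computed from positional encodings only (no content features).

    q_pos: (d,)      positional embedding of a single object query
    k_pos: (h*w, d)  flattened 2D sine positional encoding of the memory
    Returns an (h, w) softmax-normalized similarity map.
    """
    d = q_pos.shape[-1]
    scores = k_pos @ q_pos / d ** 0.5   # scaled dot-product between positional encodings
    return scores.softmax(dim=-1).view(h, w)
```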
@MaureenZOU Could you please kindly provide the source code for visualizing the attention map? That would be greatly helpful. Thanks a lot.
Hi, @GWwangshuo @MaureenZOU @SISTMrL,
Thank you for your attention, and sorry for the late reply. We have not released the visualization code yet, since we found it is not easy to write a neat and clean version of it. Once we finish rewriting this part of the code, we will make a release (there is no fixed schedule yet; the authors are busy with recent deadlines).
Here is a brief guide:
Hello, when I tried to visualize DETR, I first read the self-attention output of the last decoder layer to get cq: [100, 1, 256]; in addition, pq is read from the trained model: [100, 256]; then I get the pk of the feature map: [1, 256, h, w]; then I calculate ((cq + pq)^T * pk).softmax(-1).view(h, w), but I found the result is inconsistent with the paper. I really hope to get your reply.
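If I understand the computation above correctly, it corresponds roughly to the sketch below (the shapes follow the description; the random tensors, the scaling, and the choice of query index are placeholders). Note that this skips the learned q/k projections and the per-head split that the actual cross-attention applies, which might be one source of the discrepancy.

```python
import torch

# Shapes taken from the description above (one image, 100 queries, d = 256).
cq = torch.randn(100, 1, 256)     # decoder self-attention output (content queries)
pq = torch.randn(100, 256)        # learned query positional embeddings
pk = torch.randn(1, 256, 32, 32)  # positional encoding of the feature map (h = w = 32 here)

h, w = pk.shape[-2:]
q = cq.squeeze(1) + pq                        # (100, 256) content + position
k = pk.flatten(2).squeeze(0).t()              # (h*w, 256)
attn = (q @ k.t() / 256 ** 0.5).softmax(-1)   # (100, h*w)
attn_map = attn[0].view(h, w)                 # map for the first object query
```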
Hi Author,
First, thanks for your great work on improving the convergence speed of DETR by such a large margin. While reading the paper, I got a little bit confused about how exactly you draw the attention maps in Figure 1.
Given an object query q (1 x d) and the memory features m (d x (hw)), I use the following equation to draw the attention maps:
Similarity(q, m) = Softmax(proj(q) · proj(m)), of shape 1 x (hw), where proj is the trained linear layer in the cross-attention module.
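A simplified sketch of how that computation could be written in PyTorch (my own code, not necessarily what the authors did; the q/k projection splitting follows how nn.MultiheadAttention stores `in_proj_weight`, and averaging over heads is just one possible way to get a single map):

```python
import torch
import torch.nn.functional as F

def cross_attn_map(q, memory, attn_module, h, w, head=None):
    """Similarity between one object query and the encoder memory.

    q:           (d,)    a single object query (content + positional embedding)
    memory:      (hw, d) flattened encoder memory (positional encoding added)
    attn_module: the decoder layer's cross-attention nn.MultiheadAttention
    head:        if given, return the map of that head; otherwise average heads
    """
    d = q.shape[-1]
    w_q, w_k = attn_module.in_proj_weight[:d], attn_module.in_proj_weight[d:2 * d]
    b_q, b_k = attn_module.in_proj_bias[:d], attn_module.in_proj_bias[d:2 * d]
    num_heads = attn_module.num_heads
    head_dim = d // num_heads

    q_proj = F.linear(q, w_q, b_q).view(num_heads, head_dim)           # (nh, hd)
    k_proj = F.linear(memory, w_k, b_k).view(-1, num_heads, head_dim)  # (hw, nh, hd)

    scores = torch.einsum('nd,knd->nk', q_proj, k_proj) / head_dim ** 0.5  # (nh, hw)
    attn = scores.softmax(dim=-1)
    attn = attn[head] if head is not None else attn.mean(0)
    return attn.view(h, w)
```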
The attention maps I get are quite similar to the ones shown in the DETR paper:
A random object query:
A random object query on head A:
A random object query on head B:
A random object query on head C:
Could you please give some information on how the attention maps in Figure 1 are generated? Thanks!
Hello, have you looked into how to visualize the attention weights of Deformable-DETR? I have been using the visualization code provided by DETR but cannot get correct results.