Open vadimkantorov opened 3 years ago
have you gotten results?
I've found an obscure repo with an example of this https://github.com/duongnv0499/Explain-Deformable-DETR/, but haven't tried it yet
Have you figured out how to draw this attention map correctly? Thanks @vadimkantorov
Have you figured out how to draw this attention map correctly? Thanks
Waiting for an update! My approach was to create an array of zeros matching the feature map size, then extract the sampling locations and attention weights. Then, for each sampling location, add the corresponding attention weight into the array, so the more heavily a location is attended to, the higher its value in the plot. (This is just my idea.)
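A minimal sketch of that idea, assuming the sampling locations are normalised (x, y) pairs in [0, 1]; all names here are illustrative, not taken from the Deformable DETR code:

```python
import numpy as np

def accumulate_attention(sampling_locations, attention_weights, feat_h, feat_w):
    """Splat attention weights onto a zero map of the feature-map size.

    sampling_locations: (N, 2) array of normalised (x, y) points in [0, 1]
    attention_weights:  (N,) array of weights, one per sampling point
    """
    heat = np.zeros((feat_h, feat_w), dtype=np.float32)
    # Convert normalised coordinates to integer pixel indices.
    xs = np.clip((sampling_locations[:, 0] * feat_w).astype(int), 0, feat_w - 1)
    ys = np.clip((sampling_locations[:, 1] * feat_h).astype(int), 0, feat_h - 1)
    # Accumulate (np.add.at handles repeated indices correctly), so
    # heavily attended cells end up with larger values.
    np.add.at(heat, (ys, xs), attention_weights)
    return heat

# Example: two sampling points near the centre of an 8x8 feature map.
locs = np.array([[0.5, 0.5], [0.52, 0.5]])
weights = np.array([0.7, 0.3])
heat = accumulate_attention(locs, weights, 8, 8)  # heat[4, 4] == 1.0
```

The resulting map can then be overlaid on the input image with e.g. `matplotlib`'s `imshow` after resizing it to the image resolution.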
Hey, did you manage it in the end? I am trying to do it for DINO, which uses the attention mechanism as well :)
Have you figured out how to draw this attention map correctly? Thanks @vadimkantorov @GivanTsai
Can you do it? I'm also stuck on the Attention visualisation!
I did it like this, but it's kind of hacky: if you want to visualise it, you'll need the reference points, i.e. where the model is attending to.
Simple hooks on the PyTorch model were not sufficient, as they only extracted the layer weights.
Therefore I modified the attention layer and pushed the sampling locations into a global list (make sure to detach them and move them off the GPU). I then visualised this list. Depending on the model, you'd also have to export the indices of the top-k points that were mapped to the actual detections; otherwise, the attention is visualised for the wrong objects.
I hope this helps, feel free to ask though.
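For what it's worth, the recording step above can be sketched like this. This is a generic forward-wrapping pattern demonstrated on a dummy module; in Deformable DETR you would instead edit `MSDeformAttn.forward` itself and append `sampling_locations.detach().cpu()` right where they are computed. The attribute name `last_sampling_locations` and the dummy class are purely illustrative:

```python
import torch

SAMPLING_RECORDS = []  # global store filled during the forward pass

def record_sampling(module, attr_name="last_sampling_locations"):
    """Wrap module.forward so that after each call a tensor the module
    exposes is stashed (detached, on CPU) into SAMPLING_RECORDS."""
    original_forward = module.forward

    def forward(*args, **kwargs):
        out = original_forward(*args, **kwargs)
        locs = getattr(module, attr_name, None)
        if locs is not None:
            # Detach from the graph and move off the GPU before storing,
            # otherwise the whole computation graph is kept alive.
            SAMPLING_RECORDS.append(locs.detach().cpu())
        return out

    module.forward = forward
    return module

# Dummy stand-in for an attention layer that exposes its sampling points.
class DummyAttn(torch.nn.Module):
    def forward(self, x):
        self.last_sampling_locations = torch.rand(4, 2)
        return x

attn = record_sampling(DummyAttn())
attn(torch.zeros(1))  # SAMPLING_RECORDS now holds one (4, 2) tensor
```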
Thank you for your reply. I have some questions: I am using the DINO detection model in the mmdetection framework, and I would like to visualise it this way. I can now get the reference_points and sampling_locations, but visualising them is a bit difficult. What do you mean by the locations and indices in your reply? If possible, could you provide the code for your visualisation? Thank you very much!
If you look at the DINO architecture (https://arxiv.org/abs/2203.03605), you will find a top-k query selection, as the model output (at least in detrex) is ordered by relevance. This reordering happens after the decoder, so your extracted points will be in a different order.
Based on your confidence threshold, you will probably end up with fewer than 20 or so detections.
Also, try to plot it separately first, and then match colours with the respective objects. I can't provide you code, since I don't have repo access anymore, and I also used detectron2.
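The reordering step might look like the following sketch. It assumes you have the per-query sampling points in decoder order plus the top-k indices the model used for its final output; the shapes and names are assumptions, not the actual DINO/detrex API:

```python
import numpy as np

def reorder_by_topk(points_per_query, topk_indices):
    """Reorder extracted per-query sampling points so they line up with
    the model's final, relevance-ordered detections.

    points_per_query: (num_queries, num_points, 2) array in decoder order
    topk_indices:     indices of the queries kept as detections,
                      in the order the model outputs them
    """
    # Fancy indexing selects and reorders the rows in one step.
    return points_per_query[topk_indices]

# Example: 5 decoder queries, top-3 selected in relevance order 3, 0, 4.
pts = np.arange(5 * 2 * 2).reshape(5, 2, 2).astype(np.float32)
sel = reorder_by_topk(pts, np.array([3, 0, 4]))  # sel[0] is pts[3], etc.
```

With the points in output order, you can plot detection i and its sampling points with the same colour and be sure they actually belong together.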
Could you please publish the code for visualizations like those in Figure 6 of the paper, if you still have the snippet?
Thank you!