ZrrSkywalker / MonoDETR

[ICCV 2023] The first DETR model for monocular 3D object detection with depth-guided transformer

Visualizations of attention maps in depth cross-attention #32

Open yangfan293 opened 1 year ago

yangfan293 commented 1 year ago

Hello, is the visualization in Figure 5 produced by directly plotting attn_output_weights.sum(dim=1) / num_heads from the depth cross-attention layer? The attention maps drawn from my trained model look very different from yours.
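
For reference, here is a minimal, framework-free sketch of the arithmetic the expression above describes: average the per-head weights, then reshape one query's row into a 2D map for plotting. It assumes the per-head weights for a single sample are laid out as [num_heads][num_queries][num_keys] and that the key axis is a flattened H x W depth feature map. Neither assumption is confirmed against the MonoDETR code, and the function names `average_heads` and `to_map` are hypothetical. For a batched tensor shaped (N, num_heads, L, S), which is what torch.nn.MultiheadAttention returns when called with average_attn_weights=False, dim=1 is the head axis, so sum(dim=1) / num_heads is the same average.

```python
def average_heads(weights):
    """Average attention weights over heads.

    weights: nested lists indexed [head][query][key] (assumed layout).
    Returns nested lists indexed [query][key].
    """
    num_heads = len(weights)
    num_queries = len(weights[0])
    num_keys = len(weights[0][0])
    return [
        [
            sum(weights[h][q][k] for h in range(num_heads)) / num_heads
            for k in range(num_keys)
        ]
        for q in range(num_queries)
    ]


def to_map(row, height, width):
    """Reshape one query's flat attention row into a height x width grid."""
    if len(row) != height * width:
        raise ValueError("row length does not match height * width")
    return [row[r * width:(r + 1) * width] for r in range(height)]


# Toy example: 2 heads, 1 query, 4 keys (a flattened 2 x 2 feature map).
toy_weights = [
    [[0.5, 0.5, 0.0, 0.0]],  # head 0
    [[0.0, 0.5, 0.5, 0.0]],  # head 1
]
averaged = average_heads(toy_weights)
attention_map = to_map(averaged[0], height=2, width=2)
```

If the weights have already been averaged over heads before they are returned, the tensor has no head axis and dim=1 would be the query axis instead, so checking the shape before summing is a cheap sanity test.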