amindehnavi opened 2 years ago
Have you figured out how to draw the attention maps of the encoder and decoder?
Any update on this?
Unfortunately, no. As mentioned in the paper, the self-attention blocks are the same as in DETR and are instantiated from the `torch.nn.MultiheadAttention` class, but the cross-attention block is built on `MSDeformAttnFunction()`, and I have not figured out how to access the attention weights from it. If you find out how to do that, please tell us :)
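For the standard `nn.MultiheadAttention` blocks (the decoder self-attention, `model.transformer.decoder.layers[i].self_attn`), the forward-hook trick from the DETR visualization notebook should carry over. A minimal sketch, assuming `model` is an already-built Deformable DETR model and `samples` is a preprocessed input batch (both names are placeholders):

```python
import torch

attn_maps = []

def save_attn(module, inputs, output):
    # nn.MultiheadAttention returns (attn_output, attn_output_weights);
    # the weights are only populated when need_weights is left at its
    # default of True, as in the repo's decoder layers.
    attn_maps.append(output[1].detach().cpu())

hooks = [layer.self_attn.register_forward_hook(save_attn)
         for layer in model.transformer.decoder.layers]

with torch.no_grad():
    outputs = model(samples)

for h in hooks:
    h.remove()

# attn_maps[i]: (batch, num_queries, num_queries) for decoder layer i,
# averaged over heads by nn.MultiheadAttention.
```

This only covers the self-attention; the deformable cross-attention has no equivalent weight output, as discussed below.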
The authors provide a PyTorch implementation for debugging purposes. Just use the function `ms_deform_attn_core_pytorch` in `./models/ops/functions/ms_deform_attn_func.py` instead of the CUDA version. You can call it as follows (in class `MSDeformAttn`):
```python
# output = MSDeformAttnFunction.apply(
#     value, input_spatial_shapes, input_level_start_index,
#     sampling_locations, attention_weights, self.im2col_step)
output = ms_deform_attn_core_pytorch(value, input_spatial_shapes, sampling_locations, attention_weights)
```
Just comment out the CUDA-version `MSDeformAttnFunction` call in `class MSDeformAttn(nn.Module)` and use the PyTorch version.
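To actually get at the cross-attention weights, one option is a small patch to `MSDeformAttn.forward` itself: the tensors in question (`attention_weights` and `sampling_locations`) are computed there in plain PyTorch before being handed to the attention kernel, so they can be stashed on the module. A sketch of that patch, with attribute names of my own choosing (`saved_attention_weights` and `saved_sampling_locations` are not part of the repo):

```python
# In MSDeformAttn.forward (./models/ops/modules/ms_deform_attn.py), after
# sampling_locations and attention_weights have been computed:

self.saved_attention_weights = attention_weights.detach().cpu()
# shape: (N, Len_q, n_heads, n_levels, n_points)
self.saved_sampling_locations = sampling_locations.detach().cpu()
# shape: (N, Len_q, n_heads, n_levels, n_points, 2), normalized to [0, 1]

output = ms_deform_attn_core_pytorch(
    value, input_spatial_shapes, sampling_locations, attention_weights)
```

After a forward pass you can then read `model.transformer.decoder.layers[i].cross_attn.saved_attention_weights`. This should also work with the CUDA call left in place, since the tensors are stashed before the kernel is invoked.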
Hi, is there any way to generate the output attention maps of the `model.transformer.decoder.layers[i].cross_attn` layer? When I follow the referenced functions, I eventually get stuck in the `MSDA.ms_deform_attn_forward` function called in the forward method of the `MSDeformAttnFunction` class, which is located in the `./models/ops/functions/ms_deform_attn_func.py` file, and I couldn't find any argument to set to True to get the attention maps in the output.
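As far as I can tell, there is no such argument: deformable attention never materializes a dense query-to-pixel weight matrix, it only attends to `n_points` sampled locations per head and level. The best you can reconstruct is an approximate map by scattering the per-point weights onto the feature-map grid. A minimal sketch, assuming `attn` of shape `(Len_q, n_heads, n_levels, n_points)` and `loc` of shape `(Len_q, n_heads, n_levels, n_points, 2)` come from the stashed tensors above (batch dimension already indexed out), and `spatial_shapes` is the `(n_levels, 2)` tensor of per-level `(H, W)` that the model already builds:

```python
import torch

def deformable_attn_map(attn, loc, spatial_shapes, query_idx, level):
    # Approximate dense attention map for one query at one feature level:
    # scatter each sampled point's weight into its nearest grid cell.
    H, W = spatial_shapes[level].tolist()
    amap = torch.zeros(H, W)
    xy = loc[query_idx, :, level]   # (n_heads, n_points, 2), normalized coords
    w = attn[query_idx, :, level]   # (n_heads, n_points)
    xs = (xy[..., 0] * W).long().clamp(0, W - 1)
    ys = (xy[..., 1] * H).long().clamp(0, H - 1)
    for h in range(xs.shape[0]):
        for p in range(xs.shape[1]):
            amap[ys[h, p], xs[h, p]] += w[h, p]
    return amap  # (H, W) approximate attention map
```

The result is sparse (at most `n_heads * n_points` nonzero cells per query and level), which is inherent to how deformable attention works rather than a limitation of the extraction.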