SehwanChoi0307 / Mask2Map

Apache License 2.0
61 stars 1 forks

query and pos cat issue #11

Open ChiefGodMan opened 1 week ago

ChiefGodMan commented 1 week ago

In the GeometricFeatureExtractor() function, the concatenation order is (query, pos); the code line is below: https://github.com/SehwanChoi0307/Mask2Map/blob/6281f3592503ecf74450d19a30b0e29889ce335f/projects/mmdet3d_plugin/mask2map/modules/transformer_2p.py#L590 . However, in the MaskGuidedMapDecoder() function, the split order is (pos, query); the code line is below: https://github.com/SehwanChoi0307/Mask2Map/blob/6281f3592503ecf74450d19a30b0e29889ce335f/projects/mmdet3d_plugin/mask2map/modules/transformer_2p.py#L752 . Why?
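To make the question concrete, here is a minimal sketch in plain Python, with lists standing in for tensors. The function names `concat_query_pos` and `split_pos_query` and the toy values are hypothetical, not taken from the repository; they only mimic a (query, pos) concatenation followed by a (pos, query) split.

```python
def concat_query_pos(query, pos):
    """Concatenate along the channel axis in (query, pos) order."""
    return query + pos


def split_pos_query(fused):
    """Split the fused vector in half and name the halves (pos, query)."""
    half = len(fused) // 2
    pos, query = fused[:half], fused[half:]
    return pos, query


query = [1.0, 2.0]    # toy content features
pos = [10.0, 20.0]    # toy positional features

fused = concat_query_pos(query, pos)
pos_out, query_out = split_pos_query(fused)

# With no learned layer in between, the two names end up swapped.
print(pos_out == query, query_out == pos)
```

In the real model there are learned layers between the two steps, so this sketch shows only the naming mismatch, not the model's actual behavior.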

SehwanChoi0307 commented 4 days ago

We believe that due to the feature blending caused by the gate function, the order of the positional and feature information at the beginning and end does not matter. Ultimately, as long as there is a consistent direction in the loss, the model can effectively utilize context and positional information.

ChiefGodMan commented 4 days ago

So in your opinion, the value and function of query and query_pos are equivalent?

The two tensors mask_aware_query_feat, mask_aware_query_pos are from PositionalQueryGenerator(): https://github.com/SehwanChoi0307/Mask2Map/blob/6281f3592503ecf74450d19a30b0e29889ce335f/projects/mmdet3d_plugin/mask2map/modules/transformer_2p.py#L524
pts_query_feat and pts_query_pos are split from the same embedding weight, but mask_aware_query_feat_pia and mask_aware_query_pos come from very different modules. The former is produced by multiple Mask2FormerTransformerDecoderLayer layers; the latter is produced by self.decoder_positional_encoding and then multiplied by the sigmoid of the mask_pred value. https://github.com/SehwanChoi0307/Mask2Map/blob/6281f3592503ecf74450d19a30b0e29889ce335f/projects/mmdet3d_plugin/mask2map/modules/transformer_2p.py#L498

Finally, in the MaskGuidedMapDecoder() function, the concatenated tensor is updated as mask_aware_query += self.out_proj(query), so the original (query, query_pos) order is always kept. https://github.com/SehwanChoi0307/Mask2Map/blob/6281f3592503ecf74450d19a30b0e29889ce335f/projects/mmdet3d_plugin/mask2map/modules/transformer_2p.py#L749

Of course, if we change query = mask_aware_query + self.out_proj(attn_output + gate * (self.lin_self(mask_aware_query_norm) - attn_output)) to query = self.out_proj(attn_output + gate * (self.lin_self(mask_aware_query_norm) - attn_output)), then the gate tensor might be able to change the order.
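A rough sketch of the two variants, in plain Python with lists instead of tensors, may help. It assumes out_proj is the identity and uses made-up numbers; `gated_update` is a hypothetical helper, not a function in the repository. It only shows that the gate interpolates element-wise between attn_output and the lin_self output, and that the residual form adds the input back in its original channel layout.

```python
def gated_update(x, attn_output, lin_self_x, gate, residual=True):
    """Element-wise out = attn + gate * (lin - attn), optionally plus x.

    out_proj is treated as the identity, an assumption made only to
    keep the sketch small.
    """
    blended = [a + g * (l - a)
               for a, l, g in zip(attn_output, lin_self_x, gate)]
    if residual:
        # The input is added back in its original channel order.
        return [xi + b for xi, b in zip(x, blended)]
    return blended


x = [1.0, 2.0, 10.0, 20.0]    # toy concatenated (query, query_pos)
attn = [0.5, 0.5, 0.5, 0.5]   # toy attention output
lin = [0.25, 0.25, 0.25, 0.25]  # toy lin_self output

closed = gated_update(x, attn, lin, [0.0] * 4, residual=False)  # gate = 0
opened = gated_update(x, attn, lin, [1.0] * 4, residual=False)  # gate = 1
with_res = gated_update(x, attn, lin, [0.0] * 4, residual=True)

print(closed, opened, with_res)
```

With gate = 0 the blend equals attn_output and with gate = 1 it equals the lin_self output; only the residual variant carries x through unchanged in its (query, query_pos) layout.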