IDEA-Research / detrex

detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
https://detrex.readthedocs.io/en/latest/
Apache License 2.0
2.01k stars, 209 forks

Difference of MaskDINO implementation between `detrex` and `MaskDINO` repo #172

Closed (YanShuang17 closed this issue 1 year ago)

YanShuang17 commented 1 year ago

Location: xxx/modeling/pixel_decoder/maskdino_encoder.py, in the `forward` function of `MSDeformAttnTransformerEncoderLayer`:

MaskDINO implementation:

def forward(self, src, pos, reference_points, spatial_shapes, level_start_index, padding_mask=None):
    # self attention
    src2 = self.self_attn(self.with_pos_embed(src, pos), reference_points, src, spatial_shapes, level_start_index, padding_mask)
    src = src + self.dropout1(src2)
    src = self.norm1(src)

    # ffn
    src = self.forward_ffn(src)

    return src

detrex implementation (the residual connection appears to be lost):

def forward(self, src, pos, reference_points, spatial_shapes, level_start_index, padding_mask=None):
    # self attention
    src2 = self.self_attn(query=src, query_pos=pos, reference_points=reference_points, value=src, spatial_shapes=spatial_shapes, level_start_index=level_start_index, key_padding_mask=padding_mask)
    src = src2
    src = self.norm1(src)

    # ffn
    src = self.forward_ffn(src)

    return src

@HaoZhang534

HaoZhang534 commented 1 year ago

Hi @YanShuang17, this difference comes from the different implementations of deformable attention in the original MaskDINO repo and in detrex. In detrex, the deformable attention module itself adds the positional query embedding and applies the residual connection, so the encoder layer does not need to repeat either step.
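
To make the point above concrete, here is a minimal pure-Python sketch of the two arrangements. It is not the real code from either repo: `core_attention` is a made-up stand-in for the deformable sampling step, the function names and the `identity` argument are assumptions for illustration, and dropout is left out (as in eval mode). It only shows that adding the positional embedding and the residual inside the attention call gives the same pre-norm result as adding them in the encoder layer.

```python
# Toy stand-ins only: every name below is hypothetical, not the real
# detrex or MaskDINO API. Dropout is omitted (equivalent to eval mode).

def core_attention(query, value):
    """Placeholder for the deformable-attention sampling step (a fixed mix)."""
    return [0.5 * q + 0.25 * v for q, v in zip(query, value)]


def maskdino_style_attn(query, value):
    """Attention that returns only the attended features (no residual inside)."""
    return core_attention(query, value)


def detrex_style_attn(query, value, query_pos=None, identity=None):
    """Attention that adds query_pos and the residual connection itself."""
    if identity is None:
        identity = query  # residual is taken from the query before adding query_pos
    if query_pos is not None:
        query = [q + p for q, p in zip(query, query_pos)]
    out = core_attention(query, value)
    return [o + i for o, i in zip(out, identity)]


def maskdino_style_layer(src, pos):
    # Positional embedding and residual are both applied by the encoder layer.
    src2 = maskdino_style_attn([s + p for s, p in zip(src, pos)], src)
    return [s + s2 for s, s2 in zip(src, src2)]


def detrex_style_layer(src, pos):
    # Both steps happen inside the attention call, so the layer just forwards it.
    return detrex_style_attn(query=src, value=src, query_pos=pos)


src = [1.0, -2.0, 3.0]
pos = [0.1, 0.2, 0.3]
pre_norm_a = maskdino_style_layer(src, pos)
pre_norm_b = detrex_style_layer(src, pos)
same = all(abs(a - b) < 1e-12 for a, b in zip(pre_norm_a, pre_norm_b))
print(same)
```

Under these assumptions, `src = src2` in the detrex layer is not a dropped residual, because the value returned by the attention call already contains it.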

YanShuang17 commented 1 year ago

OK, got it, thanks.