BR-IDL / PaddleViT

:robot: PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+
https://github.com/BR-IDL/PaddleViT
Apache License 2.0

TopFormer implementation differs from original reference implementation #231


dominikandreas commented 1 year ago

`sig_act` is computed differently from the original reference implementation. Compare https://github.com/BR-IDL/PaddleViT/blob/55b33c3d11c16f7fe5069cbd85962a68c4867ded/semantic_segmentation/src/models/backbones/top_transformer.py#L330-L334

with

https://github.com/hustvl/TopFormer/blob/2dc253c49ef78742ca6b44e550c5fea63a274288/mmseg/models/backbones/topformer.py#L328

I assume this is not intentional. The fix is straightforward:

    def forward(self, x_local, x_global):
        '''
        x_local: local features (their spatial size sets the output resolution)
        x_global: global features
        '''
        B, C, H, W = x_local.shape
        local_feat = self.local_embedding(x_local)

        # Apply the gating activation to the global branch first, then
        # upsample to the local resolution, matching the reference implementation.
        global_act = self.global_act(x_global)
        sig_act = F.interpolate(self.act(global_act), size=(H, W), mode='bilinear', align_corners=False)

        global_feat = self.global_embedding(x_global)
        global_feat = F.interpolate(global_feat, size=(H, W), mode='bilinear', align_corners=False)

        # Gate the local features with the upsampled activation and add the
        # upsampled global embedding.
        out = local_feat * sig_act + global_feat
        return out
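
For reference, the order of the activation and the upsampling is not interchangeable, which is why the swap matters. Below is a minimal sketch (not part of the fix, and assuming `self.act` is the h-sigmoid used in TopFormer) showing that activating before bilinear interpolation, as the reference does, gives different values than activating afterwards:

    import paddle
    import paddle.nn.functional as F

    # Stand-in for the output of self.global_act(x_global); values are spread
    # across the h-sigmoid's saturated and linear regions on purpose.
    # hardsigmoid here is an assumption about what self.act is in TopFormer.
    global_act = paddle.linspace(-6.0, 6.0, 16).reshape([1, 1, 4, 4])
    target_hw = (8, 8)

    # Reference ordering: activate first, then upsample.
    ref = F.interpolate(F.hardsigmoid(global_act), size=target_hw,
                        mode='bilinear', align_corners=False)

    # Swapped ordering: upsample first, then activate.
    swapped = F.hardsigmoid(F.interpolate(global_act, size=target_hw,
                                          mode='bilinear', align_corners=False))

    print(paddle.allclose(ref, swapped).item())   # False: the orderings are not equivalent
    print(float((ref - swapped).abs().max()))     # non-zero difference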