I noticed that in the code, in MSCASpatialAttention, there is the following forward function:
def forward(self, x):
    """Forward function."""
    shortcut = x.clone()
    x = self.proj_1(x)               # 1x1 convolution
    x = self.activation(x)           # GELU
    x = self.spatial_gating_unit(x)  # MSCAAttention
    x = self.proj_2(x)               # 1x1 convolution
    x = x + shortcut                 # residual connection
    return x
As we can see, after x goes through a 1x1 convolution, the activation (GELU), MSCAAttention, and another 1x1 convolution, the shortcut is added back to x.
But in the figure for the Attention block (corresponding to the MSCASpatialAttention part of the code), this residual-connection-like operation is not drawn.
So I would like to ask: is this operation simply omitted from the diagram?
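To make explicit what the figure omits, here is a minimal runnable sketch of the same data flow as I read it from the code; nn.Identity() is only a placeholder for the actual MSCAAttention module so the sketch runs standalone:

import torch
from torch import nn

class SpatialAttentionSketch(nn.Module):
    """Minimal sketch of the MSCASpatialAttention forward path."""

    def __init__(self, channels):
        super().__init__()
        self.proj_1 = nn.Conv2d(channels, channels, kernel_size=1)  # 1x1 conv
        self.activation = nn.GELU()
        self.spatial_gating_unit = nn.Identity()  # placeholder for MSCAAttention
        self.proj_2 = nn.Conv2d(channels, channels, kernel_size=1)  # 1x1 conv

    def forward(self, x):
        shortcut = x.clone()
        x = self.proj_1(x)
        x = self.activation(x)
        x = self.spatial_gating_unit(x)
        x = self.proj_2(x)
        return x + shortcut  # the residual add that is not drawn in the figure

x = torch.randn(1, 64, 32, 32)
assert SpatialAttentionSketch(64)(x).shape == x.shape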
In addition, the expansion ratios of the FFN are [8, 8, 4, 4] in the paper, but in the code, mlp_ratio = [4, 4, 4, 4] in the MSCAN class.
Can you explain this, please?
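To make the discrepancy concrete: as I understand it, the ratio only sets the hidden width of each stage's FFN (hidden_dim = embed_dim * mlp_ratio), so the two settings differ only in the first two stages. The embed_dims below are illustrative placeholders, not values quoted from the repo:

def ffn_hidden_dims(embed_dims, mlp_ratios):
    """Hidden width of each stage's FFN: embed_dim * mlp_ratio."""
    return [d * r for d, r in zip(embed_dims, mlp_ratios)]

embed_dims = [64, 128, 256, 512]  # illustrative example, not from the repo
print(ffn_hidden_dims(embed_dims, [8, 8, 4, 4]))  # paper:        [512, 1024, 1024, 2048]
print(ffn_hidden_dims(embed_dims, [4, 4, 4, 4]))  # code default: [256, 512, 1024, 2048]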