megvii-model / FunnelAct

MIT License

Apply FReLU to MobileNetV3 #4

Open panda1949 opened 4 years ago

panda1949 commented 4 years ago

I reimplemented FReLU in PyTorch and applied it to MobileNetV3 by replacing every hswish with frelu. The ImageNet accuracy is as follows:

| model | top1 |
| --- | --- |
| mobilenetv3+hswish | 75.2% |
| mobilenetv3+frelu | 74.8% |

My code:

import torch
import torch.nn as nn

class FReLU(nn.Module):
    def __init__(self, in_channels, inplace: bool = False):
        super().__init__()
        self.inplace = inplace  # stored but not used in forward()
        # Depthwise 3x3 conv (stride 1, padding 1): one spatial filter per channel
        self.conv_frelu = nn.Conv2d(in_channels, in_channels, 3, 1, 1, groups=in_channels, bias=False)
        self.bn_frelu = nn.BatchNorm2d(in_channels)

    def forward(self, x):
        # y = max(x, T(x)), where T(x) = BN(depthwise_conv(x))
        x1 = self.conv_frelu(x)
        x1 = self.bn_frelu(x1)
        return torch.max(x, x1)

Am I missing something important? As for the Gaussian initialization in FReLU, what std do you use?
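For reference, the swap described above can be sketched roughly as follows. This is only an illustration on a toy `nn.Sequential`, not the actual MobileNetV3 code: the helper name `replace_hardswish` is hypothetical, and it assumes each `nn.Hardswish` directly follows a `Conv2d`/`BatchNorm2d` whose channel count can be read off in registration order. Any `Hardswish` after a `Linear` layer is left untouched, since FReLU needs a 4D feature map.

```python
import torch
import torch.nn as nn


class FReLU(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        self.conv_frelu = nn.Conv2d(in_channels, in_channels, 3, 1, 1,
                                    groups=in_channels, bias=False)
        self.bn_frelu = nn.BatchNorm2d(in_channels)

    def forward(self, x):
        return torch.max(x, self.bn_frelu(self.conv_frelu(x)))


def replace_hardswish(module, channels=None):
    """Hypothetical helper: swap nn.Hardswish for FReLU in place.

    Tracks the channel count of the most recent Conv2d/BatchNorm2d seen in
    registration order (assumed to match execution order) and returns it.
    """
    for name, child in list(module.named_children()):
        if isinstance(child, nn.Conv2d):
            channels = child.out_channels
        elif isinstance(child, nn.BatchNorm2d):
            channels = child.num_features
        elif isinstance(child, nn.Linear):
            channels = None  # no spatial map after this point; skip swaps
        elif isinstance(child, nn.Hardswish):
            if channels is not None:
                setattr(module, name, FReLU(channels))
        else:
            channels = replace_hardswish(child, channels)
    return channels


# Toy stand-in for a conv-bn-hswish stack (not the real MobileNetV3).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1, bias=False),
    nn.BatchNorm2d(16),
    nn.Hardswish(),
    nn.Conv2d(16, 32, 1, bias=False),
    nn.BatchNorm2d(32),
    nn.Hardswish(),
)
replace_hardswish(model)
out = model(torch.randn(2, 3, 8, 8))
```

If a real model applies its modules in a different order than they are registered, the channel tracking here would be wrong, and the activation layers would have to be constructed with explicit channel counts instead.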

nmaac commented 4 years ago

We use Gaussian initialization with std=0.01. I simply replaced relu with frelu, and it showed a slight improvement (0.1~0.3). Note that MobileNetV3 is a NAS-searched architecture, so it is optimal only for the search space it was found in. Once you change the architecture (frelu adds an extra dw-conv), you might need to search again on the new architecture to achieve the optimal result.
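A minimal sketch of what that initialization could look like in PyTorch, applied to the `FReLU` module from the question. The reply only gives std=0.01, so two things here are assumptions: that the mean is zero, and that only the depthwise conv weights are initialized this way (BatchNorm keeps its defaults). The helper name `init_frelu_gaussian` is made up for illustration.

```python
import torch
import torch.nn as nn


class FReLU(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        self.conv_frelu = nn.Conv2d(in_channels, in_channels, 3, 1, 1,
                                    groups=in_channels, bias=False)
        self.bn_frelu = nn.BatchNorm2d(in_channels)

    def forward(self, x):
        return torch.max(x, self.bn_frelu(self.conv_frelu(x)))


def init_frelu_gaussian(model, std=0.01):
    """Hypothetical helper: N(0, std^2) init for every FReLU depthwise conv.

    Zero mean and conv-weights-only are assumptions, not stated in the thread.
    """
    for m in model.modules():
        if isinstance(m, FReLU):
            nn.init.normal_(m.conv_frelu.weight, mean=0.0, std=std)


torch.manual_seed(0)
act = FReLU(64)
init_frelu_gaussian(act)

# Extra learnable parameters per FReLU layer with C channels:
# C * 3 * 3 for the depthwise conv plus 2 * C for BatchNorm (weight and bias).
extra_params = sum(p.numel() for p in act.parameters())
```

For 64 channels this comes to 64 * 9 + 2 * 64 = 704 extra parameters per activation, which is the additional dw-conv cost mentioned above.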

panda1949 commented 4 years ago

Thanks for your quick reply. I'll try your suggestions.