Open panda1949 opened 4 years ago
We use Gaussian initialization with std=0.01. I simply replaced ReLU with FReLU, and it showed a slight improvement (0.1~0.3). Note that MobileNetV3 is a NAS-searched CNN architecture. Once you change the architecture (FReLU adds a depthwise conv), you might need to run the search again on the new architecture to reach the optimal result.
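For anyone following along, here is a minimal PyTorch sketch of an FReLU layer with the initialization mentioned above. It is not the code from this repository: the class name, the argument names, the 3x3 kernel, and the BatchNorm placed after the depthwise conv are assumptions for illustration. Only the extra depthwise conv and the Gaussian init with std=0.01 come from this thread.

```python
import torch
import torch.nn as nn


class FReLU(nn.Module):
    """Sketch of a funnel activation: y = max(x, T(x)).

    T(x) is assumed here to be a depthwise conv followed by BatchNorm.
    """

    def __init__(self, channels, kernel_size=3, init_std=0.01):
        super().__init__()
        # Depthwise conv: groups == channels, so each channel has its own filter.
        self.conv = nn.Conv2d(
            channels,
            channels,
            kernel_size,
            padding=kernel_size // 2,  # keeps the spatial size unchanged
            groups=channels,
            bias=False,
        )
        self.bn = nn.BatchNorm2d(channels)
        # Gaussian initialization of the depthwise filters (std from the reply above).
        nn.init.normal_(self.conv.weight, mean=0.0, std=init_std)

    def forward(self, x):
        # Element-wise max between the input and its spatial condition T(x).
        return torch.max(x, self.bn(self.conv(x)))


if __name__ == "__main__":
    act = FReLU(channels=16)
    out = act(torch.randn(4, 16, 32, 32))
    print(out.shape)
```

Unlike ReLU or hswish, this layer needs the channel count of the tensor it is applied to, so swapping it into an existing network means passing the right `channels` value at every replaced activation.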
Thanks for your quick reply. I'll try your suggestions.
I reimplemented FReLU in PyTorch and applied it to MobileNetV3 by replacing all the hswish activations with FReLU. The ImageNet accuracy is as follows:
My code:
Am I missing something important? As for the Gaussian initialization in FReLU, what std do you use?