digantamisra98 / Mish

Official Repository for "Mish: A Self Regularized Non-Monotonic Neural Activation Function" [BMVC 2020]
https://www.bmvc2020-conference.com/assets/papers/0928.pdf
MIT License

The result is not good (Fixed, improved mAP) #19

Closed · DrewdropLife closed this issue 4 years ago

DrewdropLife commented 4 years ago

Hi, I tried to use Mish instead of ReLU in Mask R-CNN, including the ResNet backbone and the three heads, but in the end the AP dropped by 0.3. What could be the problem? Thank you!

digantamisra98 commented 4 years ago

@changxinC Hi. Thanks for raising the issue. Can you please share the code to reproduce the results?

DrewdropLife commented 4 years ago

Thank you for your reply!

import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    """Mish activation: x * tanh(softplus(x))."""
    def __init__(self):
        super(Mish, self).__init__()

    def forward(self, x):
        return x * torch.tanh(F.softplus(x))

def mish(x):
    # Functional wrapper (note: constructs a new Mish module on every call).
    return Mish()(x)

I used the above code to replace all of the F.relu_() and nn.LeakyReLU(0.2, inplace=True) activations.
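Roughly, the module-level part of that swap could be expressed as the sketch below; the replace_activations helper is just illustrative (not from this repo), and functional calls such as F.relu_() inside forward() are not caught by it and still had to be edited by hand.

def replace_activations(module):
    # Illustrative helper: recursively swap nn.ReLU / nn.LeakyReLU submodules
    # for the Mish module defined above. Only activations registered as
    # submodules are replaced; functional calls require manual edits.
    for name, child in module.named_children():
        if isinstance(child, (nn.ReLU, nn.LeakyReLU)):
            setattr(module, name, Mish())
        else:
            replace_activations(child)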

digantamisra98 commented 4 years ago

@changxinC Can you please post the complete program along with the data loaders and the architecture?

DrewdropLife commented 4 years ago

I found the problem. It was likely that I took a backbone pretrained with ReLU and simply swapped its ReLU for Mish, instead of re-initializing the backbone after the replacement. Now I have left the backbone unchanged and only changed the ReLU in the other places to Mish, and the AP improved by 0.3. Could I ask whether you have a ResNet-50 model pretrained with Mish? Thank you very much!
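For anyone hitting the same thing, the setup that works now looks roughly like the sketch below, assuming a torchvision Mask R-CNN and the illustrative replace_activations helper from the earlier comment (the actual training code differs):

from torchvision.models.detection import maskrcnn_resnet50_fpn

# The ResNet-50 backbone stays exactly as pretrained (with ReLU).
model = maskrcnn_resnet50_fpn(pretrained=True)

# Swap activations to Mish only outside the backbone (RPN, box/mask heads).
# Activations implemented as functional calls still need manual edits.
for name, child in model.named_children():
    if name != "backbone":
        replace_activations(child)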

digantamisra98 commented 4 years ago

@changxinC Ah, got it. Thanks for the response. Glad to know that the AP improved. I am working on ImageNet training right now, so I will upload a Mish-pretrained ResNet-50 model soon. It will take some time, though. Closing this issue since it seems to be resolved.