Yangr116 closed this issue 2 years ago
Hi @Yangr116, this is a small mistake in the explanation in the PyTorch GroupNorm API docs. I have reported it to PyTorch. This issue is a duplicate of issue #9, where I have given some explanation. I hope it helps you understand it.
Duplicate of #9
Thanks for your quick reply!
Thanks for your good work in the CV area. I have some questions about GroupNorm and LayerNorm. GroupNorm with num_groups = 1 is equivalent to LayerNorm. Why does GroupNorm outperform LayerNorm in your ablation study (Table 6)?
A simple example from https://pytorch.org/docs/stable/generated/torch.nn.GroupNorm.html:

```python
import torch
import torch.nn as nn

input = torch.randn(20, 6, 10, 10)
# Separate 6 channels into 3 groups
m = nn.GroupNorm(3, 6)
# Separate 6 channels into 6 groups (equivalent with InstanceNorm)
m = nn.GroupNorm(6, 6)
# Put all 6 channels into a single group (equivalent with LayerNorm)
m = nn.GroupNorm(1, 6)
# Activating the module
output = m(input)
```
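The "equivalent with LayerNorm" comment in the docs example can be checked numerically. The following is only a minimal sketch, not the comparison made in the paper. It assumes "LayerNorm" could mean one of two things: `nn.LayerNorm` over the whole `(C, H, W)` shape, or `nn.LayerNorm` over the channel dimension only, applied in a channels-last layout. The variable names (`gn`, `ln_chw`, `ln_c`, `same_as_chw`, `same_as_c`) are made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(20, 6, 10, 10)  # (N, C, H, W), same shape as the docs example

# GroupNorm with a single group: mean/variance over (C, H, W) per sample,
# with one learnable scale and shift per channel.
gn = nn.GroupNorm(1, 6)
out_gn = gn(x)

# Assumption 1: LayerNorm over (C, H, W). Same statistics as above,
# but one learnable scale and shift per element.
ln_chw = nn.LayerNorm([6, 10, 10])
out_ln_chw = ln_chw(x)

# Assumption 2: LayerNorm over channels only. Statistics are computed over C
# at each spatial position, so the input is permuted to channels-last first.
ln_c = nn.LayerNorm(6)
out_ln_c = ln_c(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)

same_as_chw = torch.allclose(out_gn, out_ln_chw, atol=1e-4)
same_as_c = torch.allclose(out_gn, out_ln_c, atol=1e-4)
print("GroupNorm(1, C) vs LayerNorm([C, H, W]):", same_as_chw)
print("GroupNorm(1, C) vs LayerNorm(C):        ", same_as_c)
print("affine shapes:", gn.weight.shape, ln_chw.weight.shape, ln_c.weight.shape)
```

With freshly initialized modules (scale 1, shift 0), the first comparison should match and the second should not. Even in the matching case the affine parameters have different shapes, so the two layers can diverge once they are trained.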
Looking forward to your reply!