facebookresearch / ConvNeXt

Code release for ConvNeXt model
MIT License

About BN replace LN #95

Open kings-rgb opened 2 years ago

kings-rgb commented 2 years ago

In a standard ResNet, BN works better than LN; we think the reason is that the two normalize over different dimensions. So why is LN better than BN in ConvNeXt?

liuzhuang13 commented 2 years ago

Hi,

In our case, LN is only slightly better than BN on ImageNet-1K classification, and this was based on an intermediate model, not the final one. We don't know why, and we are not sure how the comparison would turn out on downstream tasks. This could be an interesting problem for future work.

In ResNets, BN does indeed seem better according to some prior work, but my understanding is that LN there was also implemented differently from ours (we normalize per spatial location). This could also be a factor.
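
To make the "different dimensions" point concrete, here is a minimal NumPy sketch of which axes each scheme averages over for an `(N, C, H, W)` activation. The function names are hypothetical and the learnable scale and shift are left out. The whole-sample variant is included only as one common alternative way to apply LN to feature maps; that is an assumption, since the thread does not say which variant the prior work used.

```python
import numpy as np


def layer_norm_per_location(x, eps=1e-6):
    """LN over channels only, separately at each (n, h, w) position.

    This matches the "per spatial location" description in the reply above.
    x has shape (N, C, H, W).
    """
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def layer_norm_whole_sample(x, eps=1e-6):
    """LN over (C, H, W) jointly, one set of statistics per sample.

    Assumed alternative formulation, shown for contrast only.
    """
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def batch_norm_train(x, eps=1e-5):
    """BN in training mode: one set of statistics per channel over (N, H, W)."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(loc=3.0, scale=2.0, size=(2, 8, 4, 4))
    for fn in (layer_norm_per_location, layer_norm_whole_sample, batch_norm_train):
        y = fn(x)
        # Largest channel-wise mean left at any single spatial location.
        print(fn.__name__, float(np.abs(y.mean(axis=1)).max()))
```

Only the per-location variant drives the channel mean to zero at every pixel. The other two share statistics across spatial positions, so individual locations can keep a nonzero mean.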