Nice diagram, saving it.
I stared at this for a while feeling it didn't match my understanding of BN/LN; it turns out BN is applied in a special way for CNNs...
See https://stackoverflow.com/a/46692217/9601110 for details.
Originally the mean/var is computed over the B dimension alone, but the authors want elements at different spatial locations to be normalized with the same mean/var, which fits the convolutional structure better:
For convolutional layers, we additionally want the normalization to obey the convolutional property – so that different elements of the same feature map, at different locations, are normalized in the same way. To achieve this, we jointly normalize all the activations in a mini-batch, over all locations.
For a CNN, the mean/var is therefore computed over B, H, and W together, so B×H×W acts as one effective mini-batch.
That way you only get C means/vars, not C×H×W of them.
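To make that concrete, here is a minimal PyTorch sketch (the toy shapes and variable names are my own, not from the thread) checking that per-channel statistics computed over the (B, H, W) axes reproduce what nn.BatchNorm2d does in training mode:

```python
import torch

# Claim above: for a (B, C, H, W) activation, BN reduces over B, H, W jointly,
# leaving only C means/vars (one per channel).
B, C, H, W = 8, 3, 5, 5                      # toy sizes, chosen arbitrarily
x = torch.randn(B, C, H, W)

mean = x.mean(dim=(0, 2, 3))                 # shape (C,): one mean per channel
var = x.var(dim=(0, 2, 3), unbiased=False)   # shape (C,): one var per channel

bn = torch.nn.BatchNorm2d(C, affine=False)   # training mode uses batch statistics
manual = (x - mean[None, :, None, None]) / torch.sqrt(var[None, :, None, None] + bn.eps)
print(torch.allclose(bn(x), manual, atol=1e-6))   # expected: True
```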
Nice.
LN works the same way: mean/var is computed over C, H, W, giving B means/vars, with C×H×W acting as the effective hidden size.
IN computes mean/var over H, W, giving B×C means/vars.
GN computes mean/var over (C/G)×H×W within each group, giving B×G means/vars in total. With G=1 it reduces to LN, and with G=C it becomes instance norm.
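The same counting can be shown with plain reductions; a quick sketch (again with made-up toy shapes) just to make the reduction axes and the number of statistics explicit:

```python
import torch

B, C, H, W, G = 8, 6, 5, 5, 3     # toy sizes; G must divide C
x = torch.randn(B, C, H, W)

# LN: reduce over C, H, W            -> B statistics
ln_mean = x.mean(dim=(1, 2, 3))                              # shape (B,)
# IN: reduce over H, W               -> B*C statistics
in_mean = x.mean(dim=(2, 3))                                 # shape (B, C)
# GN: reduce over (C/G), H, W per group -> B*G statistics
gn_mean = x.view(B, G, C // G, H, W).mean(dim=(2, 3, 4))     # shape (B, G)

print(ln_mean.shape, in_mean.shape, gn_mean.shape)
```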
The implementation is also very simple:
https://arxiv.org/abs/1803.08494
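The GN paper does include a few-line code snippet; below is a rough PyTorch equivalent of the same idea (my own sketch, not the paper's code), sanity-checked against torch.nn.functional.group_norm:

```python
import torch
import torch.nn.functional as F

def group_norm(x, G, gamma, beta, eps=1e-5):
    # x: (B, C, H, W); gamma, beta: per-channel affine parameters of shape (C,)
    B, C, H, W = x.shape
    x = x.view(B, G, C // G, H, W)
    mean = x.mean(dim=(2, 3, 4), keepdim=True)                 # one stat per (sample, group)
    var = x.var(dim=(2, 3, 4), keepdim=True, unbiased=False)
    x = (x - mean) / torch.sqrt(var + eps)
    x = x.view(B, C, H, W)
    return x * gamma[None, :, None, None] + beta[None, :, None, None]

# Sanity check against the built-in implementation.
B, C, H, W, G = 2, 8, 4, 4, 4
x = torch.randn(B, C, H, W)
gamma, beta = torch.ones(C), torch.zeros(C)
print(torch.allclose(group_norm(x, G, gamma, beta),
                     F.group_norm(x, G, gamma, beta), atol=1e-6))   # expected: True
```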
from: zihao