jasperzhong / read-papers-and-code

My paper/code reading notes in Chinese

ECCV '18 | Group Normalization #72

jasperzhong closed this issue 4 years ago

jasperzhong commented 4 years ago

https://arxiv.org/abs/1803.08494

from: zihao

jasperzhong commented 4 years ago

[image] Nice figure, saving it.

jasperzhong commented 4 years ago

I stared at this for a long time wondering why it didn't match my understanding of BN/LN. It turns out BN has a special modification for CNNs...

See https://stackoverflow.com/a/46692217/9601110 for details.

Originally (for fully-connected layers), BN computes one mean/var per feature, reducing over the B dimension alone.

But the authors want elements at different spatial locations to be normalized with the same mean/var, which better matches the nature of a CNN:

For convolutional layers, we additionally want the normalization to obey the convolutional property – so that different elements of the same feature map, at different locations, are normalized in the same way. To achieve this, we jointly normalize all the activations in a mini-batch, over all locations.

So for CNNs, BN computes mean/var over (B, H, W), merging B, H, and W into one effective mini-batch.

This way there are actually only C mean/var pairs, not C×H×W of them.

Impressive.
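
A quick NumPy sketch of those BN statistics for a conv feature map (the sizes here are made up purely for illustration):

```python
import numpy as np

# Toy activations in (B, C, H, W) layout
B, C, H, W = 4, 8, 16, 16
x = np.random.randn(B, C, H, W)

# BN for CNNs: reduce over batch AND spatial dims (B, H, W),
# so every location in a feature map shares the same statistics.
mean = x.mean(axis=(0, 2, 3))  # shape (C,)
var = x.var(axis=(0, 2, 3))    # shape (C,)
print(mean.shape)              # (8,) -> only C statistics, not C*H*W
```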

jasperzhong commented 4 years ago

LN is analogous: compute mean/var over (C, H, W), giving B mean/var pairs; C×H×W is merged into one effective hidden size.

IN computes mean/var over (H, W), giving B×C mean/var pairs.

GN computes mean/var over (C/G, H, W) within each group, giving B×G mean/var pairs in total (G is the number of groups, so each group has C/G channels). With G=1 it reduces to LN; with G=C it becomes Instance Norm. See the sketch below.
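
The same axis bookkeeping for LN/IN/GN, again as a toy NumPy sketch (B, C, H, W, G are arbitrary illustrative sizes):

```python
import numpy as np

B, C, H, W, G = 4, 8, 16, 16, 2  # G must divide C
x = np.random.randn(B, C, H, W)

# LN: reduce over (C, H, W) -> B statistics
ln_mean = x.mean(axis=(1, 2, 3))   # shape (B,)

# IN: reduce over (H, W) -> B*C statistics
in_mean = x.mean(axis=(2, 3))      # shape (B, C)

# GN: split channels into G groups, reduce over (C/G, H, W)
xg = x.reshape(B, G, C // G, H, W)
gn_mean = xg.mean(axis=(2, 3, 4))  # shape (B, G)
# G=1 -> shape (B,), same as LN; G=C -> shape (B, C), same as IN
```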

The implementation is also very simple:

[image: implementation snippet]
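
In case the image doesn't load, here is a minimal NumPy sketch of the GN forward pass following the grouping logic above; the function and variable names are mine, not from the paper:

```python
import numpy as np

def group_norm(x, gamma, beta, G, eps=1e-5):
    """Group Normalization forward pass (illustrative sketch).

    x:           activations, shape (N, C, H, W)
    gamma, beta: learnable scale/shift, shape (1, C, 1, 1)
    G:           number of groups (must divide C)
    """
    N, C, H, W = x.shape
    x = x.reshape(N, G, C // G, H, W)
    # One mean/var per (sample, group): reduce over (C/G, H, W)
    mean = x.mean(axis=(2, 3, 4), keepdims=True)
    var = x.var(axis=(2, 3, 4), keepdims=True)
    x = (x - mean) / np.sqrt(var + eps)
    x = x.reshape(N, C, H, W)
    return x * gamma + beta

# Usage with toy shapes
x = np.random.randn(2, 8, 4, 4)
gamma = np.ones((1, 8, 1, 1))
beta = np.zeros((1, 8, 1, 1))
y = group_norm(x, gamma, beta, G=2)
print(y.shape)  # (2, 8, 4, 4)
```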