@lzhaozi
Sorry for late response.
I just followed the description in the batch normalization paper.
As far as I remember, it says the mean subtraction in the batch norm layer can absorb the bias, so we don't need to apply a bias twice at each layer (?)
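For reference, here is a small NumPy sketch (my own illustration, not code from this repository) of why a constant convolution bias cancels under the mean subtraction, so `bias_term` becomes redundant once BatchNorm (and a Scale layer) follows the convolution:

```python
import numpy as np

rng = np.random.default_rng(0)
wx = rng.normal(size=(32, 64))   # toy pre-bias conv outputs for one batch
b = 0.2                          # constant bias, like Caffe's bias_filler value 0.2
eps = 1e-5

def batch_norm(y):
    # Normalize each channel over the batch: (y - mean) / sqrt(var + eps)
    return (y - y.mean(axis=0)) / np.sqrt(y.var(axis=0) + eps)

out_with_bias = batch_norm(wx + b)   # conv with bias, then BN
out_without_bias = batch_norm(wx)    # conv without bias, then BN

# Identical, because (wx + b) - mean(wx + b) == wx - mean(wx):
print(np.allclose(out_with_bias, out_without_bias))  # True
```

Any learned shift the bias would have provided is re-introduced by the beta parameter of the Scale layer after BatchNorm, so nothing is lost by dropping the conv bias.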
Thank you very much for your help.
I went back and checked the paper today, and it does describe it as you said; I hadn't noticed that before.
@lzhaozi
Great!
Jaehyun
Hi,
I'm very grateful for your batch normalization implementation of GoogleNet, but as a beginner there is something I don't understand.
I see that the BVLC/Caffe GoogleNet implementation uses a bias filler with constant 0.2 in every convolution layer, but you drop the bias term. Why is it necessary to drop the bias term? Is it because using a bias filler gives worse results when batch normalization is applied?
I'm very thankful for your help.