Question on how to train BN stably

sanghoon / pva-faster-rcnn

Demo code for PVANet

https://arxiv.org/abs/1611.08588

Question on how to train BN stably #2

Closed hengck23 closed 7 years ago

hengck23 commented 7 years ago

Hi,

The original Faster R-CNN trains with a batch size of 1 or 2. But a BN (batch normalization) layer needs sufficient statistics to estimate the mean and variance correctly. How did you solve this problem?

Thank you very much.

sanghoon commented 7 years ago

Hi @hengck23,

We updated the BN layers only during the pre-training step. In Faster R-CNN training, BN layers work just like scale-bias layers with fixed params.
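To make the "scale-bias layer with fixed params" point concrete, here is a minimal sketch in plain Python. It is not taken from the PVANet code. It uses the standard BN inference formula, and all the numbers (`mean`, `var`, `gamma`, `beta`, `x`) are made-up values for illustration. Once the statistics are frozen, BN folds into one multiply and one add per channel, so the output no longer depends on the batch size.

```python
import math

def fold_frozen_bn(mean, var, gamma, beta, eps=1e-5):
    """Fold fixed BN statistics into a per-channel scale and bias."""
    scale = [g / math.sqrt(v + eps) for g, v in zip(gamma, var)]
    bias = [b - s * m for b, s, m in zip(beta, scale, mean)]
    return scale, bias

def bn_fixed_stats(x, mean, var, gamma, beta, eps=1e-5):
    """BN with fixed statistics for one sample (one value per channel)."""
    return [g * (xi - m) / math.sqrt(v + eps) + b
            for xi, m, v, g, b in zip(x, mean, var, gamma, beta)]

def scale_bias(x, scale, bias):
    """Plain per-channel affine layer: y = scale * x + bias."""
    return [s * xi + b for xi, s, b in zip(x, scale, bias)]

# Hypothetical per-channel values, standing in for what pre-training produced.
mean = [0.5, -1.0, 2.0]
var = [1.0, 4.0, 0.25]
gamma = [1.0, 0.5, 2.0]
beta = [0.0, 0.1, -0.3]

x = [0.2, 3.0, 1.5]  # a single sample, i.e. batch size 1
scale, bias = fold_frozen_bn(mean, var, gamma, beta)
y_bn = bn_fixed_stats(x, mean, var, gamma, beta)
y_affine = scale_bias(x, scale, bias)

# The two outputs agree up to floating-point rounding.
assert all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
           for a, b in zip(y_bn, y_affine))
```

Because no batch statistics are computed in this mode, a batch size of 1 or 2 does not affect the normalization.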

hengck23 commented 7 years ago

@sanghoon That is very helpful. Thanks!

hengck23 commented 7 years ago

@sanghoon

There is a new paper on batch norm for small batch sizes (as small as batch=4). This may help: https://arxiv.org/abs/1702.03275

Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models, Sergey Ioffe (submitted on 10 Feb 2017)
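For readers who have not seen the paper, here is a rough single-channel sketch of the normalization step it proposes. It is a simplified reading of the paper, not a reference implementation. The function and parameter names are made up for illustration, and it leaves out the learned scale and shift, the moving-average updates, and the rule that `r` and `d` are treated as constants during backpropagation.

```python
import math

def clip(value, low, high):
    """Clamp value into the interval [low, high]."""
    return max(low, min(high, value))

def batch_renorm_forward(batch, moving_mean, moving_std, r_max, d_max, eps=1e-5):
    """Normalize one channel of a minibatch with the Batch Renormalization correction."""
    n = len(batch)
    mu_b = sum(batch) / n                              # minibatch mean
    var_b = sum((x - mu_b) ** 2 for x in batch) / n    # minibatch variance
    sigma_b = math.sqrt(var_b + eps)
    # Correction terms that tie the minibatch statistics to the moving averages.
    r = clip(sigma_b / moving_std, 1.0 / r_max, r_max)
    d = clip((mu_b - moving_mean) / moving_std, -d_max, d_max)
    return [(x - mu_b) / sigma_b * r + d for x in batch]

batch = [1.0, 2.0, 4.0, 5.0]  # a small batch of 4, made-up values

# With r_max = 1 and d_max = 0 the correction vanishes: ordinary batch norm.
plain_bn = batch_renorm_forward(batch, moving_mean=2.5, moving_std=2.0,
                                r_max=1.0, d_max=0.0)

# With loose limits the result equals normalizing by the moving statistics.
renormed = batch_renorm_forward(batch, moving_mean=2.5, moving_std=2.0,
                                r_max=3.0, d_max=5.0)
```

When `r` and `d` are not clipped, the output equals `(x - moving_mean) / moving_std`, so training sees the same normalization that inference uses, even though the minibatch itself is small.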