Hello
Why can't we merge (gen_merged_model.py) all BatchNorm layers during training? All BN layers are constant
batch_norm_param { use_global_stats: true }
and so are some Scale layers in the first two blocks (up to the conv3 block).
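For context, here is a minimal sketch of the algebra that a BN-merging script like gen_merged_model.py performs: with `use_global_stats: true` the BN (and Scale) layers are fixed affine transforms, so they can be folded into the preceding convolution's weights and bias. The function name, shapes, and `eps` default below are my assumptions, not the script's actual API.

```python
import numpy as np

def fold_bn_into_conv(W, b, mean, var, gamma, beta, eps=1e-5):
    """Fold a frozen BatchNorm + Scale pair into the preceding conv.

    W:     conv weights, shape (out_ch, in_ch, kh, kw)  -- assumed layout
    b:     conv bias, shape (out_ch,)
    mean:  BN running mean (valid only with use_global_stats: true)
    var:   BN running variance
    gamma: Scale layer multiplier
    beta:  Scale layer bias
    """
    std = np.sqrt(var + eps)
    scale = gamma / std                         # per-output-channel factor
    W_folded = W * scale[:, None, None, None]   # scale each output filter
    b_folded = (b - mean) * scale + beta        # shift the bias accordingly
    return W_folded, b_folded
```

This folding is only exact when the BN statistics are frozen; during training with per-batch statistics the transform changes every iteration, which is presumably why the merge is restricted to inference-mode (constant) BN layers.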
Additional questions:
-- Does training work better with lr_mult: 0.1 on the feature-extraction layers?
-- Why is BN on the FC layers constant as well (use_global_stats: true)? Maybe it would be better to let BN adapt to the batch (size 256) there?