Hello
Why can't we merge (gen_merged_model.py) all BatchNorm layers during training? All BN layers are constant
batch_norm_param { use_global_stats: true }
and so are some Scale layers in the first two blocks (up to the conv3 block).
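For context, here is a minimal sketch of the algebra that a BN-merging script like gen_merged_model.py performs: with `use_global_stats: true` the BN (and Scale) layers are fixed affine transforms, so they can be folded into the preceding convolution's weights and bias. The function name, shapes, and `eps` default below are my assumptions, not the script's actual API.

```python
import numpy as np

def fold_bn_into_conv(W, b, mean, var, gamma, beta, eps=1e-5):
    """Fold a frozen BatchNorm + Scale pair into the preceding conv.

    W:     conv weights, shape (out_ch, in_ch, kh, kw)  -- assumed layout
    b:     conv bias, shape (out_ch,)
    mean:  BN running mean (valid only with use_global_stats: true)
    var:   BN running variance
    gamma: Scale layer multiplier
    beta:  Scale layer bias
    """
    std = np.sqrt(var + eps)
    scale = gamma / std                         # per-output-channel factor
    W_folded = W * scale[:, None, None, None]   # scale each output filter
    b_folded = (b - mean) * scale + beta        # shift the bias accordingly
    return W_folded, b_folded
```

This folding is only exact when the BN statistics are frozen; during training with per-batch statistics the transform changes every iteration, which is presumably why the merge is restricted to inference-mode (constant) BN layers.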
Additional questions:
-- Does training work better with lr_mult: 0.1 on the feature-extraction layers?
-- Why is BN on the FC layers constant as well (use_global_stats: true)? Maybe it would be better to let BN adapt to the batch (size 256) there?