jrao1 closed this issue 7 years ago
Hi, yes, you're right. As we report in the paper, we observed better performance when using batch statistics at inference time rather than moving averages. That is why this code disables the computation of the moving averages.
Is there a reason batch_norm_update_averages=False and batch_norm_use_averages=False are used in train.py and test.py? This does not look like the usual way to use BN.
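For anyone landing on this thread later, here is a minimal sketch of what the two flags control. It is not the repository's code: `batch_norm_1d` is a hypothetical helper in plain Python, the `alpha` update rate and `eps` values are assumptions, and it only mimics the semantics the flag names suggest (which statistics normalize the batch, and whether the moving averages are updated).

```python
import math


def batch_norm_1d(batch, running_mean, running_var,
                  use_averages, update_averages,
                  alpha=0.1, eps=1e-5):
    """Normalize one feature across a batch (hypothetical helper, not library code)."""
    n = len(batch)
    batch_mean = sum(batch) / n
    # Biased (population) variance of the current batch.
    batch_var = sum((v - batch_mean) ** 2 for v in batch) / n

    # Choose which statistics normalize the activations.
    if use_averages:
        mean, var = running_mean, running_var
    else:
        mean, var = batch_mean, batch_var

    # Optionally fold the current batch into the moving averages.
    if update_averages:
        running_mean = (1 - alpha) * running_mean + alpha * batch_mean
        running_var = (1 - alpha) * running_var + alpha * batch_var

    normalized = [(v - mean) / math.sqrt(var + eps) for v in batch]
    return normalized, running_mean, running_var


batch = [1.0, 2.0, 3.0, 4.0]

# Setting discussed in this thread: batch statistics, moving averages untouched.
out, rm, rv = batch_norm_1d(batch, 0.0, 1.0,
                            use_averages=False, update_averages=False)

# Conventional inference: normalize with the stored moving averages.
out_avg, _, _ = batch_norm_1d(batch, 0.0, 1.0,
                              use_averages=True, update_averages=False)
```

With both flags set to False, the output is centered on the current batch and the stored averages never change, which matches the behavior described in the reply. One practical consequence of this choice is that predictions at test time depend on the composition of the test batch.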