Hello,

In the `_update_bn_statsgpu` function:

```python
workspace.FeedBlob(
    'gpu{}/'.format(i) + bn_layer + '_bn_rm',
    np.array(self._meanX_dict[bn_layer], dtype=np.float32),
)
```

the mean activation (`meanX`) is computed over 200 × batch_size × num_gpu training samples and then written into the `bn_layer + '_bn_rm'` blob, overwriting the running mean stored there. So why not just use the running mean accumulated during training? Why is the mean computed during the COMPUTE_PRECISE_BN pass more precise?
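For context, the distinction the question hinges on can be sketched as follows. During training, BatchNorm's running mean is an exponential moving average, so recent batches dominate and it is accumulated while the weights are still changing; the precise-BN pass instead averages a fixed set of batches with equal weight under frozen weights. This is a generic illustration with synthetic per-batch means, not the actual code from this repo:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-batch activation means for one BN layer,
# drawn around a true mean of 2.0.
batch_means = rng.normal(loc=2.0, scale=0.5, size=200)

# Running mean as BatchNorm maintains it during training: an
# exponential moving average, so only the last ~1/(1-momentum)
# batches effectively contribute.
momentum = 0.9
ema = 0.0
for m in batch_means:
    ema = momentum * ema + (1.0 - momentum) * m

# "Precise" estimate: a plain average over the same 200 batches,
# weighting every batch equally (what feeding meanX into the
# '_bn_rm' blob amounts to).
precise = float(np.mean(batch_means))

print('EMA running mean:', ema)
print('Equal-weight mean:', precise)
```

The equal-weight average uses all 200 batches, so its variance shrinks with the full sample count, while the EMA's effective sample size stays roughly constant regardless of how long training runs.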