hellochick / ICNet-tensorflow

TensorFlow-based implementation of "ICNet for Real-Time Semantic Segmentation on High-Resolution Images".

Meaning behind --update-mean-var --train-beta-gamma #40

Closed — Tamme closed this issue 5 years ago

Tamme commented 6 years ago

Hi.

I haven't encountered this kind of value updating in other projects. Does it originate from PSPNet, is this really how momentum is meant to be used, or is it something else entirely?

Thanks, Tamme

hellochick commented 6 years ago

Hey @Tamme, let me explain the batch normalization layer first. It has four variables: moving_mean, moving_variance, gamma, and beta. moving_mean and moving_variance are not trainable variables, so we need to update them with the update ops collected in tf.GraphKeys.UPDATE_OPS (see the TensorFlow docs). That's why I use the flag --update-mean-var to decide whether to update the mean and variance: updating them works better with a large batch size, so when training with mini-batches we can freeze these two variables for better results.
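To make the distinction concrete, here is a minimal NumPy sketch (not the repo's code) of what the UPDATE_OPS effectively do for the two non-trainable statistics. The momentum value of 0.9 is an assumption for illustration; gamma and beta, by contrast, are updated by the optimizer (which is what --train-beta-gamma controls).

```python
import numpy as np

def update_moving_stats(moving_mean, moving_var, batch, momentum=0.9):
    """One training-step update of BN's non-trainable statistics
    via an exponential moving average over batch statistics."""
    batch_mean = batch.mean(axis=0)
    batch_var = batch.var(axis=0)
    new_mean = momentum * moving_mean + (1.0 - momentum) * batch_mean
    new_var = momentum * moving_var + (1.0 - momentum) * batch_var
    return new_mean, new_var

# Starting from the TF defaults (mean 0, variance 1), one update with a
# batch whose per-feature mean is [3, 0] nudges the stats toward it:
m, v = update_moving_stats(np.zeros(2), np.ones(2),
                           np.array([[2.0, 0.0], [4.0, 0.0]]))
# m -> [0.3, 0.0], v -> [1.0, 0.9]
```

With a small batch, batch_mean and batch_var are noisy estimates, which is why freezing these updates (omitting --update-mean-var) can be preferable when fine-tuning with mini-batches.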

manuel-88 commented 6 years ago

Hey, when I run the training without --update-mean-var, the evaluation results are almost zero. Do you know why, @hellochick?

hellochick commented 6 years ago

@manuel-88, if you never update the mean and variance, the batch normalization layer keeps its initial statistics, so at inference time it normalizes with values that don't match your data. That may be the problem.
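A quick sketch of why stale statistics hurt at evaluation time. Inference-time batch norm normalizes with the stored moving statistics; the activation values below are hypothetical, chosen so the true mean is far from the default of zero.

```python
import numpy as np

def bn_inference(x, moving_mean, moving_var, gamma=1.0, beta=0.0, eps=1e-5):
    """Inference-time batch norm: normalize with stored moving statistics."""
    return gamma * (x - moving_mean) / np.sqrt(moving_var + eps) + beta

# Hypothetical activations whose true mean (~100) differs from the default.
x = np.array([100.0, 102.0, 98.0])

# Never-updated defaults (mean 0, variance 1): output stays near 100,
# nothing like the zero-mean, unit-variance input later layers expect.
stale = bn_inference(x, moving_mean=0.0, moving_var=1.0)

# Statistics that were actually tracked during training: output is
# roughly zero-mean and unit-variance, as intended.
tracked = bn_inference(x, moving_mean=100.0, moving_var=x.var())
```

This mismatch between training-time and inference-time statistics is a plausible explanation for evaluation scores near zero when the moving statistics are never updated from their initial values.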