isht7 / pytorch-deeplab-resnet

DeepLab resnet v2 model in pytorch

MIT License

602 stars, 116 forks

BatchNorm usage #15

Closed Eniac-Xie closed 7 years ago

Eniac-Xie commented 7 years ago

Hi, the parameters of the BatchNorm layers in resnet101 are fixed here.

But running_mean and running_var also need to be fixed, so I think we need to set BatchNorm to eval mode, not just freeze the parameters (weight and bias).
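A minimal sketch of the concern raised here, assuming a standalone `nn.BatchNorm2d` layer rather than the repository's resnet101 (the layer size and input shape are made up for illustration). It checks whether the running statistics still move when only `weight` and `bias` are frozen:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for one BatchNorm layer of the network.
bn = nn.BatchNorm2d(3)

# Freeze only the learnable affine parameters (weight = gamma, bias = beta).
for p in bn.parameters():
    p.requires_grad = False

# A batch whose statistics differ from the initial running stats (mean 0, var 1).
x = torch.randn(8, 3, 4, 4) * 5 + 2

# Train mode: the forward pass updates running_mean / running_var,
# even though no parameter requires a gradient.
bn.train()
before = bn.running_mean.clone()
bn(x)
changed_in_train = not torch.equal(before, bn.running_mean)

# Eval mode: the forward pass uses the stored statistics and leaves them alone.
bn.eval()
before = bn.running_mean.clone()
bn(x)
changed_in_eval = not torch.equal(before, bn.running_mean)

print(changed_in_train, changed_in_eval)  # True False
```

The running statistics are buffers, not parameters, so `requires_grad` has no effect on them; only the layer's train/eval mode does.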

isht7 commented 7 years ago

The model is trained in eval mode as can be seen here.

Eniac-Xie commented 7 years ago

Thank you for the reminder. So is the code here redundant?

isht7 commented 7 years ago

No, it is not redundant. model.eval() keeps the running mean and running variance fixed. Setting requires_grad = False freezes the gamma and beta parameters (weight and bias) of the batch-norm layers. Read more about batchnorm here.
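The two steps can be combined in one helper. This is only a sketch under stated assumptions: `freeze_batchnorm` is a hypothetical name, not a function from this repository, and the tiny `nn.Sequential` model stands in for the real network.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def freeze_batchnorm(model: nn.Module) -> None:
    """Hypothetical helper: fully freeze every BatchNorm layer in `model`."""
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.eval()                      # keep running_mean / running_var fixed
            for p in m.parameters():
                p.requires_grad = False   # keep gamma (weight) and beta (bias) fixed

# Toy model standing in for the real network (an assumption for illustration).
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
model.train()
freeze_batchnorm(model)

conv, bn = model[0], model[1]
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.1
)

stats_before = bn.running_mean.clone()
gamma_before = bn.weight.clone()
conv_before = conv.weight.clone()

# One training step on random data.
x = torch.randn(4, 3, 8, 8)
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()

print(torch.equal(stats_before, bn.running_mean))  # running stats untouched
print(torch.equal(gamma_before, bn.weight))        # gamma untouched
print(torch.equal(conv_before, conv.weight))       # conv weights were updated
```

Note that calling `model.train()` again switches every submodule, including the BatchNorm layers, back to training mode, so a helper like this has to be re-applied afterwards.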

pkuCactus commented 6 years ago

Hi, I don't understand: why fix the batchnorm parameters?

isht7 commented 6 years ago

This is a re-implementation of deeplab-resnet. The original code fixed the batchnorm parameters, so we did the same. They might have done this to reduce the number of trainable parameters, because the number of training images is quite small. This could help prevent overfitting.