multimodallearning / pytorch-mask-rcnn

batch normalization #89

Open · ziqipang opened this issue 5 years ago

ziqipang commented 5 years ago

Thanks for the excellent implementation! But I have a question about batch normalization.

In model.py, lines 1627–1633, I see that the batch normalization layers are always set to evaluation mode during training. Could anyone explain the reason to me?
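
For reference, the code there looks roughly like this sketch (reconstructed from memory, not necessarily the repo's exact code; the helper name set_bn_eval is just illustrative):

```python
import torch.nn as nn

def set_bn_eval(m):
    # Put every BatchNorm layer into eval mode so it uses the stored
    # running mean/variance instead of per-batch statistics.
    if isinstance(m, nn.BatchNorm2d):
        m.eval()

# Applied recursively after model.train(), so every layer trains
# except the BatchNorm layers:
# model.apply(set_bn_eval)
```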

BMG-JTIAN commented 5 years ago

Please correct me if I'm wrong. The BatchNorm layers are not trained during training. Judging from test results, training BatchNorm layers with a small batch size can be harmful, because the per-batch statistics are too noisy to be reliable. The suggested batch size for batch normalization is 32 (see the paper "Bag of Tricks for Image Classification with Convolutional Neural Networks"). Because the images in the COCO dataset are large, the common batch size for Mask R-CNN is 1 or 2, so the batch normalization layers are not trained and only use the pre-trained weights and running statistics.
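
Concretely, "not trained" in PyTorch usually means two separate things, as in this minimal sketch (the helper name freeze_bn is just illustrative): keeping the layer in eval mode so the running statistics stay fixed, and turning off gradients for its affine parameters.

```python
import torch.nn as nn

def freeze_bn(model):
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            # Use the pre-trained running mean/var and stop updating them.
            m.eval()
            # Exclude the learnable scale (gamma) and shift (beta)
            # from gradient updates.
            for p in m.parameters():
                p.requires_grad = False
```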

ziqipang commented 5 years ago

Thanks! I got it.

vincentyw95 commented 4 years ago

According to the initialize_weights() function, it seems that batch normalization has no effect. Why not just remove all the batch normalization layers?

evinpinar commented 4 years ago

@vincentyw95 I guess they are there to enable using the pretrained ResNet model, whose conv layers were learned along with the batchnorm layers. If these weights were loaded without BN, wouldn't the learned convolutions perform suboptimally?
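
Note that even a frozen BatchNorm layer is not a no-op: in eval mode it still applies a fixed per-channel affine transform that the pretrained convolutions expect. A quick check with stock PyTorch (nothing repo-specific):

```python
import torch
import torch.nn as nn

# In eval mode BatchNorm computes
#   y = gamma * (x - running_mean) / sqrt(running_var + eps) + beta
# so deleting the layer would change every downstream activation.
bn = nn.BatchNorm2d(3)
bn.eval()
x = torch.randn(1, 3, 4, 4)

manual = (bn.weight[None, :, None, None]
          * (x - bn.running_mean[None, :, None, None])
          / torch.sqrt(bn.running_var[None, :, None, None] + bn.eps)
          + bn.bias[None, :, None, None])

assert torch.allclose(bn(x), manual, atol=1e-6)
```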