toandaominh1997 / EfficientDet.Pytorch

Implementation of EfficientDet: Scalable and Efficient Object Detection in PyTorch
MIT License

Question: Why do you freeze batch norm layers? #33

Closed by adizhol 4 years ago

adizhol commented 4 years ago

Hello! First, thank you for the repo!

In EfficientDet.__init__, line 55: [...] self.freeze_bn()
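
For context, freeze_bn in RetinaNet-style detectors is usually implemented roughly like this (a minimal sketch; the repo's exact body may differ slightly):

```python
import torch.nn as nn

def freeze_bn(self):
    """Put every BatchNorm layer into eval mode so its running
    mean/variance statistics stop updating; weights still train."""
    for module in self.modules():
        if isinstance(module, nn.BatchNorm2d):
            module.eval()
```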

If I want to retrain on custom data, I'd want to retrain the entire net, no?

Thanks!

deweyamer commented 4 years ago

In my opinion, it depends on your batch size. If your batch size is small, you'd better freeze them; you can reference RetinaNet's approach.
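
For reference, the RetinaNet-style pattern is to override train() so BN layers drop back to eval mode whenever the model is switched to training. A minimal sketch (Detector is a placeholder class name, not this repo's):

```python
import torch.nn as nn

class Detector(nn.Module):
    def train(self, mode=True):
        # Switch the whole model to train mode as usual ...
        super().train(mode)
        # ... then force BatchNorm back to eval mode so small batches
        # cannot corrupt the running mean/variance estimates.
        for module in self.modules():
            if isinstance(module, nn.BatchNorm2d):
                module.eval()
        return self
```

This way you never have to remember to call freeze_bn() manually after each model.train() call in the training loop.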

turner-rovco commented 4 years ago

@adizhol You might be confusing freezing batch norm layers with freezing backbone layers. Calling self.freeze_bn() does NOT freeze the whole backbone; you'll still be training the entire net.
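
A quick way to confirm this (a sketch assuming a model instance whose freeze_bn() just calls eval() on BN modules, as in the sketch above): eval mode only stops the running-statistics update and does not touch requires_grad, so every parameter, BN affine weights included, still receives gradients.

```python
model.freeze_bn()
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
# Expect all parameters to remain trainable: eval() leaves requires_grad alone.
print(f"{trainable}/{total} parameters trainable")
```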

lolongcovas commented 4 years ago

In my fork I added freeze_backbone and freeze_bn options, so you can first freeze the backbone in order to adjust the retina head weights. After that you can retrain the whole network while freezing only the BN layers. Freezing the BN layers keeps their running statistics from being corrupted by noisy estimates from small batches.
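
A sketch of that two-stage recipe (model.backbone and the learning rates are illustrative assumptions, not the fork's exact API):

```python
import torch

# Stage 1: freeze the backbone, train only the heads.
# (model.backbone is a hypothetical attribute; adapt to the fork's API.)
for p in model.backbone.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
# ... train the retina heads for a few epochs ...

# Stage 2: unfreeze the backbone but keep BN statistics fixed.
for p in model.backbone.parameters():
    p.requires_grad = True
model.freeze_bn()  # BN weights still train; running stats stay frozen
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
# ... fine-tune the whole network ...
```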