broadinstitute / keras-resnet

Keras package for deep residual networks

Fix behaviour of unfrozen BatchNormalization layer (resolves #46) #47

Closed · Callidior closed this 5 years ago

Callidior commented 5 years ago

Previously, a layer initialized with BatchNormalization(freeze=False) did not behave like the standard BatchNormalization layer, as one would expect. Instead, it was always forced into training mode, which produced wrong validation results.

This PR does not change the behaviour for freeze=True, but it makes the layer with freeze=False equivalent to the standard BatchNormalization layer from Keras.
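
Roughly, the layer after this change looks like the following sketch (the trainable flag and get_config handling are illustrative details, not part of this diff):

```python
import keras

class BatchNormalization(keras.layers.BatchNormalization):
    """
    A BatchNormalization layer that can optionally be frozen.

    With freeze=True the layer always runs in test mode; with freeze=False
    it behaves exactly like keras.layers.BatchNormalization.
    """

    def __init__(self, freeze, *args, **kwargs):
        self.freeze = freeze
        super(BatchNormalization, self).__init__(*args, **kwargs)
        # A frozen layer should not update its parameters during training.
        self.trainable = not self.freeze

    def call(self, *args, **kwargs):
        # Force test mode when frozen; otherwise leave training unset so
        # Keras resolves the mode from the current learning phase.
        if self.freeze:
            kwargs['training'] = False
        return super(BatchNormalization, self).call(*args, **kwargs)

    def get_config(self):
        config = super(BatchNormalization, self).get_config()
        config.update({'freeze': self.freeze})
        return config
```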

hgaiser commented 5 years ago

Doesn't this only change the behaviour if freeze=True?

Also, what accuracy are you getting now?

Callidior commented 5 years ago

No, the behaviour for freeze=True is not changed. Previously, we called the superclass's call method with training=(not self.freeze), which evaluates to training=False when self.freeze is True. Now, if self.freeze is True, we explicitly set training=False, as before.

If self.freeze is False, however, we now pass training=None (the default) instead of training=True.
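
As a minimal illustration of why that flag matters (synthetic data, standard Keras layer):

```python
import numpy as np
import keras

# With training=None, evaluation and prediction run in test mode and use
# the moving mean/variance; a layer forced to training=True would instead
# renormalize every validation batch with that batch's own statistics.
inputs = keras.layers.Input(shape=(8,))
outputs = keras.layers.BatchNormalization()(inputs)
model = keras.models.Model(inputs, outputs)
model.compile(optimizer="sgd", loss="mse")

data = np.random.normal(5.0, 2.0, size=(256, 8)).astype("float32")
model.fit(data, data, epochs=1, batch_size=32, verbose=0)

# predict() resolves training to False, so these outputs are normalized
# with the statistics accumulated during fit().
print(model.predict(data[:4]))
```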

The model is still training, but I already get 12% validation accuracy after the first epoch and 40% after four epochs, which is higher than anything I got without the modifications in this PR.

Callidior commented 5 years ago

By the way, I would question the example in the README. The model is initialized there with freeze_bn=True (the default), which fixes the BatchNormalization layers in test mode with their initial parameters. Since the initial moving mean is 0 and the initial moving variance is 1, this should be equivalent to using no batch normalization at all.

I also tried this first for my ImageNet training, since that is what the README does, but it didn't work.
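
For training from scratch, the README example would presumably change along these lines (a sketch; the input shape and class count are illustrative, and freeze_bn is the model keyword mentioned above):

```python
import keras
import keras_resnet.models

# Pass freeze_bn=False so the BatchNormalization layers actually learn
# normalization statistics; with this PR they then behave exactly like
# standard Keras BatchNormalization layers.
shape, classes = (224, 224, 3), 1000  # illustrative ImageNet-style setup
x = keras.layers.Input(shape)
model = keras_resnet.models.ResNet50(x, classes=classes, freeze_bn=False)
model.compile("adam", "categorical_crossentropy", ["accuracy"])
```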

Callidior commented 5 years ago

I now finally obtained 68% validation accuracy, which is much closer to what I got with the bundled ResNet-50 than before.

0x00b1 commented 5 years ago

Awesome. Thanks, @Callidior.