broadinstitute / keras-resnet

Keras package for deep residual networks

Training ResNet50 on ImageNet does not achieve reasonable performance #46

Closed by Callidior 5 years ago

Callidior commented 5 years ago

I previously had no problems training the ResNet50 implementation bundled with Keras in keras.applications.resnet50 to 70% validation accuracy on the ILSVRC 2012 dataset.

I then tried switching to the keras_resnet implementation, but could not get validation accuracy above 30%. Right after the first epoch, accuracy with keras_resnet is about 1%, while the bundled ResNet50 already achieves 11%.

I am creating the ResNet like this:

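import keras
import keras_resnet.models
from keras import backend as K

# Pick the input shape that matches the configured data format.
shape = (3, None, None) if K.image_data_format() == 'channels_first' else (None, None, 3)
input_ = keras.layers.Input(shape)
rn = keras_resnet.models.ResNet50(input_, include_top=True, classes=1000, freeze_bn=False)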

I've already tried different learning rate schedules and optimizers, but nothing worked.

Is there anything special I have to take care of?

Callidior commented 5 years ago

I discovered that the problem is caused by the implementation of the freezable BatchNormalization layer. When initialized with freeze=False, it is not equivalent to the standard implementation: it is always forced into training mode, including at inference time.
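To illustrate why that hurts validation accuracy, here is a toy model of the behaviour in plain Python. It is only a sketch, not the keras_resnet source: ToyBatchNorm, ForcedBatchNorm, FreezableBatchNorm and the LEARNING_PHASE flag are hypothetical names, and I am assuming the old layer derived its mode from freeze alone. With freeze=False at inference time, the forced variant normalizes with the statistics of the current batch instead of the moving averages, so its output differs from a standard layer.

```python
# Toy model of the behaviour described above; not the keras_resnet source.
LEARNING_PHASE = 0  # hypothetical global flag: 1 = training, 0 = inference


class ToyBatchNorm:
    """Minimal 1-D batch norm with fixed moving statistics, no scale/shift."""

    def __init__(self, moving_mean=0.0, moving_var=1.0, eps=1e-3):
        self.moving_mean = moving_mean
        self.moving_var = moving_var
        self.eps = eps

    def call(self, xs, training=None):
        # training=None means "follow the global phase", like a standard layer.
        if training is None:
            training = bool(LEARNING_PHASE)
        if training:
            # Training mode: normalize with the statistics of this batch.
            mean = sum(xs) / len(xs)
            var = sum((x - mean) ** 2 for x in xs) / len(xs)
        else:
            # Inference mode: normalize with the stored moving averages.
            mean, var = self.moving_mean, self.moving_var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in xs]


class ForcedBatchNorm(ToyBatchNorm):
    """Assumed pre-fix behaviour: the mode depends on `freeze` alone."""

    def __init__(self, freeze, **kwargs):
        super().__init__(**kwargs)
        self.freeze = freeze

    def call(self, xs, training=None):
        # freeze=False always yields training=True, even during validation.
        return super().call(xs, training=not self.freeze)


class FreezableBatchNorm(ToyBatchNorm):
    """One possible fix: force inference mode only when frozen."""

    def __init__(self, freeze, **kwargs):
        super().__init__(**kwargs)
        self.freeze = freeze

    def call(self, xs, training=None):
        if self.freeze:
            training = False
        # Otherwise pass the caller's value (possibly None) through unchanged.
        return super().call(xs, training=training)


batch = [10.0, 12.0, 14.0]
standard = ToyBatchNorm().call(batch)
forced = ForcedBatchNorm(freeze=False).call(batch)
fixed = FreezableBatchNorm(freeze=False).call(batch)
```

In this toy setup, fixed matches standard because both fall back to the moving averages at inference, while forced re-centres the batch around its own mean.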

I propose to fix this in PR #47.

Callidior commented 5 years ago

Fixed in #47.