fastai / courses

fast.ai Courses
Apache License 2.0
5.61k stars 2.74k forks source link

Batch Normalization - Vgg fine tune model - very low accuracy #225

Open HRKpython opened 6 years ago

HRKpython commented 6 years ago

I created the Vgg model using keras with tensorflow backend and theano image ordering. For fully connected layers, I added the batch normalization after the relu activation layers. I loaded the vgg16_bn.h5 weights of the model. Then I pop the last 8 layers up to the maxpooling laye in conv5 block "model.add(MaxPooling2D((2,2), strides=(2,2)))".

model.add(Flatten())
model.add(Dense(4096, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(4096, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(1000, activation='softmax'))

I pre-compute the features up to there and save them. Then I created a top_model based on my problem and load the precompute features to the top_model and trained the model. I got 88% accuracy after 20 epochs on validation set and 99% accuracy on the training set. Then I saved the weights for the top model to a file.

Now, I want to fine tune the model (train the last conv block as well). So I have the vgg model with batch normalization only on fully connected layers, I loadded the vgg_bn weights and poped the last 8 layers. Then created my top_model and load the weights based on the previous study and add my model to the vgg_bn model without last 8 layers. Then for the first 25 layers, I setup the trainable to be False.

for layer in model.layers[:25]: layer.trainable = False

Without doing any training, If I evaluate my validation_generator, I get 88% accuracy. Now I let the model to be trained for 1 epoch and my accuracy is 20% on training and 36% on validation. Even if I train for 40 epochs it does not help. I appreciate any help.

Evan05543071 commented 5 years ago

https://github.com/keras-team/keras/pull/9965#issuecomment-382801648