Closed chaohuang closed 3 years ago
@chaohuang What do you expect from step 1 if you continue to train the full network afterwards?
You may omit step 1 and train the full network by unfreezing all layers:
# make all layers trainable
for layer in base_model.layers:
    layer.trainable = True

# add your head on top
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(num_classes, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)
Don't forget to compile your model!
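For reference, the compile step might look like the sketch below. The optimizer, learning rate, and loss here are my assumptions (pick whatever fits your problem), and the tiny stand-in model is only there so the snippet runs on its own:

```python
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# tiny stand-in for base_model + head so this snippet is self-contained
inputs = Input(shape=(32, 32, 3))
x = GlobalAveragePooling2D()(inputs)
predictions = Dense(10, activation='softmax')(x)
model = Model(inputs, predictions)

# compile AFTER changing any layer.trainable flags,
# otherwise the change has no effect on training
model.compile(optimizer=Adam(learning_rate=1e-4),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```

Remember that `trainable` changes only take effect after `compile()` is called again.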
@AZweifels The reason for step 1 is the same as in the Keras documentation, where the newly added top layers are trained first, before the whole network is trained.
Although I'm not 100% sure about the rationale, my guess is that the weights in the top layers are randomly initialized while the weights in the base model (the convolutional layers) are already pre-trained, so we train the top layers first so that those weights are "pre-trained" as well (or at least no longer random) before training the full network.
In other words, the network should perform better when the starting point for full-network training is all "pre-trained" weights rather than a mixture of pre-trained and random weights.
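In code, that two-phase idea might look like the sketch below. The helper name, hyperparameters, and tiny stand-in base are my own for illustration; in practice the base would be `InceptionV3` or `ResNet50` with `weights='imagenet'`:

```python
from tensorflow.keras.layers import Conv2D, Dense, GlobalAveragePooling2D, Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD, Adam

def build_model(base_model, num_classes):
    """Attach a fresh, randomly initialized head to a (pre-trained) base."""
    x = GlobalAveragePooling2D()(base_model.output)
    preds = Dense(num_classes, activation='softmax')(x)
    return Model(base_model.input, preds)

# tiny stand-in base; in practice: InceptionV3/ResNet50 with pre-trained weights
inp = Input(shape=(32, 32, 3))
base = Model(inp, Conv2D(8, 3, activation='relu')(inp))

model = build_model(base, num_classes=5)

# phase 1: freeze the pre-trained base, train only the random head
for layer in base.layers:
    layer.trainable = False
model.compile(optimizer=Adam(1e-3), loss='categorical_crossentropy')
# model.fit(x_train, y_train, epochs=5)   # train the head here

# phase 2: unfreeze everything and fine-tune with a small learning rate
for layer in base.layers:
    layer.trainable = True
model.compile(optimizer=SGD(1e-4, momentum=0.9), loss='categorical_crossentropy')
# model.fit(x_train, y_train, epochs=10)  # fine-tune the full network here
```

The re-compile between the two phases matters: `trainable` flags are only picked up when the model is compiled.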
Any follow-up on this? I'd like to know the rationale behind the two-phase training as well!
I'm trying to implement transfer learning on a binary-class image dataset with well over 10k images, but InceptionV3 overfits badly while VGG-19 performs perfectly. I did the following as well:
1) Loaded the Inception model
2) Loaded the pretrained weights
3) Added bottleneck layers (Dense + BN + Activation + Dropout + Output)
4) Froze the base layers of the model
5) Trained the bottleneck layers for 5 epochs
6) Unfroze the last two Inception blocks
7) Re-compiled and re-trained with SGD and a small learning rate
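Steps 4-7 of that recipe, as a rough sketch. The layer index and hyperparameters are illustrative, and I've used a small stand-in network so the snippet runs on its own; with the real InceptionV3 you would inspect `model.layers` yourself (the Keras applications docs use index 249 for the last two Inception blocks):

```python
from tensorflow.keras.layers import Conv2D, Dense, Flatten, Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD

# stand-in for the Inception model; in practice load
# InceptionV3(weights='imagenet') and check model.layers
inp = Input(shape=(16, 16, 3))
x = inp
for i in range(6):
    x = Conv2D(4, 3, padding='same', name=f'conv_{i}')(x)
out = Dense(1, activation='sigmoid')(Flatten()(x))
model = Model(inp, out)

FINE_TUNE_FROM = 5  # e.g. 249 for InceptionV3's last two blocks
for layer in model.layers[:FINE_TUNE_FROM]:
    layer.trainable = False   # keep the early blocks frozen
for layer in model.layers[FINE_TUNE_FROM:]:
    layer.trainable = True    # fine-tune only the tail

# re-compile AFTER changing trainable flags, with a small learning rate
model.compile(optimizer=SGD(learning_rate=1e-4, momentum=0.9),
              loss='binary_crossentropy')
```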
I've been facing the same problem (issue #10214) and it has been driving me nuts. Apparently there is a fix (PR #9965), but it isn't "official" because it was never merged to master. The fix resolved my problem, but it is only available for Keras 2.1.6, not for 2.2.0.
I saw some code that uses InceptionV3 as a pre-trained model, but I don't know exactly what to put in the selected_layer variable.
Here is the link to the code: https://towardsdatascience.com/creating-a-movie-recommender-using-convolutional-neural-networks-be93e66464a7
Can anyone help me with it?
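If selected_layer is meant to name an intermediate layer whose activations are used as features, the usual Keras pattern is to cut the model at that layer by name. The sketch below uses a tiny stand-in network so it runs on its own; for InceptionV3 the layer name might be something like 'mixed10' (its last mixed block), but that's a guess on my part since the article's code isn't quoted here:

```python
import numpy as np
from tensorflow.keras.layers import Conv2D, Input
from tensorflow.keras.models import Model

# stand-in network; with InceptionV3 you would instead do:
#   base = InceptionV3(weights='imagenet', include_top=False)
inp = Input(shape=(8, 8, 3))
x = Conv2D(4, 3, name='block1')(inp)
out = Conv2D(2, 3, name='block2')(x)
base = Model(inp, out)

selected_layer = 'block1'  # e.g. 'mixed10' for InceptionV3
feature_extractor = Model(base.input,
                          base.get_layer(selected_layer).output)

features = feature_extractor.predict(np.zeros((1, 8, 8, 3)))
print(features.shape)  # (1, 6, 6, 4)
```

`base.summary()` prints every layer name, which is the easiest way to pick a valid value for selected_layer.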
According to the Keras documentation, there are two steps to transfer learning:
1. Train only the newly added top layers (which were randomly initialized) while freezing all convolutional InceptionV3/ResNet50 layers.
2. Once the top layers are well trained, start fine-tuning the convolutional layers of InceptionV3/ResNet50 by unfreezing them.
That's all fine with VGG nets, but because of the batch normalization layers, the procedure above doesn't work for InceptionV3/ResNet50, as described in issue #9214. (I don't know why the Keras documentation provides an example that doesn't work!)
@fchollet mentioned a possible workaround here:
But this solution (assuming it works) seems to apply only to training the newly added top layers (step 1 above); how to fine-tune the convolutional layers in InceptionV3/ResNet50 (step 2 above) is still unknown to me.
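One workaround people use for step 2 (not necessarily the one @fchollet described) is to unfreeze the convolutional layers but keep every BatchNormalization layer frozen, so its moving mean/variance statistics aren't clobbered by the new data distribution. A sketch with a small stand-in base, since building the real InceptionV3 here would be heavy:

```python
from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense,
                                     GlobalAveragePooling2D, Input)
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD

# stand-in for a BN-using base such as InceptionV3/ResNet50
inp = Input(shape=(16, 16, 3))
x = Conv2D(8, 3, name='conv')(inp)
x = BatchNormalization(name='bn')(x)
base = Model(inp, x)

head = Dense(2, activation='softmax')(GlobalAveragePooling2D()(base.output))
model = Model(base.input, head)

# unfreeze everything EXCEPT the BatchNormalization layers
for layer in model.layers:
    layer.trainable = not isinstance(layer, BatchNormalization)

# re-compile so the new trainable flags take effect
model.compile(optimizer=SGD(learning_rate=1e-4, momentum=0.9),
              loss='categorical_crossentropy')
```

In tf.keras, a BatchNormalization layer with `trainable=False` also runs in inference mode, which is exactly the behavior issue #9214 is about.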