machrisaa / tensorflow-vgg

VGG19 and VGG16 on Tensorflow

What about using this pre-trained model to train new stuff? #13

Ripppah closed this issue 7 years ago

Ripppah commented 7 years ago

What should I do if I want to add more classes to that pre-trained weight file? The number of classes would then go to 1001, so that the model can still recognize the 1000 objects it has already been trained on and can also recognize the thing I added. Is that possible?

Thanks.

machrisaa commented 7 years ago

It is not an easy question and there is no straightforward solution. This kind of training is called Transfer Learning. I had a long discussion here that contains some of my options.

When I want to re-train the network for another purpose, I usually prefer to change the fully connected layers and re-use the others. But your case is different because you want the network to keep its ability to classify the original 1000 categories.

In this case, suppose you want to add n new categories. I would try modifying the code after the fc8 layer like this:

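        # Extra branch for the n new categories, fed from the flattened pool5 output (25088 values).
        self.fc_extra_1 = self.fc_layer(self.pool5, 25088, m, "fc_extra_1")
        self.relu_extra_1 = tf.nn.relu(self.fc_extra_1)

        self.fc_extra_2 = self.fc_layer(self.relu_extra_1, m, m, "fc_extra_2")
        self.relu_extra_2 = tf.nn.relu(self.fc_extra_2)

        self.fc_extra_3 = self.fc_layer(self.relu_extra_2, m, n, "fc_extra_3")

        # Join the 1000 original logits with the n new ones along axis 1.
        # TensorFlow >= 1.0 takes (values, axis); older releases used tf.concat(1, values).
        self.combined_result = tf.concat([self.fc8, self.fc_extra_3], 1)
        self.prob = tf.nn.softmax(self.combined_result, name="prob")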

Here m is the width of the new hidden layers. It should be some number between n and 25088, and not too small. The end result is a tensor with shape [ -1 , 1000+n ].
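To make the shape bookkeeping concrete, here is a minimal pure-Python sketch (no TensorFlow) of joining the 1000 original logits with n new logits and applying a single softmax over the result. The logit values and n = 3 are made up for illustration; the lists only stand in for one row of self.fc8 and self.fc_extra_3.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


n = 3                              # hypothetical number of new categories
fc8_logits = [0.0] * 1000          # stand-in for one row of self.fc8
fc8_logits[42] = 5.0               # pretend original class 42 scores well
extra_logits = [1.0, 7.0, 0.5]     # stand-in for one row of self.fc_extra_3

# List concatenation plays the role of tf.concat along axis 1.
combined = fc8_logits + extra_logits
prob = softmax(combined)

predicted = max(range(len(prob)), key=prob.__getitem__)
print(len(prob), predicted)        # 1003 entries; index 1001 is the second new class
```

Because one softmax spans all 1000+n logits, old and new classes compete for the same probability mass, so the new branch probably needs to produce logits on a scale comparable to fc8.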

I have not tried this method myself, though. Please let me know whether it works :)

Ripppah commented 7 years ago

@machrisaa Does this mean that I have to re-train the model with all 1000 original categories plus the new one? Would it be possible to start from the result already trained on the 1000 original categories, train only the new class, and add it to the existing classes?

machrisaa commented 7 years ago

Oh, I imagine you want to train the new network on a dataset that contains only images from the new categories. Then I have another idea that may help. You can modify the VGG to classify n new categories + 1 category that means "unknown". You can do this by completely replacing the fully connected layers with new ones. You should reuse all the conv layers but make them untrainable.

After this training is completed, create a new network that first checks whether the image belongs to one of the new categories. If it does not, use tf.cond to connect pool5 back to the original fully connected layer fc6 and classify the image as one of the original 1000 categories.
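The control flow of this two-stage idea can be sketched in plain Python. This is only an illustration of the routing, not TensorFlow code: route, the toy heads, and the label lists are hypothetical names, and the "features" list stands in for the shared pool5 output.

```python
UNKNOWN = "unknown"


def argmax(scores):
    """Index of the largest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def route(features, new_head, original_head, new_labels, original_labels):
    """Try the new (n + 1)-way head first; fall back to the original head on 'unknown'."""
    new_scores = new_head(features)              # n + 1 scores, one of them "unknown"
    label = new_labels[argmax(new_scores)]
    if label != UNKNOWN:
        return label
    return original_labels[argmax(original_head(features))]


# Toy stand-ins: both heads read the same shared features.
new_labels = ["my_new_class", UNKNOWN]
original_labels = ["cat", "dog", "car"]


def toy_new_head(features):
    return [features[0], 1.0 - features[0]]


def toy_original_head(features):
    return features[1:]


first = route([0.9, 0.1, 0.2, 0.7], toy_new_head, toy_original_head,
              new_labels, original_labels)
second = route([0.2, 0.1, 0.8, 0.1], toy_new_head, toy_original_head,
               new_labels, original_labels)
print(first, second)
```

Note that tf.cond takes a scalar predicate, so inside a graph this pattern maps most directly onto classifying one image at a time; a batch would need a per-example selection instead.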

I think this kind of "conditional" training should work. Does it make sense to you?

Ripppah commented 7 years ago

Training on the new class will mess up the original weights, so the new class gets recognized but the 1000 original ones no longer do. That means I still need to train everything again whenever I add something new. So I am really confused about the point of having pre-trained weights here.

machrisaa commented 7 years ago

> You should reuse all the conv layers but make them untrainable.

Did you make them untrainable?

I assume that the original conv layers have a strong ability to extract image features for further classification. So I suggest re-using them and keeping them unchanged. But of course, if you think your new classes have completely different features from the original classes, then you may need to train the conv layers as well. In that case, you need to separate the 2 networks.
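What "untrainable" means in practice is that the update step skips those variables. In TensorFlow 1.x this is usually done by creating the variables with trainable=False or by passing only the fc variables as var_list to the optimizer's minimize call. The toy sketch below shows the same idea in plain Python; the parameter names, values, and sgd_step helper are made up for illustration.

```python
def sgd_step(params, grads, trainable_names, lr=0.1):
    """Return updated params, changing only the names listed as trainable."""
    return {
        name: value - lr * grads[name] if name in trainable_names else value
        for name, value in params.items()
    }


# Hypothetical scalar "weights" standing in for whole layers.
params = {"conv1_1": 0.5, "conv5_4": -0.3, "fc6_new": 0.2, "fc8_new": 0.1}
grads = {"conv1_1": 1.0, "conv5_4": 1.0, "fc6_new": 1.0, "fc8_new": 1.0}

# Only the replaced fully connected layers are allowed to change.
trainable = {name for name in params if name.startswith("fc")}

updated = sgd_step(params, grads, trainable)
print(updated)
```

The conv entries come out unchanged even though they have non-zero gradients, which is the property that keeps the pre-trained feature extractor intact.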

Ripppah commented 7 years ago

So the original vgg model is from this guy: https://github.com/AKSHAYUBHAT/TensorFace I want to train on faces rather than objects. He also has pre-trained weights for faces. I am now trying to change his code so that it can recognize my face. I don't know how to explain it, but if you have time, please help me take a look at that.

I added tf.nn.dropout to the fc layers. Also, I changed the last layer's output size to 2623. I started training it with my face labeled as [0,....,0,1].

So the problem I am facing now is that the weights have changed for all the original faces. If I don't train, even with 2623 classes, everything is still correct.

Please let me know if you have any idea.

machrisaa commented 7 years ago

> I made the last layer's output to be 2623

If you changed the last layer to have 2623 outputs, you cannot reuse the original weights: unlike a conv layer, a fully connected layer uses a matrix multiplication whose weight matrix has size = input size * output size. So if you change this layer, it loses its original ability to classify the original 2622 faces. Moreover, you have to train it not only on your face but also on the original faces, to make sure it doesn't "forget" them during your training.
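A quick arithmetic check shows why the saved weights no longer fit. The sketch assumes the last fc layer takes 4096 inputs, as in the standard VGG layout; that number and the fc_param_count helper are assumptions for illustration, not taken from the TensorFace code.

```python
def fc_param_count(in_size, out_size):
    """Weights (in_size * out_size) plus one bias per output unit."""
    return in_size * out_size + out_size


IN_SIZE = 4096                      # assumed width of the layer feeding the last fc layer

old_shape = (IN_SIZE, 2622)         # weight matrix stored in the pre-trained file
new_shape = (IN_SIZE, 2623)         # weight matrix the modified layer expects

shapes_match = old_shape == new_shape
extra_params = fc_param_count(*new_shape) - fc_param_count(*old_shape)

print(shapes_match)                 # False: the stored array cannot be loaded as-is
print(extra_params)                 # 4097: one new weight column plus one new bias
```

Those extra parameters have no pre-trained values, so they start untrained however the remaining columns are initialized.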

That's why I suggest you first create a network that classifies whether or not it is your face. If it is not, run the original network to do the original classification. This design joins 2 networks into one bigger network. Strictly speaking, the 2 networks can be completely independent, but you can re-use all the conv layers in both networks for easier training.

I suggest you do a basic test first. Modify the fc layers to classify whether or not a face is yours, i.e. output shape = [ -1 , 2 ]. Make the variables in the conv layers untrainable as well, i.e. only train the fc layers you modified.

Once you get a good result, the next step is to join these 2 networks together as I mentioned. Try using tf.cond and tf.concat to do this.

Ripppah commented 7 years ago

Thanks, I'll try that. Please keep in touch. I might still need your help.

Ripppah commented 7 years ago

Just to clarify, this is the model as I understand it:

[Image: screen shot 2016-12-09 at 10 37 31 am]

machrisaa commented 7 years ago

I am afraid that may be incorrect, because I assume your original face VGG cannot detect when a face is not one of the 2622 faces. (Correct me if I am wrong.)

Instead, you pass the image to your new network first, which has n+1 outputs: n new target faces and 1 additional output that stands for "unknown". If the result is "unknown", you then pass the image into the original face VGG.