ox-vgg / vgg_face2


Training scheme of the model #25

Closed kevinisbest closed 5 years ago

kevinisbest commented 5 years ago

Hi, I have some questions about the training setting.

In VGGFace, the authors initially train a VGG16-based N-way classifier and then remove the FC layer to fine-tune the network with a triplet loss.

But in VGGFace2, did you just use a softmax loss to train the ResNet-50 model on MS-Celeb-1M, and then also apply a softmax loss to fine-tune on the VGGFace2 dataset? Thank you!
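For reference, the triplet loss used in the original VGGFace fine-tuning stage can be sketched as below. This is a minimal NumPy illustration, not the paper's exact implementation; the margin value is illustrative.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss on the gap between anchor-positive and anchor-negative
    squared distances, averaged over a batch of embeddings.
    Shapes: (batch, embed_dim). Margin is an illustrative choice."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # distance to same identity
    d_neg = np.sum((anchor - negative) ** 2, axis=1)  # distance to different identity
    return np.mean(np.maximum(d_pos - d_neg + margin, 0.0))
```

When the negative is already farther from the anchor than the positive by more than the margin, the loss for that triplet is zero, so training focuses on hard triplets.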

WeidiXie commented 5 years ago

Yes, we remove the pretrained classifier from MS-Celeb-1M (which has a very large number of classes) and fine-tune on VGGFace2 with a new classifier (around 8641 classes).
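The two-stage recipe described here (softmax pretraining on MS-Celeb-1M, then discarding that classifier and fine-tuning on VGGFace2 with a fresh softmax head) can be sketched as follows. This is a conceptual NumPy illustration, not the authors' code: the MS-Celeb-1M class count, feature dimension usage, and initialisation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT_DIM = 2048  # pooled feature size of ResNet-50

def softmax_cross_entropy(logits, label):
    # Plain softmax loss, used in both training stages.
    logits = logits - logits.max()                       # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[label]

# Stage 1 (conceptual): a classifier head over the MS-Celeb-1M
# identities sits on top of the backbone (class count illustrative).
W_msceleb = rng.normal(size=(FEAT_DIM, 80000)) * 0.01

# Stage 2: the pretrained head is thrown away and a freshly
# initialised head for the ~8641 VGGFace2 training identities is
# attached; the backbone weights are kept and fine-tuned.
W_vggface2 = rng.normal(size=(FEAT_DIM, 8641)) * 0.01

feature = rng.normal(size=FEAT_DIM)  # stands in for a backbone output
loss = softmax_cross_entropy(feature @ W_vggface2, label=3)
```

Only the classifier matrix changes shape between the two stages; the backbone features transfer directly, which is why no metric-learning loss is needed here.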

kevinisbest commented 5 years ago

Thanks for the quick answer. So is the reason for not using the triplet loss the convergence time, or something else?

And another question: have you ever tried state-of-the-art loss functions such as center loss, L-Softmax, A-Softmax, or AM-Softmax?

WeidiXie commented 5 years ago

No, the paper aims to provide a simple baseline for face recognition research, so we didn't try any sophisticated losses or architectures.

kevinisbest commented 5 years ago

OK! I got it! Thank you again!