davidsandberg / facenet

Face recognition using Tensorflow
MIT License

Question about the inception_resnet_v1 implementation #605

Open helloyide opened 6 years ago

helloyide commented 6 years ago

At the end of the function inception_resnet_v1(...) in inception_resnet_v1.py, after adding 5 x Inception-Resnet-C, it calls block8() again and assigns the result to end_points['Mixed_8b']:

net = slim.repeat(net, 5, block8, scale=0.20)  # 5 x Inception-Resnet-C
end_points['Mixed_8a'] = net

net = block8(net, activation_fn=None)  # what is this?
end_points['Mixed_8b'] = net

In the paper (C. Szegedy et al., 2016, "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning") there is only an Average Pooling after 5 x Inception-Resnet-C.

Also, at the end of the whole function, after dropout, there is: net = slim.fully_connected(net, bottleneck_layer_size, activation_fn=None, scope='Bottleneck', reuse=False). Why do we need that fully connected bottleneck? I cannot find it in the paper either.

Could someone please give me some hints? Thanks.

Ao-Lee commented 6 years ago

I'll try to answer the second question, though I may not be correct:

The Inception network architecture has only one FC layer, because one FC layer is sufficient for classification (assigning each image a label number). This project does face verification, meaning that given two images, the algorithm needs to tell whether they show the same person. This is not a traditional classification task, so the Inception network needs to be slightly modified to do the job (the loss is modified too). See the paper "FaceNet: A Unified Embedding for Face Recognition and Clustering" by Google. In the paper they take a network with one FC layer to generate the face embedding representation, and they introduce one additional FC layer to do the classification.
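To make the verification idea concrete, here is a minimal pure-Python sketch of what a linear bottleneck followed by a distance comparison could look like. It is not the facenet code: the weights, the L2 normalization step, the squared Euclidean distance, the threshold of 1.0, and all function names are assumptions chosen for illustration. The only part taken from the thread is that the bottleneck is a fully connected layer with no activation.

```python
import math

def fully_connected(features, weights, biases):
    """Linear layer with no activation (like activation_fn=None)."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]

def l2_normalize(vec, eps=1e-10):
    """Scale a vector to unit length (assumed post-processing step)."""
    norm = math.sqrt(max(sum(v * v for v in vec), eps))
    return [v / norm for v in vec]

def embed(features, weights, biases):
    """Hypothetical embedding: linear bottleneck, then L2 normalization."""
    return l2_normalize(fully_connected(features, weights, biases))

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def same_person(emb_a, emb_b, threshold=1.0):
    """Verification by thresholding the distance (threshold is made up)."""
    return squared_distance(emb_a, emb_b) < threshold

# Toy bottleneck: 4 input features -> 2-dimensional embedding.
WEIGHTS = [[1.0, 0.0, 0.5, 0.0],
           [0.0, 1.0, 0.0, 0.5]]
BIASES = [0.0, 0.0]

# Made-up feature vectors standing in for the network output before the bottleneck.
emb_a = embed([0.9, 0.1, 0.3, 0.2], WEIGHTS, BIASES)
emb_b = embed([0.8, 0.2, 0.1, 0.4], WEIGHTS, BIASES)
emb_c = embed([0.1, 0.9, 0.2, 0.3], WEIGHTS, BIASES)

print(same_person(emb_a, emb_b))  # features point the same way
print(same_person(emb_a, emb_c))  # features point in a different direction
```

The point of the sketch is that the output of the bottleneck is compared between images rather than mapped to a class label, which is why a classification head alone would not be enough for this task.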

maria8899 commented 6 years ago

@Ao-Lee you are right.

@helloyide for the first question, there is indeed an error: the additional block8 should not be there, at least according to the original paper, which has only 5 x "block8" (Inception-ResNet-C). I don't think it will make a huge difference in the results, though.
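For readers comparing the two calls from the first question, the sketch below shows how a scaled residual block with an optional activation behaves. It is a toy model in pure Python, not the repository's block8: the branch function, the default scale of 1.0, and the ReLU default are assumptions, and the convolutions are replaced by a simple hypothetical stand-in.

```python
def relu(values):
    return [max(0.0, v) for v in values]

def residual_block(net, branch, scale=1.0, activation_fn=relu):
    """Compute net + scale * branch(net), then apply an optional activation."""
    up = branch(net)
    out = [x + scale * u for x, u in zip(net, up)]
    if activation_fn is not None:
        out = activation_fn(out)
    return out

def toy_branch(values):
    # Hypothetical stand-in for the convolutional branches of the block.
    return [-2.0 * v for v in values]

net = [1.0, -1.0, 0.5]

# Analogue of slim.repeat(net, 5, block8, scale=0.20): five scaled blocks with ReLU.
for _ in range(5):
    net = residual_block(net, toy_branch, scale=0.20)
mixed_8a = net

# Analogue of block8(net, activation_fn=None): one more block, no activation.
mixed_8b = residual_block(mixed_8a, toy_branch, activation_fn=None)

print(mixed_8a)  # all values are non-negative because of the ReLU
print(mixed_8b)  # can contain negative values, since no activation is applied
```

In this toy version the extra call differs from the repeated ones in two ways: it uses the default scale instead of 0.20, and its output is not passed through an activation, so the features handed to the pooling layer are no longer clipped at zero.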