davidsandberg / facenet

Face recognition using Tensorflow
MIT License

mtcnn face alignment? #93

Closed dougsouza closed 7 years ago

dougsouza commented 7 years ago

@davidsandberg, I've checked your code for mtcnn face alignment and saw that there is not really an "alignment" going on; it is just a crop with a margin around a bounding box, am I correct? It seems odd to me that this field calls landmark detection "alignment". Anyway, do you think that performing a simple 2D alignment (rotation) using the landmarks from the mtcnn would improve the results?

Cheers,

Doug
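
For readers following along, the "crop with a margin around a bounding box" described above can be sketched as follows. This is a minimal NumPy illustration, not the code from this repo; the function name `crop_with_margin` and the 32-pixel margin are assumptions chosen for the example.

```python
import numpy as np

def crop_with_margin(img, box, margin=32):
    """Crop img (H x W x C) to box = (x1, y1, x2, y2), grown by margin/2 per side.

    Hypothetical helper for illustration only.
    """
    h, w = img.shape[:2]
    x1, y1, x2, y2 = box
    half = margin // 2
    # Expand the box, then clamp it to the image bounds.
    x1 = max(int(x1) - half, 0)
    y1 = max(int(y1) - half, 0)
    x2 = min(int(x2) + half, w)
    y2 = min(int(y2) + half, h)
    return img[y1:y2, x1:x2]

# Dummy 100 x 120 image and a made-up detection box.
img = np.zeros((100, 120, 3), dtype=np.uint8)
face = crop_with_margin(img, (10, 20, 60, 80), margin=32)
print(face.shape)
```

Note that no rotation or warping is involved: the face keeps whatever tilt it had in the original image.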

davidsandberg commented 7 years ago

The MTCNN is very good at detecting profile faces, which is very nice. But it's not clear to me how to apply a 2D transformation that does not cause severe distortions to e.g. profile faces (where the estimated positions of the two eyes nearly coincide). I know that for example DeepFace uses 3D alignment, which seems to work pretty well, but I guess it becomes algorithmically more tricky. So far my approach has been to just use the face bounding box and let the model generalize over different face poses.

dougsouza commented 7 years ago

I understand. Well, in my opinion a 2D alignment wouldn't distort the image, as we just need to rotate in a way that the face stays in vertical position (just think of cases where the neck is bent to one side and the face is not completely in vertical position). The reason is that the convolution is invariant to translation but not rotation, so I think it would improve in some way. I am going to perform some experiments with 2D alignment, I also have code for 3D alignment that I will try as well.

dougsouza commented 7 years ago

@davidsandberg, I tried the 2D alignment as I mentioned above, then I got this:

Runnning forward pass on LFW images
Accuracy: 0.985+-0.006
Validation rate: 0.90600+-0.02119 @ FAR=0.00069

Any thoughts? It is a very slight improvement, I guess.

CalabashBoy commented 7 years ago

@dougsouza I tried to run this FaceNet code, but I get this error: no module named facenet. How do I run this demo? Thanks.

ugtony commented 7 years ago

Is it reasonable to replace the alignment step with Spatial Transformer Networks? That way, alignment and feature extraction can be trained together.

It's also possible to use the STN to select the "cropped patches" instead of the manually designed ones in "Naive-Deep Face Recognition: Touching the Limit of LFW Benchmark or Not?"

mhaghighat commented 7 years ago

One way can be the piece-wise affine warping used in the Active Appearance Model. Each triangle in the face mesh has a corresponding triangle in the frontal mesh, which it can be warped into. The affine transformation for each triangle can be calculated using the coordinates of its vertices returned by the landmark detection and the corresponding landmarks in the frontal face template (imgDim * MINMAX_TEMPLATE) in the align_dlib.py.
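
The per-triangle affine transformation mentioned above can be recovered from the three vertex correspondences by solving a small linear system. The sketch below is a minimal NumPy illustration; the function name `triangle_affine` and the sample coordinates are assumptions, and it does not use the template from align_dlib.py.

```python
import numpy as np

def triangle_affine(src_tri, dst_tri):
    """Return the 2x3 matrix A with A @ [x, y, 1] = [x', y'] for each vertex pair.

    Hypothetical helper: src_tri and dst_tri are three (x, y) points each.
    """
    src = np.asarray(src_tri, dtype=float)      # shape (3, 2)
    dst = np.asarray(dst_tri, dtype=float)      # shape (3, 2)
    src_h = np.hstack([src, np.ones((3, 1))])   # homogeneous coords, shape (3, 3)
    # src_h @ A.T = dst, so A.T is the solution of a 3x3 linear system.
    # This fails if the source triangle is degenerate (collinear vertices).
    return np.linalg.solve(src_h, dst).T

# Made-up triangle in the detected face and its counterpart in a frontal template.
A = triangle_affine([(0, 0), (1, 0), (0, 1)], [(10, 10), (12, 10), (10, 13)])
print(A)
```

A full piece-wise affine warp would repeat this for every triangle in the mesh and resample the pixels inside each one; that part is omitted here.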

ntvy95 commented 7 years ago

@CalabashBoy I have followed this and problem solved. Hope this helps.

MrXu commented 7 years ago

Hi @dougsouza, thanks for pointing this out. May I know how you perform the 2D alignment?

dougsouza commented 7 years ago

@MrXu I just use the eye landmarks to calculate the angle by which the image needs to be rotated so that the eyes become horizontally aligned, then I rotate. The only issue is that we need to detect the face twice: once before rotation and once after, because the original bounding box is no longer valid for the rotated image. It is not very practical for real-time applications, I guess.
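
The eye-based rotation described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the code used for the LFW numbers in this thread: the function names `eye_rotation` and `transform_points` and the eye coordinates are made up for the example, and the rotation center (the midpoint between the eyes) is one possible choice.

```python
import numpy as np

def eye_rotation(left_eye, right_eye):
    """Return (angle_deg, M): the tilt of the eye line and a 2x3 affine matrix
    that rotates about the eye midpoint so the eye line becomes horizontal.

    Coordinates are image coordinates (x to the right, y downward).
    """
    (lx, ly), (rx, ry) = left_eye, right_eye
    angle = np.arctan2(ry - ly, rx - lx)         # tilt of the eye line
    cx, cy = (lx + rx) / 2.0, (ly + ry) / 2.0    # rotation center
    c, s = np.cos(angle), np.sin(angle)
    # Rotate by -angle about (cx, cy), written as a single 2x3 matrix.
    M = np.array([[c,  s, (1 - c) * cx - s * cy],
                  [-s, c, s * cx + (1 - c) * cy]])
    return np.degrees(angle), M

def transform_points(M, pts):
    """Apply the 2x3 affine matrix M to an (N, 2) array of (x, y) points."""
    pts = np.asarray(pts, dtype=float)
    return pts @ M[:, :2].T + M[:, 2]

# Made-up eye landmarks, as a landmark detector might return them.
eyes = [(30.0, 40.0), (70.0, 60.0)]
angle_deg, M = eye_rotation(*eyes)
aligned = transform_points(M, eyes)
print(angle_deg)
print(aligned)
```

The matrix has the 2x3 layout that `cv2.warpAffine` accepts, so the same `M` could be used to rotate the image itself if OpenCV is available.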

tengshaofeng commented 7 years ago

@dougsouza hi, after 2D alignment, you get only a slight improvement? I think it should improve much more. What dataset did you use for training?

dougsouza commented 7 years ago

@tengshaofeng ,

I didn't train my own model with aligned faces, I just aligned LFW and evaluated on the model trained in this repo. If we train the model with aligned faces it may improve accuracy, but I don't know, we would have to test.

tengshaofeng commented 7 years ago

@dougsouza, ok, maybe I will train with aligned faces later to test this. Thanks for your reply.

rawmarshmellows commented 7 years ago

@ugtony Hey Tony did you try putting spatial transforms in the network?

ugtony commented 7 years ago

@kevinlu1211, yes, I did. But I found the spatial transformer layers difficult to train unless I freeze the parameters of the other backend layers. When I freeze the backend layers, the spatial transformer layers tend to shrink the image; in other words, the transformed image is smaller and the surrounding region is filled with a blank color, which differs from what's shown in the original paper. I don't know why. I can see that it makes the frontal faces vertical, but the accuracy doesn't improve because of it.

There is a 2017 paper that did the same thing, and the authors claim that face recognition accuracy is improved by introducing these layers. But I cannot reproduce it. They tested it with a weaker classifier and the improvement wasn't that significant, so I stopped trying.
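
To make the shrinking symptom described above concrete: in a spatial transformer, the affine parameters map output grid coordinates to input coordinates, so a scale factor above 1 samples outside the input and the image appears smaller with a blank border. The sketch below is a minimal NumPy illustration with a nearest-neighbour sampler; the function name `affine_sample` and the 9 x 9 test image are assumptions, and it only shows the mechanics, not why training drifts toward such parameters.

```python
import numpy as np

def affine_sample(img, theta):
    """Sample a 2D image through a 2x3 affine matrix theta.

    theta maps normalized output coords in [-1, 1] to normalized input coords.
    Nearest-neighbour lookup; samples that fall outside the input are set to 0.
    """
    h, w = img.shape
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
    grid = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # shape (3, h*w)
    sx, sy = theta @ grid                                      # source coords
    # Convert normalized source coords to integer pixel indices.
    px = np.rint((sx + 1) * (w - 1) / 2).astype(int)
    py = np.rint((sy + 1) * (h - 1) / 2).astype(int)
    inside = (px >= 0) & (px < w) & (py >= 0) & (py < h)
    out = np.zeros(h * w, dtype=img.dtype)
    out[inside] = img[py[inside], px[inside]]
    return out.reshape(h, w)

img = np.ones((9, 9))
identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
zoom_out = np.array([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])  # scale > 1 shrinks the output

same = affine_sample(img, identity)
shrunk = affine_sample(img, zoom_out)
print(same.sum(), shrunk.sum())
print(shrunk)
```

With the identity matrix every output pixel is filled; with the scale-2 matrix only the central block is, and the rest stays blank.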

rawmarshmellows commented 7 years ago

@ugtony Could you link me to the paper? And did you implement your own spatial transformer or did you use the tf.contrib.layers? Also how did you set up your normalization module? Also you could try finding the face landmarks using other methods, step 2 in this tutorial has links to the source code

ugtony commented 7 years ago

@kevinlu1211 The paper is Towards End-to-End Face Recognition through Alignment Learning. I will read your link later, thanks.

davidsandberg commented 7 years ago

Closing for now. Reopen if needed.

jerryhouuu commented 6 years ago

@dougsouza Excuse me, how do you do landmark detection: dlib or mtcnn-tensorflow?

dougsouza commented 6 years ago

@JerryHouuu I used the mtcnn implementation from this repo.

rawmarshmellows commented 6 years ago

@jerryhouuu this might help https://github.com/kevinlu1211/FacialClusteringPipeline

jerryhouuu commented 6 years ago

@dougsouza @kevinlu1211 thank you guys, I used mtcnn from this repo to do face alignment the way @dougsouza does, and it works!

beimingmaster commented 6 years ago

I aligned the LFW images with dlib, but got 98% accuracy. Why? I didn't retrain. Has anyone retrained a new model with aligned images?

Hamiltonsjtu commented 5 years ago

I am confused by this problem as well. @dougsouza MTCNN detects faces in images and crops them with a margin; none of the code actually aligns faces. And I have two problems.

nyck33 commented 5 years ago

Is there a training script to retrain the MTCNN landmarks for profile faces on the i-Bug menpo datasets?