davidsandberg / facenet

Face recognition using Tensorflow
MIT License

About alignment: The Code in src/align/align_dataset_mtcnn.py does not align the face. #329

Open zhengge opened 7 years ago

zhengge commented 7 years ago

Hi @davidsandberg ,

I've recently been studying machine learning for face recognition. Your project is a great starting point, so I have been reading through all of the Python code.

However, I found that the code in src/align/align_dataset_mtcnn.py does not align the face; it only crops the face using bounding_boxes.

bounding_boxes, _ = align.detect_face.detect_face(img, minsize, pnet, rnet, onet, threshold, factor)
nrof_faces = bounding_boxes.shape[0]
if nrof_faces>0:
    det = bounding_boxes[:,0:4]
    img_size = np.asarray(img.shape)[0:2]
    if nrof_faces>1:
        bounding_box_size = (det[:,2]-det[:,0])*(det[:,3]-det[:,1])
        img_center = img_size / 2
        offsets = np.vstack([ (det[:,0]+det[:,2])/2-img_center[1], (det[:,1]+det[:,3])/2-img_center[0] ])
        offset_dist_squared = np.sum(np.power(offsets,2.0),0)
        index = np.argmax(bounding_box_size-offset_dist_squared*2.0) # some extra weight on the centering
        det = det[index,:]
    det = np.squeeze(det)
    bb = np.zeros(4, dtype=np.int32)
    bb[0] = np.maximum(det[0]-args.margin/2, 0)
    bb[1] = np.maximum(det[1]-args.margin/2, 0)
    bb[2] = np.minimum(det[2]+args.margin/2, img_size[1])
    bb[3] = np.minimum(det[3]+args.margin/2, img_size[0])
    cropped = img[bb[1]:bb[3],bb[0]:bb[2],:]
    scaled = misc.imresize(cropped, (args.image_size, args.image_size), interp='bilinear')
    nrof_successfully_aligned += 1
    misc.imsave(output_filename, scaled)
    text_file.write('%s %d %d %d %d\n' % (output_filename, bb[0], bb[1], bb[2], bb[3]))
else:
    print('Unable to align "%s"' % image_path)
    text_file.write('%s\n' % (output_filename))

Thank you for the great project, I really appreciate your work.

lrsperanza commented 7 years ago

I found the same problem. Is it supposed to just crop? When I have rotated faces, the recognition precision goes down.

zhengge commented 7 years ago

@lrsperanza Did you rotate the test dataset?

xizi commented 7 years ago

@zhengge you can see https://github.com/davidsandberg/facenet/issues/93

zhengge commented 7 years ago

@xizi thanks

lrsperanza commented 7 years ago

@zhengge No, I didn't. Because MTCNN detects faces even when they are rotated by 90 degrees, I thought I could somehow use it to rotate the faces back to a vertical position. But when the face is rotated too far, MTCNN doesn't detect the landmarks properly, so I can't use them to rotate the face back to an upright position as described by dougsouza in #93. I'm seriously considering training a neural network for the sole task of detecting whether an image is rotated by 90º, 180º or 270º so that it can be rotated back.

lrsperanza commented 7 years ago

I've been running more tests in the last few minutes, and it seems that MTCNN doesn't detect faces when they are rotated by more than 90 degrees. It was returning random rectangles from the picture as faces, which is why the landmarks weren't working.

zhengge commented 7 years ago

@lrsperanza I don't understand what you mean by "when the face is too rotated, MTCNN won't detect the landmarks properly". Do you think that training a neural network on unaligned faces gives a better result? Is that what you mean?

lrsperanza commented 7 years ago

@zhengge When the faces are rotated by more than ±90 degrees, MTCNN won't detect the face at all most of the time. I've been considering creating a whole new neural network, separate from both MTCNN and facenet, that detects rotated pictures and rotates them back to the proper position. That way MTCNN can detect the faces and facenet can recognize them properly even when the user's input image is rotated to begin with.
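
For comparison, a simpler alternative to training a separate orientation network is to run the detector on all four 90º rotations and keep the one with the highest score. This is a different technique from the one proposed above, and it is only a sketch: `score_face` is a hypothetical callable that returns a detection confidence for an image (for MTCNN this might be derived from the detector's output, which is an assumption here), and the image is a plain nested list so the example stays self-contained. Since the detector reportedly returns random rectangles on heavily rotated inputs, the scores may not always be reliable.

```python
def rotate90(image, k):
    """Rotate a row-major nested-list image counter-clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one 90-degree CCW turn.
        image = [list(row) for row in zip(*image)][::-1]
    return image


def best_upright_rotation(image, score_face):
    """Try the four 90-degree rotations and return (k, rotated_image) for the best one.

    score_face is a hypothetical callable: image -> confidence (higher is better).
    """
    best_k, best_score = 0, float("-inf")
    for k in range(4):
        score = score_face(rotate90(image, k))
        if score > best_score:
            best_k, best_score = k, score
    return best_k, rotate90(image, best_k)


# Toy usage: the stand-in "detector" only likes the upright image.
upright = [[1, 2], [3, 4]]
sideways = rotate90(upright, 3)  # upright image turned 90 degrees clockwise
k, fixed = best_upright_rotation(sideways, lambda img: 1.0 if img == upright else 0.1)
```

With a real detector, `score_face` would wrap the detection call and return 0 when no face is found.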

achbogga commented 7 years ago

First of all, thanks. Please include alignment code based on the MTCNN landmark points. I have found that the current code does not align the cropped face chips. I can show that face recognition accuracy improves significantly if the images are aligned properly.
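
As an illustration of what landmark-based alignment could look like, here is a minimal sketch that computes the rotation needed to make the line between the two eyes horizontal. It is not code from the repository. It assumes the two eye coordinates have already been obtained from a landmark detector, and the function names are made up for this example. The result is a forward 2x3 affine matrix of the kind that image-warping routines such as OpenCV's `cv2.warpAffine` accept.

```python
import math


def eye_alignment_transform(left_eye, right_eye):
    """Return (tilt_degrees, matrix) that levels the line between the eyes.

    left_eye / right_eye are (x, y) pixel coordinates, where "left" means the
    eye that appears on the left side of the image. matrix is a 2x3 affine
    matrix (list of rows) that rotates about the midpoint between the eyes,
    in image coordinates with y pointing down.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    tilt = math.atan2(dy, dx)  # current angle of the eye line
    cx = (left_eye[0] + right_eye[0]) / 2.0
    cy = (left_eye[1] + right_eye[1]) / 2.0
    # Rotate by -tilt about (cx, cy) so the eye line becomes horizontal.
    cos_a, sin_a = math.cos(-tilt), math.sin(-tilt)
    matrix = [
        [cos_a, -sin_a, cx - cos_a * cx + sin_a * cy],
        [sin_a, cos_a, cy - sin_a * cx - cos_a * cy],
    ]
    return math.degrees(tilt), matrix


def apply_affine(matrix, point):
    """Map an (x, y) point through a 2x3 affine matrix."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


# Example: eyes on a slanted line end up at the same height after the transform.
tilt_deg, M = eye_alignment_transform((30, 40), (70, 60))
new_left = apply_affine(M, (30, 40))
new_right = apply_affine(M, (70, 60))
```

Rotating about the eye midpoint keeps the face roughly in place, so the existing bounding-box crop could still be applied to the rotated image afterwards.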

knvpk commented 7 years ago

Yeah, @achbogga is correct. Does anybody have a TensorFlow version of MTCNN alignment?

knvpk commented 7 years ago

@davidsandberg

achbogga commented 7 years ago

I have been reading the FaceNet paper very closely. The authors claim that the network is strong enough that a tight crop over the face region is sufficient, with no need for rotation or any other transformation. That is why the alignment code produces a tight crop rather than rotating the faces. However, performance seems to increase if alignment is also performed. Please correct me if I am wrong, @davidsandberg.

davidsandberg commented 7 years ago

@pavankumarkatakam: Currently I have no plans to make a TensorFlow-only version of MTCNN. @achbogga: I agree. I have tried training with some rotations as extra augmentation, and it looked like that could improve the results very slightly, but it was a bit inconclusive. I haven't tried a more intricate alignment of the test set only, but that could be interesting to try.