qiexing / face-landmark-localization

CNN network that predicts face landmarks (68 points) and head pose (3D pose: yaw, roll, pitch).

How was the model trained? #5

Open affromero opened 8 years ago

affromero commented 8 years ago

Nice model you have got! Thanks for sharing it.

I wonder what kind of data you fed into the network for training. What database did you use to get so many keypoints and also the 3D pose?

Since you use a face detector, do you perform face cropping before feeding the data into the network?

Could I get more information about the training stage? Would it work for a VGG network?

One last thing: why did you transform the prediction like this: predictpoints = predictpoints * vgg_height/2 + vgg_width/2? How were the labels normalized?

qiexing commented 8 years ago
  1. I use a piece of software called IntraFace; it takes 49 points (selected from the 68 ground-truth points) and can generate the 3D pose.
  2. The face detector is dlib; I put the cropped face (resized to 224x224) into the network.
  3. The label normalization maps the labels into the range (-1, 1), via (x - 224/2) / (224/2) (as 224 is the width).
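
The normalization in (3), and the de-normalization asked about above (predictpoints * vgg_height/2 + vgg_width/2), can be sketched as follows. The function names and the use of NumPy are my own illustration, not the repo's code:

```python
import numpy as np

VGG_SIZE = 224  # network input width/height, per the thread

def normalize_landmarks(landmarks):
    """Map pixel coordinates in [0, 224] to roughly (-1, 1)."""
    return (landmarks - VGG_SIZE / 2) / (VGG_SIZE / 2)

def denormalize_landmarks(predictions):
    """Inverse mapping: predictpoints * 224/2 + 224/2, as in the question above."""
    return predictions * (VGG_SIZE / 2) + VGG_SIZE / 2

pts = np.array([[0.0, 112.0], [224.0, 56.0]])
norm = normalize_landmarks(pts)      # [[-1, 0], [1, -0.5]]
back = denormalize_landmarks(norm)   # recovers the original pixel coordinates
```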
Magicyu-2015 commented 8 years ago

I used the software called IntraFace, but it only takes 12 facial landmarks. The GUI version of IntraFace is not able to take 49 points. Which version of IntraFace did you use in your training stage?

qiexing commented 8 years ago

This is the intraface I used, you can download from here: https://pan.baidu.com/s/1jIHFSQa.

Magicyu-2015 commented 7 years ago

IntraFace can only take 49 points. How do you process your training data for detecting 68 points?

qiexing commented 7 years ago

Select 49 points from the 68 points; you can look at the IntraFace code.

Magicyu-2015 commented 7 years ago

I have looked at the IntraFace code. It defines an object called fa via 'INTRAFACE::FaceAlignment fa(detectionModel, detectionModel, &xxd);'. I see the code 'fa.Detect(frame,*max_element(faces.begin(),faces.end(),compareRect),X0,score) != INTRAFACE::IF_OK' and I printed the size of X0: the result is 49. How do I change this code to generate 68 points?

qiexing commented 7 years ago

@Magicyu-1990 IntraFace predicts 49 facial points and uses these 49 points to generate the 3D head pose. I use IntraFace only to generate the 3D head-pose ground-truth labels. I use the 68-point labels from iBUG: http://ibug.doc.ic.ac.uk/resources/facial-point-annotations/

Magicyu-2015 commented 7 years ago

Hi! I used IntraFace to generate 3D head-pose ground-truth labels for the 300-W face dataset, but I got some errors: for some pictures in the 300-W dataset, IntraFace cannot generate the 3D head pose. How do you deal with this problem?

qiexing commented 7 years ago

@Magicyu-1990 IntraFace could not generate the 3D head pose because the OpenCV face detection failed. I ignore these failed images.

Magicyu-2015 commented 7 years ago

@qiexing I am confused. If I ignore these failed images, I worry about the number of pictures in the training set. The 300-W face dataset has only 300 indoor pictures and 300 outdoor pictures; training this network on 300-W alone will overfit. How do you tackle this problem?

Magicyu-2015 commented 7 years ago

@qiexing I use data augmentation to enlarge the 300-W face dataset with rotation, shifting, and blur. How do you handle the keypoint labels under image rotation?

qiexing commented 7 years ago

@Magicyu-1990 The 300-W data is 3000+ images. If you rotate or shift the images, the positions of the landmarks should be transformed accordingly. I use data augmentation to enlarge the dataset to 30000+ training images.

XiaXuehai commented 7 years ago

@qiexing The 300-W data is 3000+? Does it include datasets like Helen, LFPW, etc.? After unzipping 300-W, I got only 300 indoor pictures and 300 outdoor pictures.

Magicyu-2015 commented 7 years ago

@qiexing I used a face detector to detect the face and crop the face area to 224*224. But how can I deal with the facial-landmark labels from the original image? The image was cropped to a smaller size, so the landmark labels no longer correspond to the cropped image. Can you share your experience with this?

qiexing commented 7 years ago

@XiaXuehai Yes, I used LFPW, Helen, AFW, and iBUG, all annotated with 68 points.

qiexing commented 7 years ago

@Magicyu-1990 For example, if the face bbox is (20, 30, 100, 110), where (20, 30) is the top-left point and (100, 110) is the bottom-right point, and the original landmark is (45, 55), then in the cropped image it should be (45-20, 55-30). I think the best way to check whether the transformed landmarks are correct is to display the cropped image and draw the transformed landmarks on it.
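
The bbox example above, plus the scaling needed when the crop is resized to 224x224, can be sketched like this. The bbox format (x1, y1, x2, y2) and the function name are my own illustration, not the repo's API:

```python
import numpy as np

def landmarks_to_crop(landmarks, bbox, out_size=224):
    """Shift landmarks into crop coordinates, then scale for the resize to out_size."""
    x1, y1, x2, y2 = bbox
    shifted = landmarks - np.array([x1, y1], dtype=float)   # translate by crop origin
    scale = np.array([out_size / (x2 - x1), out_size / (y2 - y1)])  # resize factor
    return shifted * scale

bbox = (20, 30, 100, 110)           # an 80x80 crop, as in the comment above
pt = np.array([[45.0, 55.0]])
print(landmarks_to_crop(pt, bbox))  # [[70. 70.]] after the 224/80 resize
```

Drawing the returned points on the resized crop, as suggested above, is the quickest sanity check.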

XiaXuehai commented 7 years ago

@qiexing I have a question. Is it correct to use IntraFace's face landmarks and pose as ground truth? And if we can get the pose correctly with IntraFace, why do we still use a CNN?

Magicyu-2015 commented 7 years ago

@qiexing Why do you normalize the ground truth of the facial landmarks to the range (-1, 1)? I have seen your training code; your network has two loss functions. Did you use multi-task learning in your training?

qiexing commented 7 years ago

@Magicyu-1990 It's a normalization that helps the network learn the landmark positions. Yes, it's multi-task learning.
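
A minimal sketch of such a two-loss, multi-task objective: one L2 loss on the 68 normalized landmarks and one on the 3 pose angles. The L2 form and the pose_weight value are assumptions for illustration; the repo's actual losses and weighting may differ:

```python
import numpy as np

def multitask_loss(pred_pts, true_pts, pred_pose, true_pose, pose_weight=1.0):
    """Sum of a landmark regression loss and a weighted head-pose regression loss."""
    landmark_loss = np.mean((pred_pts - true_pts) ** 2)   # over 68 (x, y) pairs
    pose_loss = np.mean((pred_pose - true_pose) ** 2)     # over (yaw, roll, pitch)
    return landmark_loss + pose_weight * pose_loss
```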

sunmiaobo commented 7 years ago

How do I use IntraFace? Can it run on Ubuntu? Is there a corresponding document?

qiexing commented 7 years ago

@beita-lab, do you mean the IntraFace documentation, or something else?

Magicyu-2015 commented 7 years ago

@qiexing I see your code transforms the pose prediction like this: predictpose[i] = pose_prediction * 50. When you prepared your training data, how did you normalize the head-pose labels obtained from IntraFace?

qiexing commented 7 years ago

@Magicyu-1990 Most of the head poses detected by IntraFace range in (-50, 50) degrees. I normalize them so that they range in (-1, 1).
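
The pose scaling described above, and its inverse (predictpose[i] = pose_prediction * 50, as quoted from the code), can be sketched as follows; the function names are illustrative:

```python
POSE_RANGE = 50.0  # degrees; per the comment above, most IntraFace poses fall in (-50, 50)

def normalize_pose(angles_deg):
    """Map (yaw, roll, pitch) in degrees to roughly (-1, 1) for training."""
    return [a / POSE_RANGE for a in angles_deg]

def denormalize_pose(predictions):
    """Inverse mapping at inference time: pose_prediction * 50."""
    return [p * POSE_RANGE for p in predictions]

print(normalize_pose([50.0, -25.0, 0.0]))  # [1.0, -0.5, 0.0]
```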

Magicyu-2015 commented 7 years ago

@qiexing Thank you for your kind help! If I do random rotations on images with angles in (-5, 5) or (-10, 10) degrees, how do I calculate the corresponding landmarks on the rotated image?

qiexing commented 7 years ago

@Magicyu-1990 Apply the corresponding rotation to the landmarks. Actually, image rotation is point rotation plus interpolation.
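
Rotating the landmarks along with the image can be sketched as below. This assumes rotation about the image center with the cv2.getRotationMatrix2D sign convention (positive angle, y-axis pointing down); the function name is my own:

```python
import numpy as np

def rotate_landmarks(landmarks, angle_deg, center):
    """Rotate (x, y) landmarks by angle_deg around center, matching an image rotation."""
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    # Same 2x2 rotation block as cv2.getRotationMatrix2D: [[cos, sin], [-sin, cos]]
    rot = np.array([[c, s], [-s, c]])
    centered = landmarks - center          # move rotation center to the origin
    return centered @ rot.T + center       # rotate, then move back

pts = np.array([[1.0, 0.0]])
print(rotate_landmarks(pts, 90.0, np.array([0.0, 0.0])))  # ~[[0., -1.]]
```

After warping the image (e.g. with cv2.warpAffine using the same matrix), drawing the rotated points on it is a quick way to verify the transform, as suggested earlier in the thread.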

tanghy2016 commented 5 years ago

select 49 points from 68 points, you can see the intraface code.

Hello, how are these 49 points selected? The 49 points produced by DemoDetector are not entirely contained in the 68 points. I tried feeding all 68 points into EstimateHeadPose, and the three resulting angles looked wrong; after removing the 17 outer-contour points from the 68, the three angles still looked wrong.

qiexing commented 5 years ago

select 49 points from 68 points, you can see the intraface code.

Hello, how are these 49 points selected? The 49 points produced by DemoDetector are not entirely contained in the 68 points. I tried feeding all 68 points into EstimateHeadPose, and the three resulting angles looked wrong; after removing the 17 outer-contour points from the 68, the three angles still looked wrong.

Hello, as I recall, I first used a 68-point tool to generate the 68 points for each image, and then selected the 49 points inside the face. You can refer to the IntraFace demo.