Training model with videos

Larumbergera commented 7 years ago

Hi Patrick,

I'm trying to estimate the tracking error of different algorithms. For that, I'm using two databases of my own: one with real faces (maybe you know it) and another with synthetic faces (generated with the 3D Basel Face Model) that mimic the movement of the real ones (this one is not published).

Using the Supervised Descent Model on the synthetic database, I get a tracking error in 3 videos with high roll angles (I attach the videos with the landmarks obtained by SDM here). It is worth pointing out that, without these problematic videos, SDM's tracking error is similar to IntraFace's tracking error.

I think this tracking error may be due to how the algorithm was trained. I have thought about retraining with videos that contain high roll angles. I also think it would be interesting to detect face landmarks using the previous frame's landmarks.

What do you think about training with videos?

If we train with synthetic faces, we know the true landmark positions in every frame. But the BFM's synthetic faces don't change their expression and don't close their eyes. So, maybe training with a mix of synthetic faces generated by your 3DMM (using different expressions) and real videos?

Best regards,

Andoni.

patrikhuber commented 7 years ago

Hi Andoni,

Sounds like a very interesting project!

I think your post is a bit confusing: I am not very clear in each case which "SDM" you exactly mean (I agree it's a bit ambiguous...). When do you mean the original SDM paper (or model) from Xiong & De la Torre, when do you mean my superviseddescent library, and when (if at all) do you mean one of my older pretrained models (and if you do use them, which one exactly)?

Are you purely evaluating facial landmark detection performance, or e.g. 3D head pose estimation as well? (And if so, how, since superviseddescent only detects landmarks, and not pose?)

I think correcting for roll angle might be quite easy. For example, when you're tracking, you can just take the estimate from the previous frame and correct for roll by rotating (in-plane) the image. So if you're just doing tracking, you don't need much roll in the training data, and you can correct for ±90° roll angle amazingly well just by rotating the image. I believe IntraFace does this. Of course, this doesn't work for still images, where there is no previous frame to take the estimate from.
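A minimal sketch of the roll correction described above, assuming the roll is estimated from the two eye centres of the previous frame's landmarks. The helper names (`roll_from_eyes`, `rotation_about`, `apply_affine`, `invert_affine`) are hypothetical and not part of superviseddescent. Only the geometry is shown: the 2x3 matrix has the layout that image-warping routines such as OpenCV's `cv2.warpAffine` expect, so in a real tracker it would be used to warp the current frame before running the detector.

```python
import numpy as np

def roll_from_eyes(left_eye, right_eye):
    """In-plane angle (radians) of the line between the eye centres.
    Image coordinates are assumed: x to the right, y downwards."""
    dx, dy = np.subtract(right_eye, left_eye)
    return np.arctan2(dy, dx)

def rotation_about(center, angle):
    """2x3 affine matrix that undoes a roll of `angle` around `center`."""
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, s], [-s, c]])           # rotation by -angle
    center = np.asarray(center, dtype=float)
    t = center - R @ center                   # keep `center` fixed
    return np.hstack([R, t[:, None]])

def apply_affine(M, points):
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    points = np.asarray(points, dtype=float)
    return points @ M[:, :2].T + M[:, 2]

def invert_affine(M):
    """Inverse of a rotation-plus-translation 2x3 matrix."""
    R_inv = M[:, :2].T                        # inverse of a rotation is its transpose
    t_inv = -R_inv @ M[:, 2]
    return np.hstack([R_inv, t_inv[:, None]])

# Example: eyes from the previous frame, rolled by 60 degrees.
left = np.array([100.0, 100.0])
right = left + 50.0 * np.array([np.cos(np.pi / 3), np.sin(np.pi / 3)])

angle = roll_from_eyes(left, right)
M = rotation_about((left + right) / 2.0, angle)

# In the upright frame the eyes lie on a horizontal line.
upright = apply_affine(M, [left, right])

# Landmarks detected in the upright image are mapped back to the
# original frame with the inverse transform.
restored = apply_affine(invert_affine(M), upright)
```

The detector then only ever sees near-upright faces, which is why little roll is needed in the training data when tracking.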

The video you uploaded, is that with landmark detection from the superviseddescent library, and if yes, what model?

> What do you think about training with videos?

Probably a good idea; usually, the more data, the better.

> training with a mix of synthetic faces generated by your 3DMM (using different expressions) and real videos

We (or rather my colleague Zhenhua Feng) have experimented with real & synthetic data quite a lot - he would be the person to ask. My experience is that it can help but it can't replace real data.

I always highlight that facial landmark detection is actually not my main topic (and never has been). This superviseddescent library is more of a by-product. While the library is actually quite cool and the performance of the trained model quite good, I have in the past 1-2 years only focused on 3D model fitting (and the eos library), where we can work with landmarks from any source.

Cheers, Patrik

Larumbergera commented 7 years ago

> I am not very clear in each case which "SDM" you exactly mean

I mean your pretrained model (face_landmarks_model_rcr_68.bin).

> Are you purely evaluating facial landmark detection performance, or e.g. 3D head pose estimation as well?

I calculate 3D Head Pose (for each frame) from:

2D landmarks for that frame (background points in the picture).
3D landmarks (calculated previously from 2D landmarks and 3D model).
Camera parameters (focal length, principal point ...).
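The three inputs listed above are tied together by the pinhole camera model: a pose (rotation `R`, translation `t`) is plausible when the 3D landmarks, projected with the camera parameters, land close to the 2D landmarks. The sketch below only illustrates that relationship with NumPy. It is not the pose solver used in this thread, and the function names and the single shared focal length are assumptions made for the example.

```python
import numpy as np

def project_points(points_3d, R, t, focal_length, principal_point):
    """Project (N, 3) model points into the image with a pinhole camera.
    Assumes one focal length for both axes and no lens distortion."""
    pts = np.asarray(points_3d, dtype=float)
    cam = pts @ np.asarray(R, dtype=float).T + np.asarray(t, dtype=float)
    x = focal_length * cam[:, 0] / cam[:, 2] + principal_point[0]
    y = focal_length * cam[:, 1] / cam[:, 2] + principal_point[1]
    return np.stack([x, y], axis=1)

def reprojection_rmse(points_2d, points_3d, R, t, focal_length, principal_point):
    """Root-mean-square pixel distance between observed and projected landmarks."""
    projected = project_points(points_3d, R, t, focal_length, principal_point)
    diff = projected - np.asarray(points_2d, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))

# Toy example: two model points, camera looking straight at them from 10 units away.
model_points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 10.0])
image_points = project_points(model_points, R, t, 500.0, (320.0, 240.0))
error = reprojection_rmse(image_points, model_points, R, t, 500.0, (320.0, 240.0))
```

A pose solver searches for the `R` and `t` that make this reprojection error small for the landmarks of each frame.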

[Image: HPE]

This picture shows the head pose estimate (HPE) for the 120th frame of the video "user_04_video_04".

I can calculate the tracking error (I have the ground-truth head pose), but only with the synthetic database, since I have the true 3D model there. With real faces, I have to use 3D Morphable Face Model fitting (with your eos library), and this adds error.
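The thread does not say which error measure is used. As one possible illustration, assuming both the estimated and the ground-truth head pose are available as 3x3 rotation matrices, the angular difference between them can be computed as below. The function names are hypothetical.

```python
import numpy as np

def rotation_error_deg(R_est, R_gt):
    """Angle (degrees) of the rotation that takes the ground-truth
    orientation to the estimated one."""
    R_rel = np.asarray(R_est, dtype=float) @ np.asarray(R_gt, dtype=float).T
    # For a rotation matrix, trace = 1 + 2 * cos(angle); clip guards against round-off.
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)))

def roll_matrix(angle_deg):
    """Rotation about the camera's optical (z) axis, i.e. a pure roll."""
    a = np.radians(angle_deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])

# Example: the estimate is off by 20 degrees of roll.
error_deg = rotation_error_deg(roll_matrix(30.0), roll_matrix(10.0))
```

Averaging this value over all frames of a video gives one number per video. It measures the whole pipeline (landmarks plus pose solver), not the landmark detector alone.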

> you can just take the estimate from the previous frame and correct for roll by rotating (in-plane) the image

Good idea!

> The video you uploaded, is that with landmark detection from the superviseddescent library, and if yes, what model?

Yes, face_landmarks_model_rcr_68.bin (I don't show all landmarks).

> We (or rather my colleague Zhenhua Feng) have experimented with real & synthetic data quite a lot - he would be the person to ask. My experience is that it can help but it can't replace real data.

I'll ask him.

Thank you very much!

Regards,

Andoni.

patrikhuber commented 7 years ago

> I mean your pretrained model (face_landmarks_model_rcr_68.bin).

I see, cool! Really nice that it performs so well then! :-)

So you're calculating the head pose with OpenCV solvePnP? POSIT? Non-linear optimisation? And you're using this to measure the tracking error? I don't think that's a good idea: pose estimation with a perspective camera (as you do it) is nonlinear and can get stuck in local minima very easily, and it depends heavily on your choice of algorithm! (E.g. solvePnP would give very different results from doing the same with Ceres, and it also depends on the choice of parameters.) But this is just a comment, not really related to your question or this repository ;-)

Best wishes,

Patrik

Larumbergera commented 7 years ago

> So you're calculating the headpose with OpenCV solvePnP? POSIT? Non-linear optimisation? And you're using this to measure the tracking error?

I'm using POSIT. It seems to work well and does not seem to get stuck in local minima, but I will keep it in mind.

Thanks again,

Andoni.

patrikhuber / superviseddescent

Training model with videos #32