TadasBaltrusaitis / OpenFace

OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

Bug in gaze direction when looking down. #1031

Open Oscar-L-F opened 1 year ago

Oscar-L-F commented 1 year ago

I've been trying to use OpenFace 2.2.0 for Windows to study the gaze behaviour of subjects in front of a camera, but I've run into an error in the gaze_angle_y estimation that distorts my results a lot.

When the subject is looking down, the gaze_angle_y estimate suddenly jumps all the way up, as can be seen in the following screenshot: (screenshot: Capture2)

My goal is to detect when the subject is looking above the horizontal, so those errors interfere with my results significantly. I'm trying to find a way to detect those errors, or at least to get an idea of how often I can expect this kind of error.
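One way to get a rough count of these jumps is to flag frames where gaze_angle_y changes by more than some threshold between consecutive frames. The following is only a minimal sketch: the function names and the 0.35 rad threshold are illustrative assumptions, not part of OpenFace, and the threshold would need tuning against the frame rate and recordings at hand. It assumes gaze_angle_y is given in radians, as the OpenFace documentation describes.

```python
def find_gaze_jumps(gaze_angle_y, max_step_rad=0.35):
    """Return indices of frames whose gaze_angle_y differs from the
    previous frame by more than max_step_rad (hypothetical threshold)."""
    jumps = []
    for i in range(1, len(gaze_angle_y)):
        if abs(gaze_angle_y[i] - gaze_angle_y[i - 1]) > max_step_rad:
            jumps.append(i)
    return jumps


def jump_rate(gaze_angle_y, max_step_rad=0.35):
    """Fraction of frame-to-frame transitions flagged as jumps."""
    transitions = len(gaze_angle_y) - 1
    if transitions < 1:
        return 0.0
    return len(find_gaze_jumps(gaze_angle_y, max_step_rad)) / transitions


# Illustrative values only, not real OpenFace output.
series = [0.10, 0.15, 0.20, -0.60, -0.58, 0.20]
print(find_gaze_jumps(series))  # indices where a jump starts or ends
print(jump_rate(series))
```

A fast genuine eye movement also produces a large step, so flagged frames are candidates for inspection rather than confirmed errors.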

I've been trying to use AU45_c ("blink") to detect those errors, because I noticed that OpenFace counts "looking down" as a blink, but during those errors AU45_c is never 0. Therefore I cannot use it this way, as OpenFace actually interprets the situation as the subject looking up with open eyes. I'm also currently trying to use the vertical head angle (pose_Rx) to detect the error by identifying frames where the gaze is extremely high and the head very low, but it is very finicky.
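The head-angle cross-check described above could be sketched roughly as follows. Everything here apart from the column names gaze_angle_y and pose_Rx is an assumption: the function name, both thresholds, and especially the two sign parameters, which encode which direction counts as "gaze up" and "head down". Those signs should be verified against a recording where the true direction is known before relying on the result.

```python
def flag_gaze_head_mismatch(rows, gaze_up_min=0.3, head_down_min=0.2,
                            gaze_up_sign=-1.0, head_down_sign=1.0):
    """Return indices of rows where the gaze is reported far up while the
    head is pitched far down.

    rows: list of dicts with 'gaze_angle_y' and 'pose_Rx' (radians).
    gaze_up_sign / head_down_sign: ASSUMED sign conventions; flip them
    if your data shows the opposite orientation.
    """
    flagged = []
    for i, row in enumerate(rows):
        gaze_up = gaze_up_sign * float(row["gaze_angle_y"])
        head_down = head_down_sign * float(row["pose_Rx"])
        if gaze_up > gaze_up_min and head_down > head_down_min:
            flagged.append(i)
    return flagged


# Illustrative rows only, not real OpenFace output.
rows = [
    {"gaze_angle_y": "-0.5", "pose_Rx": "0.4"},  # gaze up, head down
    {"gaze_angle_y": "0.1", "pose_Rx": "0.4"},   # gaze near level
    {"gaze_angle_y": "-0.5", "pose_Rx": "0.0"},  # head level
]
print(flag_gaze_head_mismatch(rows))
```

Requiring both conditions at once keeps ordinary upward glances with a level head from being flagged, at the cost of missing errors that occur with only a slight head tilt.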

If this is a known error and someone has found a way to circumvent it, that would be great. Otherwise, I'd appreciate any data on the gaze estimation error in different situations, so that I can at least account for it in my results. I found in the OpenFace paper that the absolute mean error of the gaze estimation is 9.96 degrees, but there isn't much explanation of what kind of error is being measured or in which situations. I'd especially appreciate knowing more about how the vertical error compares to the horizontal error, or values for the gaze error at the extremes.

Best regards,

Oscar

brmarkus commented 1 year ago

We see this a lot. The trained model (including its pre- and post-processing) depends heavily on the camera used and on its point of view (POV), distance and position. The trained model is simply "biased" (e.g. in the estimates made from the landmarks, eyes, pupils and head pose). If your camera POV, distance or position differs a lot from what the model was trained on, then the model could estimate the angles slightly off. I'm wondering how robust the model is against you wearing that "hat", looking at landmarks like the eyebrows, where the eyes are "estimated".

Oscar-L-F commented 1 year ago

Is there some data that could help mitigate this? For example, what would need to be changed in the setup to make this happen less often, or how likely this error is in different situations?

Speaking of feature detection, the "glasses" worn by the subject are sometimes interpreted as eyes, as seen below:

(screenshot: Capture)

Right now I'm mostly using the confidence value to get rid of these errors, since this mistake very rarely happens with a high confidence, but I'd love it if you also have info to share on this.
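For reference, the confidence filtering mentioned above might look something like this. It is a minimal sketch assuming the per-frame CSV contains `confidence` and `success` columns; the function name and the 0.9 cutoff are illustrative choices, not OpenFace defaults. Header names are stripped because the CSV may have spaces after the commas.

```python
import csv
import io


def load_reliable_frames(csv_file, min_confidence=0.9):
    """Keep only rows tracked successfully with confidence at or above
    min_confidence (hypothetical cutoff; tune for your data)."""
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    kept = []
    for row in reader:
        # Normalise keys and values in case of stray whitespace.
        row = {k.strip(): (v or "").strip()
               for k, v in row.items() if k is not None}
        tracked = int(float(row["success"])) == 1
        if tracked and float(row["confidence"]) >= min_confidence:
            kept.append(row)
    return kept


# Illustrative data only; a real file has many more columns.
sample = io.StringIO(
    "frame, confidence, success, gaze_angle_y\n"
    "1, 0.98, 1, 0.10\n"
    "2, 0.40, 1, -0.70\n"
    "3, 0.95, 0, 0.00\n"
)
for row in load_reliable_frames(sample):
    print(row["frame"], row["gaze_angle_y"])
```

With a real recording, the file would be opened with `open(path, newline="")` and passed in the same way.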

brmarkus commented 1 year ago

We retrained the models on our own (in the context of automotive driver monitoring), with well-known cameras (normal, black/white, infrared), positions and POVs, and lighting scenarios, so that they are (mostly) agnostic to persons wearing glasses, hair styles, wearing masks, looking over shoulders, looking into mirrors, and many more scenarios.