TadasBaltrusaitis / OpenFace

OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

dynamic vs. static AU prediction #940

Closed jckoe closed 3 years ago

jckoe commented 3 years ago

Hi Tadas,

I am using OpenFace for my PhD research project on naturalistic social interactions and am positively impressed by its accuracy.

You mentioned in the wiki that dynamic AU prediction might be harmful for video sequences where the same expression is held. Could you elaborate why this is the case?

Also, what would the disadvantage be of using static AU prediction in video sequences?

Best regards, Jana

TadasBaltrusaitis commented 3 years ago

The calibration step assumes that the most common expression in the sequence is neutral or close to it (which is generally true, as people are non-expressive more often than expressive). It uses this assumption to compute a "neutral frame", which is then subtracted from all other frames before prediction. If a person holds the same expression throughout, the algorithm will mistake that expression for neutral, which harms performance.
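To make the failure mode concrete, here is a minimal sketch of the idea, not OpenFace's actual code. It assumes the "neutral frame" is estimated as the per-feature median over the sequence. The function names and the two-element feature vectors are hypothetical and chosen only for illustration.

```python
from statistics import median

def estimate_neutral(frames):
    """Assumed estimator: per-feature median over all frames in the sequence."""
    return [median(column) for column in zip(*frames)]

def subtract_neutral(frames):
    """Subtract the estimated neutral frame from every frame."""
    neutral = estimate_neutral(frames)
    return [[value - n for value, n in zip(frame, neutral)] for frame in frames]

# Hypothetical feature vectors: neutral face = [1.0, 2.0], expression = [3.0, 5.0].
mostly_neutral = [[1.0, 2.0], [1.0, 2.0], [3.0, 5.0], [1.0, 2.0], [1.0, 2.0]]
held_expression = [[3.0, 5.0]] * 5  # the expression never changes

normalized_brief = subtract_neutral(mostly_neutral)
normalized_held = subtract_neutral(held_expression)

print(normalized_brief[2])  # the expressive frame stands out from the baseline
print(normalized_held[2])   # the held expression is cancelled out entirely
```

In the mostly neutral sequence the expressive frame keeps a clear offset after subtraction. In the held sequence every frame becomes all zeros, so the predictor sees nothing that distinguishes the expression from a neutral face.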

The disadvantage of static prediction is that everyone expresses AUs slightly differently, so having a neutral expression to subtract is very beneficial for predicting certain AUs (but not all). For more details, see Table 10 in the OpenFace paper, which compares the performance of static and dynamic models.
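A toy illustration of why a person-specific baseline can help, under loud assumptions: it uses a single made-up "brow height" feature and simple thresholds, whereas OpenFace's real AU models are learned and considerably more involved. All names and numbers below are hypothetical.

```python
# Hypothetical raw "brow height" values for two people.
# Person B's brows naturally sit lower, even at rest.
person_a = {"neutral": 5.0, "brow_lowered": 3.0}
person_b = {"neutral": 3.0, "brow_lowered": 1.0}

THRESHOLD_RAW = 4.0     # static rule: raw value below this means "AU active"
THRESHOLD_DELTA = -1.0  # dynamic rule: drop of more than 1.0 from neutral means "AU active"

def static_active(raw):
    """Static prediction: looks only at the raw feature value."""
    return raw < THRESHOLD_RAW

def dynamic_active(raw, neutral):
    """Dynamic prediction: looks at the change relative to the person's neutral."""
    return (raw - neutral) < THRESHOLD_DELTA

# Static rule misfires on person B's resting face.
print(static_active(person_a["neutral"]), static_active(person_b["neutral"]))

# Dynamic rule treats both resting faces as inactive and both lowered brows as active.
for person in (person_a, person_b):
    print(dynamic_active(person["neutral"], person["neutral"]),
          dynamic_active(person["brow_lowered"], person["neutral"]))
```

The static rule flags person B's resting face as active because B's baseline differs from A's. Subtracting each person's own neutral removes that offset, which is the benefit described above, at the cost of depending on a correct neutral estimate.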