TadasBaltrusaitis / OpenFace

OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

Map gaze and head vectors <-> Screen pose #945

Closed paulSerre closed 3 years ago

paulSerre commented 3 years ago

Hi everyone,

I think you've already understood what I want to do. My goal is to estimate the point on the screen the user is looking at, using only the webcam (i.e. a gaze tracker). Two main questions:

1) For my personal knowledge: is the head pose estimated from the 3D landmarks (a geometric model) or by a neural network (an appearance-based model)? I ask because the great majority of gaze trackers either don't take head position into account or handle it with a geometric model. I want to account for it, but with a geometric model based on facial landmarks the estimate seems too sensitive to accessories such as glasses.

2) Is the head pose already included in the gaze vector, or should I include it in my model myself?

Thanks for your answers.

bryb31 commented 3 years ago

Hi paulSerre, hi everyone, I was wondering if you found a solution to this, as we have the same problem. We have a basic calibration to find the corners of the screen viewed by participants, but we are having trouble modifying the transformation to account for viewers' movements. Any and all advice appreciated.

paulSerre commented 3 years ago

I simply used a least-squares method: a regression with sklearn, with features created via polynomial features. For the noise, I applied a rolling window of size 3, which improved the results a lot. With 9 calibration points for training and 9 for testing, I had an average error of 90 px. However, my model doesn't take head movements into account yet; I didn't manage to do that.
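For concreteness, the approach described above (polynomial regression with sklearn, then a rolling window of size 3 to smooth predictions) could look roughly like this. This is a minimal sketch, not paulSerre's actual code: the feature layout and the random placeholder data are assumptions, and in practice `X` would hold OpenFace gaze features per frame and `y` the calibration-target screen coordinates in pixels.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Placeholder calibration data: 9 points, 2 gaze features -> 2 screen coords (px)
X_train = rng.random((9, 2))
y_train = rng.random((9, 2)) * 1000

# Polynomial feature expansion followed by ordinary least squares
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X_train, y_train)

# Predict on a stream of frames, then smooth with a rolling mean (window = 3)
X_stream = rng.random((30, 2))
pred = model.predict(X_stream)

def rolling_mean(a, window=3):
    """Average each column over a sliding window to reduce frame-to-frame jitter."""
    kernel = np.ones(window) / window
    return np.column_stack([np.convolve(a[:, i], kernel, mode="valid")
                            for i in range(a.shape[1])])

smoothed = rolling_mean(pred, window=3)
```

The rolling mean trades a small amount of latency (the window length) for a large reduction in jitter, which is why it helps the reported pixel error.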

AndyIMac commented 3 years ago

Hi paulSerre, I'm a colleague of bryb31. We're trying to fit an affine transformation from the gaze vectors recorded while participants look at the corners of the screen, so we can map their gaze to normalized screen coordinates, say between -1 and +1. We've tried adding arctan(pose x/z) and arctan(pose y/z) terms to account for head position. This seemed to improve some of the results, but not others. We're stuck trying to work out whether head rotations also affect the gaze vectors (I think this might be why some of them didn't improve), and so whether we need to transform the gaze vectors using the pose rotations first. Any advice you've got would be much appreciated.
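The arctan correction described above can be sketched as follows. This is only an illustration of the idea under stated assumptions: the argument names are hypothetical (not OpenFace's output column names), the head translation is taken relative to the camera, and `arctan2(x, z)` is used so the sign is handled correctly for heads offset to either side.

```python
import numpy as np

def corrected_gaze_angles(gaze_x, gaze_y, pose_x, pose_y, pose_z):
    """Offset gaze angles (radians) by the head's angular position
    relative to the camera axis: arctan(pose/z) per axis."""
    return (gaze_x + np.arctan2(pose_x, pose_z),
            gaze_y + np.arctan2(pose_y, pose_z))

# Example: head 100 mm to the right of the camera at 500 mm distance,
# eyes looking straight along the camera axis
gx, gy = corrected_gaze_angles(0.0, 0.0, 100.0, 0.0, 500.0)
```

Whether this is sufficient depends on the coordinate frame of the gaze vectors: if they are expressed in camera coordinates (so head rotation is already baked in), an angular offset for head translation may be all that's needed, but if they are relative to the head, the vectors would first have to be rotated by the head-pose rotation before any screen mapping.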