Closed ffletcherr closed 1 year ago
Shouldn't the gaze vector be different if the camera is placed in a different position?
You are right, that was my mistake. I thought the output of the network was [pitch, yaw] in the Head Coordinate System (HCS), but it is actually in the normalized Camera Coordinate System (CCS).
So to get the gaze vector in the real CCS, the denormalization method can be used. After this step, we have face.center and face.gaze_vector (with its origin at face.center), both in the real CCS.
Finally, to calculate the Point of Gaze (PoG) in the CCS, we must find the intersection of the ray from face.center along face.gaze_vector with the plane whose normal vector is [0, 0, 1], i.e. the camera plane (z = 0).
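The two steps above (denormalizing the predicted angles, then intersecting the gaze ray with the camera plane) can be sketched as follows. This is not the repository's exact API: the function names, the `normalizing_rot` argument, and the pitch/yaw-to-vector sign convention (gaze pointing toward the camera along −z) are assumptions for illustration.

```python
import numpy as np

def denormalize_gaze(pitch_yaw, normalizing_rot):
    """Convert the network's [pitch, yaw] (in the normalized CCS) to a 3D
    unit vector in the real CCS by undoing the normalizing rotation.
    Assumed angle convention: g = [-cos(p)sin(y), -sin(p), -cos(p)cos(y)]."""
    pitch, yaw = pitch_yaw
    g_norm = np.array([
        -np.cos(pitch) * np.sin(yaw),
        -np.sin(pitch),
        -np.cos(pitch) * np.cos(yaw),
    ])
    # The inverse of a rotation matrix is its transpose.
    return normalizing_rot.T @ g_norm

def point_of_gaze(face_center, gaze_vector):
    """Intersect the ray face_center + t * gaze_vector with the camera
    plane z = 0 (plane normal [0, 0, 1])."""
    t = -face_center[2] / gaze_vector[2]
    return face_center + t * gaze_vector

# Example with made-up values: a face 0.6 m in front of the camera,
# looking slightly to the side.
face_center = np.array([0.0, 0.0, 0.6])
gaze = denormalize_gaze([0.0, 0.1], np.eye(3))
pog = point_of_gaze(face_center, gaze)  # lies on the z = 0 plane
```

Note that `point_of_gaze` assumes the gaze ray is not parallel to the camera plane (`gaze_vector[2] != 0`); in practice the subject faces the camera, so the z component is negative.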
Hi, thanks for this great paper and dataset, and also for all of your previous valuable work in the field of appearance-based gaze estimation.
I recently tried to use the raw output of the network, which is trained on the ETH-XGaze dataset, to estimate the PoG (Point of Gaze) in the CCS (Camera Coordinate System). So I used your normalization method and found the normalizing rotation matrix to transform the normalized gaze vector, which I assumed was in the HCS, into the 3D gaze vector in the CCS.
But it seems that pitch and yaw are not in the HCS, because when everything is kept fixed except the camera position, the network output changes. So if that is correct and pitch and yaw are not in the HCS, we need an extra step beyond the normalizing rotation matrix, one that compensates for the head pose. But I can't find this step, and it is unclear to me.
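The observation above is consistent with the resolution later in the thread: the normalizing rotation matrix depends on the face-to-camera geometry, so moving the camera changes the matrix and therefore the normalized [pitch, yaw], even for the same physical gaze direction. A small sketch of this effect, with a made-up rotation and an assumed angle convention (not the repository's exact code):

```python
import numpy as np

def vector_to_pitch_yaw(g):
    """Inverse of the assumed convention
    g = [-cos(p)sin(y), -sin(p), -cos(p)cos(y)] for a unit gaze vector."""
    g = g / np.linalg.norm(g)
    pitch = np.arcsin(-g[1])
    yaw = np.arctan2(-g[0], -g[2])
    return pitch, yaw

# The same physical gaze direction in the real CCS...
g_ccs = np.array([0.0, 0.0, -1.0])

# ...viewed under two different normalizing rotations (hypothetical values;
# the rotation depends on where the camera sits relative to the face).
R_a = np.eye(3)
theta = np.deg2rad(10.0)
R_b = np.array([[np.cos(theta), 0.0, np.sin(theta)],
                [0.0, 1.0, 0.0],
                [-np.sin(theta), 0.0, np.cos(theta)]])

pitch_a, yaw_a = vector_to_pitch_yaw(R_a @ g_ccs)
pitch_b, yaw_b = vector_to_pitch_yaw(R_b @ g_ccs)
# The two yaw values differ, even though the gaze itself is identical,
# which is why the raw network output is camera-dependent.
```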