yihuacheng / GazeTR

The code and models for 'Gaze Estimation using Transformer, ICPR2022'.

pitch yaw & gaze3d #5

Open brianw0924 opened 2 years ago

brianw0924 commented 2 years ago

I think many gaze-estimation works misuse these terms.

  1. Yaw and pitch are not the same as spherical coordinates, so your function is wrong.
  2. Most datasets' gaze3d labels are given in the camera coordinate system, so you can't just convert your output to 3D and then compute the arccos.
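The second point amounts to making sure both vectors are in the same coordinate frame before taking the arccos. A minimal sketch of that check, assuming a hypothetical rotation matrix `R` that maps the model's output frame to the camera frame (the names `angular_error_deg`, `rotate`, and `R` are illustrative and not taken from this repository):

```python
import math


def angular_error_deg(a, b):
    """Angle in degrees between two 3D vectors expressed in the SAME frame."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Clamp to [-1, 1] so rounding noise cannot push acos out of its domain.
    cos_sim = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_sim))


def rotate(R, v):
    """Apply a 3x3 rotation matrix (nested lists, row-major) to a 3D vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


if __name__ == "__main__":
    # Hypothetical values: a prediction in the model's own frame and a label
    # in the camera frame, related by a 90-degree rotation about the y axis.
    R = [[0, 0, 1], [0, 1, 0], [-1, 0, 0]]
    pred_model_frame = [0, 0, -1]
    label_camera_frame = [-1, 0, 0]

    # Comparing across frames reports an error that is only a frame mismatch.
    print(angular_error_deg(pred_model_frame, label_camera_frame))
    # Rotating the prediction into the camera frame first removes it.
    print(angular_error_deg(rotate(R, pred_model_frame), label_camera_frame))
```

Whether a rotation is needed at all, and where it comes from, depends on how a given dataset was preprocessed, so treat `R` as a placeholder.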

ffletcherr commented 2 years ago

@brianw0924 I think you are right. It seems most studies skip some important steps. Do you have any idea how to correctly estimate the Point of Gaze (PoG) from [pitch, yaw + head position + face center]?

I tried intersecting a line (3d_gaze_vector) with a plane (the camera's 2D plane, z=0) for ETH-XGaze, but the result is wrong for non-zero head poses, and the error increases exponentially with head yaw.
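For reference, the line and plane intersection described here can be sketched as follows. This assumes the gaze origin (for example the face center) and the gaze direction are both already expressed in camera coordinates, with the screen lying in the plane z=0; the function name and the sample numbers are hypothetical, and the sketch does not by itself explain the head-pose error mentioned above.

```python
def intersect_gaze_with_z0(origin, direction, eps=1e-9):
    """Intersect the ray origin + t * direction (t >= 0) with the plane z = 0.

    Both arguments are 3D tuples in the same (camera) coordinate frame.
    Returns the (x, y) hit point, or None if the ray never reaches the plane.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    if abs(dz) < eps:
        return None  # ray is parallel to the plane
    t = -oz / dz  # solve oz + t * dz = 0
    if t < 0:
        return None  # plane lies behind the gaze origin
    return (ox + t * dx, oy + t * dy)


if __name__ == "__main__":
    # Hypothetical example: face 600 mm in front of the camera plane,
    # looking back toward it and slightly to one side.
    print(intersect_gaze_with_z0((0.0, 0.0, 600.0), (0.5, 0.0, -1.0)))
```

If the gaze vector comes out of the model in a normalized or head-aligned frame, it would have to be rotated into the camera frame before being passed in as `direction`.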

fabawi commented 2 years ago

The models (at least in the case of gaze360) produce θ and φ in the spherical coordinate system. You can translate that to the gaze vector in the eye coordinate system (the ground-truth g_x, g_y, g_z; there are more details on the conversion from the camera's Cartesian coordinates to this vector in section 3.1 of the gaze360 paper) by:

  g_x = cos(φ) sin(θ)
  g_y = sin(φ)
  g_z = cos(φ) cos(θ)

or the other way around:

  θ = −arctan(g_x/g_z)
  φ = arcsin(g_y)
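A small round-trip sketch of this conversion, using only the formulas quoted above (function names are illustrative). One thing it makes visible: with g_z = cos(φ) cos(θ) exactly as written, the inverse that round-trips is θ = arctan(g_x/g_z) with no leading minus sign. The minus sign would match a convention where g_z = −cos(φ) cos(θ), so which one applies to a given dataset is worth checking against its paper or code.

```python
import math


def angles_to_vector(theta, phi):
    """Spherical angles (radians) to a unit gaze vector, as written above."""
    return (
        math.cos(phi) * math.sin(theta),  # g_x
        math.sin(phi),                    # g_y
        math.cos(phi) * math.cos(theta),  # g_z
    )


def vector_to_angles(g):
    """Inverse of angles_to_vector for the sign convention used there."""
    gx, gy, gz = g
    norm = math.sqrt(gx * gx + gy * gy + gz * gz)
    gx, gy, gz = gx / norm, gy / norm, gz / norm
    # atan2 keeps the correct quadrant, unlike a plain arctan of the ratio.
    theta = math.atan2(gx, gz)
    phi = math.asin(max(-1.0, min(1.0, gy)))
    return theta, phi


if __name__ == "__main__":
    theta, phi = 0.3, -0.2
    g = angles_to_vector(theta, phi)
    print(g, vector_to_angles(g))
```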

Rao2000 commented 2 years ago

@fabawi Hi, fabawi. I think θ = pitch and φ = yaw, so your formula may be incorrect because the symbols are swapped.