xucong-zhang / ETH-XGaze

Official implementation of ETH-XGaze dataset baseline
Few questions about the dataset (gaze, pose) #23

Closed jasony93 closed 1 year ago

jasony93 commented 1 year ago

Hello, I am trying to understand the data structure of the ETH-XGaze dataset. In 'OnePersonDataset', three values are returned when it is called: image, pose, and gaze. It seems that gaze is a combination of pitch and yaw, both in radians (please correct me if I am wrong). I am a little confused about what role pose plays during training. If pose represents the direction the face is pointing, how can it be defined with a single number (unlike gaze, which has both pitch and yaw)?
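To make the pitch/yaw reading of gaze concrete, here is a minimal sketch of how two angles in radians can describe a full 3D direction. The function names `angles_to_direction` and `direction_to_angles` are hypothetical, and the axis and sign convention is an assumption for illustration only; it is not confirmed to match the convention used by the dataset.

```python
import math


def angles_to_direction(pitch, yaw):
    """Convert (pitch, yaw) in radians to a unit 3D direction vector.

    Assumed convention (illustrative only, may differ from the dataset):
    pitch rotates up/down, yaw rotates left/right, and (0, 0) maps to +z.
    """
    x = math.cos(pitch) * math.sin(yaw)
    y = math.sin(pitch)
    z = math.cos(pitch) * math.cos(yaw)
    return (x, y, z)


def direction_to_angles(vector):
    """Inverse of angles_to_direction under the same assumed convention."""
    x, y, z = vector
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    pitch = math.asin(y)      # in [-pi/2, pi/2]
    yaw = math.atan2(x, z)    # in (-pi, pi]
    return (pitch, yaw)


if __name__ == "__main__":
    direction = angles_to_direction(0.1, -0.3)
    print(direction)
    print(direction_to_angles(direction))
```

Under this assumption, two angles are enough to pin down a direction because the vector is constrained to unit length, which is why a single number on its own would not be expected to describe a direction.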