xucong-zhang / ETH-XGaze

Official implementation of ETH-XGaze dataset baseline

How did you get the groundtruth gaze direction? #16

Open zdw-qingdao opened 2 years ago

zdw-qingdao commented 2 years ago

After reading your paper, I still don't know how you obtained the ground-truth gaze direction when collecting the data. Could you explain this? Thanks!

xucong-zhang commented 2 years ago

Hi,

I updated our project webpage and you can find a short video about the data collection.

I hope it is helpful for you.

Kind regards, Xucong

zdw-qingdao commented 2 years ago

I guess you used multi-view stereo to obtain the 3D position of the eye. The transformation between screen and camera is calibrated, so you can then calculate the gaze direction. Am I right? Thanks!

xucong-zhang commented 2 years ago

Hi, we did not use multi-view stereo to get the 3D position of the eye, although that would be optimal. Instead, we fit a 3D face model (landmarks) to the detected 2D landmarks to get the 3D positions of the eye and mouth landmarks. This part is described in the paper. You are right about the screen and camera calibration part: that is how we get the 3D gaze target position.
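Once the 3D eye center and the 3D gaze target are both expressed in camera coordinates, the ground-truth gaze is just the normalized vector between them, often stored as (pitch, yaw) angles. A minimal sketch (the helper names are illustrative, not the authors' code; the pitch/yaw convention shown is the one commonly used in gaze-estimation code, with the camera looking along -z):

```python
import numpy as np

def gaze_direction(eye_center, target):
    """Unit gaze vector from the 3D eye center to the 3D gaze target,
    both given in camera coordinates (illustrative helper)."""
    g = np.asarray(target, dtype=float) - np.asarray(eye_center, dtype=float)
    return g / np.linalg.norm(g)

def vector_to_pitchyaw(g):
    """Convert a unit gaze vector to (pitch, yaw) in radians.
    Convention (assumption): y points down, camera looks along -z."""
    pitch = np.arcsin(-g[1])        # vertical angle
    yaw = np.arctan2(-g[0], -g[2])  # horizontal angle
    return pitch, yaw

# Example: eye at the camera origin, target 1 m straight ahead (mm units)
g = gaze_direction([0.0, 0.0, 0.0], [0.0, 0.0, -1000.0])
print(vector_to_pitchyaw(g))  # looking straight ahead -> (0.0, 0.0)
```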

jxncyym commented 2 years ago

@xucong-zhang

  1. I am sorry, but I can't find how you obtained the gaze direction in your paper. Could you tell me where you describe the process?
  2. You said you use a 3D face model. Do you mean a generic 3D face model (landmarks), or a new 3D face model you trained yourself? Which 2D landmark detection model do you use? After you get the 2D face landmarks and 3D face landmarks, do you use the PnP method to get the 3D position of the eye? I also have a related question: which eye do you use to compute the gaze direction, the right eye or the left eye?
  3. I don't fully understand how to get the 3D gaze target position. Do you mean that you know the world coordinate of the target and then convert it to the camera coordinate system? Could you describe the process in detail?
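Regarding point 3, the usual approach is: the stimulus point is known in the screen's own coordinate system (x, y in millimetres on the screen plane, z = 0), and a one-time screen-to-camera calibration gives a rotation and translation that map it into camera coordinates. A sketch under that assumption (the extrinsics below are made-up placeholder values, not the dataset's calibration):

```python
import numpy as np

# Hypothetical screen-to-camera extrinsics from a calibration step;
# these numbers are placeholders, not values from ETH-XGaze.
R_screen_to_cam = np.eye(3)                       # rotation: screen frame -> camera frame
t_screen_to_cam = np.array([0.0, 100.0, -50.0])   # translation in mm

def target_in_camera(p_screen_mm):
    """Map a stimulus point on the screen plane (x, y in mm; z = 0 in
    the screen frame) into the camera coordinate system."""
    p = np.array([p_screen_mm[0], p_screen_mm[1], 0.0])
    return R_screen_to_cam @ p + t_screen_to_cam

print(target_in_camera((200.0, 150.0)))  # -> [200. 250. -50.]
```

With the target in camera coordinates and the eye position from the face-model fit, the gaze vector follows directly as their normalized difference.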