hysts / pytorch_mpiigaze

An unofficial PyTorch implementation of MPIIGaze and MPIIFaceGaze

Is the gaze angle relative or absolute for head pose? #7

Closed tjusxh closed 4 years ago

tjusxh commented 5 years ago

I preprocessed the data and I want to check it by hand. [image] The yaws are as follows: 0.06443524969441083, -0.0026611687925242036, 0.14232012996173368, 0.23620114918295404. From these four images and yaws, I think the angle is positive when looking left. But the following eyes look right. Their yaw angles are: [image]

0.11669328876754162, 0.09610517002850379, 0.1495013605805125

I am confused about the dataset. Can anyone help me? Thanks very much.

lucaskyle commented 5 years ago

Content: The dataset contains three parts: "Data", "Evaluation Subset" and "Annotation Subset".

The "Data'' folder includes "Original'' and "Normalized'' for all the 15 participants. You can also find the 6 points-based face model we used in this dataset.

The "Original'' folders are the cropped eye rectangle images with the detection results based on face detector [1] and facial landmark detector [2]. For each participants, the images and annotations are organized by days. For each day's folder, there are the image collected by that participants and corresponding "annotation.txt" files. The annotations includes:

Dimension 1~24: detected eye landmark positions in pixels, in whole-image coordinates.
Dimension 25~26: on-screen gaze target position in screen coordinates.
Dimension 27~29: 3D gaze target position relative to the camera.
Dimension 30~35: the estimated 3D head pose (rotation and translation) based on a 6-point 3D face model; we use the same 6-point 3D face model as [3], which includes the four eye corners and two mouth corners.
Dimension 36~38: the estimated 3D right eye center in the camera coordinate system.
Dimension 39~41: the estimated 3D left eye center in the camera coordinate system.
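For reference, here is a minimal sketch of how one row of such an annotation.txt could be split into named fields following the dimension layout above. The field names and the rotation-before-translation order inside dimensions 30~35 are my assumptions; verify them against the actual data.

```python
import numpy as np

def parse_annotation_row(line):
    """Split one annotation.txt row (41 values) into named fields."""
    v = np.asarray(line.split(), dtype=np.float64)
    assert v.size == 41
    return {
        'eye_landmarks': v[0:24].reshape(12, 2),  # dims 1~24, pixels
        'gaze_target_screen': v[24:26],           # dims 25~26, screen coords
        'gaze_target_3d': v[26:29],               # dims 27~29, camera coords
        'head_rotation': v[29:32],                # dims 30~35: assumed rotation
        'head_translation': v[32:35],             # first, then translation
        'right_eye_center_3d': v[35:38],          # dims 36~38
        'left_eye_center_3d': v[38:41],           # dims 39~41
    }
```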

Besides, there is also a "Calibration" folder for each participant, which contains:

Camera.mat: the intrinsic parameters of the laptop camera. "cameraMatrix": the projection matrix of the camera. "distCoeffs": the camera distortion coefficients. "retval": the root mean square (RMS) re-projection error. "rvecs": the rotation vectors. "tvecs": the translation vectors.

monitorPose.mat: the position of the image plane in camera coordinates. "rvecs": the rotation vectors. "tvecs": the translation vectors.

screenSize.mat: the laptop screen size. "height_pixel": the screen height in pixels. "width_pixel": the screen width in pixels. "height_mm": the screen height in millimeters. "width_mm": the screen width in millimeters.
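As a quick, hedged sketch, the calibration files can be read with scipy.io.loadmat and used to undistort a frame; the participant and image paths below are hypothetical, and the key names follow the description above.

```python
import cv2
import scipy.io

# Hypothetical participant path; key names as described above.
calib = scipy.io.loadmat('Data/Original/p00/Calibration/Camera.mat')
camera_matrix = calib['cameraMatrix']
dist_coeffs = calib['distCoeffs']

# Undistort a raw frame with the camera intrinsics.
img = cv2.imread('Data/Original/p00/day01/0001.jpg')  # hypothetical file
undistorted = cv2.undistort(img, camera_matrix, dist_coeffs)
```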

The "Normalized" folders contain the eye patch images after normalization, which cancels scaling and rotation via the perspective transformation of Sugano et al. [3]. As with the "Original" folders, all data are organized by day for each participant, and the file format is ".mat". The annotation includes the 3D head pose and the 3D gaze direction; the generation of this 3D gaze direction from the 2D on-screen gaze target is described in our paper.
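Regarding the sign question above: a common way to turn a normalized 3D gaze vector into (pitch, yaw) angles is sketched below, assuming camera coordinates with x right, y down, z forward. The sign convention in the comments is my reading of it, so check it against a few labeled samples before trusting it.

```python
import numpy as np

def gaze_vector_to_angles(gaze):
    """Convert a 3D gaze direction to (pitch, yaw) in radians.

    Assumes camera coordinates with x right, y down, z forward, so a
    gaze vector aimed back toward the camera has negative z. Under this
    convention, positive pitch means looking up and positive yaw means
    the gaze vector points toward the camera's left -- verify the signs
    on your own data.
    """
    x, y, z = gaze / np.linalg.norm(gaze)
    pitch = np.arcsin(-y)
    yaw = np.arctan2(-x, -z)
    return pitch, yaw
```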

The "Evaluation Subset" folder contains the image list indicating the samples selected for the evaluation subset in our paper. We performed our evaluations on this subset of the MPIIGaze dataset, which includes an equal number of samples for each participant.

The "Annotation Subset" folder contains the image list indicating the 10,848 samples that we manually annotated, together with the (x, y) positions of 6 facial landmarks (four eye corners, two mouth corners) and the (x, y) positions of the two pupil centers for each of these images.
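If it helps, here is a hedged sketch for parsing one line of that list, assuming each line holds an image path followed by 16 numbers: the 6 landmark (x, y) pairs and then the 2 pupil-center (x, y) pairs. The exact column order is my assumption; confirm it against the dataset's own readme.

```python
def parse_annotation_subset_line(line):
    """Parse 'path x1 y1 ... x8 y8' into path, landmarks, pupil centers."""
    parts = line.split()
    path = parts[0]
    values = list(map(float, parts[1:]))
    assert len(values) == 16, 'expected 6 landmarks + 2 pupil centers'
    points = [(values[i], values[i + 1]) for i in range(0, 16, 2)]
    landmarks, pupil_centers = points[:6], points[6:]
    return path, landmarks, pupil_centers
```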

You can find the explanation of the whole dataset above, but I think the problems are:

  1. transforming a 2D annotation into a 3D one (I mean they should provide code)
  2. how the x, y, z of pose and gaze are loaded from the .mat files (I just don't know the way they normalized the input; see the loading sketch below)
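On point 2, this is roughly how I'd pull the per-eye pose and gaze vectors out of a normalized .mat file with scipy. The 'data'/'right'/'left'/'gaze'/'pose' field names and the nested indexing follow the layout as I understand it, so double-check them on your files (e.g., via mat['data'].dtype).

```python
import scipy.io

# Hypothetical path; field names assumed, check mat['data'].dtype.
mat = scipy.io.loadmat('Data/Normalized/p00/day01.mat')
right = mat['data']['right'][0, 0]   # use 'left' for the other eye
gazes = right['gaze'][0, 0]          # (N, 3) 3D gaze direction vectors
poses = right['pose'][0, 0]          # (N, 3) head pose vectors
print(gazes.shape, poses.shape)
```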