hysts / pytorch_mpiigaze

An unofficial PyTorch implementation of MPIIGaze and MPIIFaceGaze
MIT License

Some queries #45

Closed dreamer-1996 closed 3 years ago

dreamer-1996 commented 3 years ago

First of all, I would like to congratulate you on this awesome work. While I was reading the article "Appearance-Based Gaze Estimation in the Wild", some questions came to mind. It would be great if you could clarify those. I am a beginner in CV and Deep Learning, so please pardon me if some questions sound stupid.

1) Data Collection: How exactly are the data (the images) annotated? I was looking at the dataset, and for each subject an annotation text file is provided. My query is how you obtain all the values (like the (x, y) positions of the facial landmarks, the head pose, the face center, etc.) that are provided in the annotation files. Also, there are so many images. Are these annotations done manually on all of them, or is there some automated process?

2) In the case of real-time tracking (like predicting gaze directions from a webcam video), is the normalization done? I understand that the head pose is estimated, but what about the normalization?

hysts commented 3 years ago

Hi, @dreamer-1996

There seems to be some confusion, but I'm not the author of the paper. This repo is my personal project to reproduce the results of the paper. So, I think it would be better to contact the authors for more accurate information.

But, anyway, I'd like to make a brief comment on your questions.

As for the first point, I think it's automatically annotated. If the camera is calibrated, the face's pose relative to the camera can be estimated using a face landmark detector and a 3D face landmark model. Then you can also calculate the gaze vector from the face position and the information about where on the screen the target is displayed.
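To make the idea concrete, here is a minimal sketch of the last step: once the eye (or face) center and the on-screen target are both expressed in camera coordinates, the ground-truth gaze label is just the normalized difference vector, often converted to pitch/yaw angles. The function names and the angle convention (pitch = arcsin(-y), yaw = arctan2(-x, -z)) are my own assumptions for illustration, not code from this repo.

```python
import numpy as np

def gaze_vector_from_target(eye_center, target_3d):
    # Unit gaze vector pointing from the eye to the on-screen target;
    # both points must already be in camera coordinates (hypothetical helper).
    g = np.asarray(target_3d, dtype=float) - np.asarray(eye_center, dtype=float)
    return g / np.linalg.norm(g)

def vector_to_pitch_yaw(g):
    # Convert a unit gaze vector to (pitch, yaw) in radians, assuming a
    # camera frame with x right, y down, z forward (toward the scene).
    x, y, z = g
    pitch = np.arcsin(-y)
    yaw = np.arctan2(-x, -z)
    return pitch, yaw

# Example: eye 60 cm in front of the camera, looking straight at the camera origin.
eye = np.array([0.0, 0.0, 600.0])    # mm, camera coordinates
target = np.array([0.0, 0.0, 0.0])   # screen target already mapped to camera coordinates
g = gaze_vector_from_target(eye, target)
pitch, yaw = vector_to_pitch_yaw(g)  # both ~0 for this straight-ahead case
```

Mapping a 2D screen position to 3D camera coordinates requires a separate screen-to-camera calibration, which the dataset authors performed; the sketch above assumes that mapping has already been applied.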

As for the second point, the normalization is also applied during real-time inference. Since the model is trained with normalized images, the same normalization must be applied during inference to get decent results.
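For reference, the core of that normalization (following the scheme from the paper's pipeline, as I understand it) is to rotate a virtual camera so it looks straight at the eye center and rescale so the eye sits at a fixed distance; gaze directions are then rotated into that virtual camera's frame. This is a simplified numpy sketch with my own function names, and it omits the actual image warping:

```python
import numpy as np

def normalization_rotation(center, distance_norm=600.0):
    # Build the rotation that points a virtual camera's z-axis at `center`,
    # plus the scale factor that moves the eye to `distance_norm`.
    # A sketch of the normalization idea, not the repo's actual code.
    center = np.asarray(center, dtype=float)
    distance = np.linalg.norm(center)
    z = center / distance                       # new z-axis: toward the eye center
    down = np.array([0.0, 1.0, 0.0])            # camera y-axis points down
    x = np.cross(down, z)
    x /= np.linalg.norm(x)
    y = np.cross(z, x)
    R = np.stack([x, y, z])                     # rows are the virtual camera's axes
    scale = distance_norm / distance
    return R, scale

# Eye already on the optical axis at the reference distance: identity rotation, scale 1.
R, scale = normalization_rotation(np.array([0.0, 0.0, 600.0]))
g_norm = R @ np.array([0.0, 0.0, -1.0])  # gaze directions only rotate, never scale
```

At inference time you apply the same R (via a perspective warp) to the webcam frame before feeding it to the model, and rotate the predicted gaze back with R.T to get it in the real camera's coordinates.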

dreamer-1996 commented 3 years ago

Thanks for your reply. In the MPIIGaze dataset, that is, in MPIIGaze/data/Original/p00/day01, there is an annotation.txt file which contains all the dimensions. This is present in every subject folder for each day. Do you think these dimensions were calculated automatically? At which point of the processing pipeline is this annotation.txt inside MPIIGaze/data/Original/p00/day01 used?

Also, I looked at the code preprocess_mpiigaze.py. I assume it is using the normalized dataset and creating the training data. Do you have the code for creating the normalized data from the original data?

hysts commented 3 years ago

@dreamer-1996

Do you think these dimensions were calculated automatically?

Yes. Looking at the description of the annotation content on the project page of the paper, it seems they are automatically calculated.

At which point of the processing pipeline is this annotation.txt inside the MPIIGaze/data/Original/p00/day01 is used?

I think it's used to create normalized data.

Do you have the code for creating the normalized data from the original data?

No, but you can find it on the project page of the paper. (It's MATLAB code, though.)

dreamer-1996 commented 3 years ago

Thanks for your reply.