swook / GazeML

Gaze Estimation using Deep Learning, a Tensorflow-based framework.
MIT License

How to get gaze estimation from landmarks? #47

Open abnerdesigner1992 opened 4 years ago

abnerdesigner1992 commented 4 years ago

Nice work! I have 4 questions about the paper:

1. The original paper says the authors trained an SVR model on the MPIIGaze dataset. How is that done? By using a trained GazeML model to predict landmarks on MPIIGaze images and then using those detected landmarks to train the SVR? (A rough sketch of this idea follows below.)
2. Why not directly use the UnityEyes dataset's landmarks and gaze-vector ground truth to train the SVR?
3. Why use an SVR? Would regressing the result with several FC layers also work?
4. What exactly does "calibration with 20 or more samples" mean in the paper? Is it calibration to obtain camera parameters, or something else? I don't quite understand.

I would greatly appreciate it if the author or others could answer my questions. Thank you very much.
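For context on question 1 (and the per-person calibration in question 4), here is a minimal sketch of fitting an SVR on landmark features with scikit-learn. The feature layout, labels, and hyperparameters are placeholders of my own, not the paper's actual pipeline:

```python
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

# Placeholder data: one row of flattened 2D eye-landmark coordinates per
# image (e.g. 18 landmarks -> 36 features) and (theta, phi) gaze angles
# in radians as the regression targets.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 36))         # landmark features (fake)
y = rng.uniform(-0.5, 0.5, size=(500, 2))  # pitch/yaw labels (fake)

# SVR handles one output at a time, so wrap it to predict pitch and yaw.
model = MultiOutputRegressor(SVR(kernel="rbf", C=1.0, epsilon=0.01))
model.fit(X, y)

# Person-specific "calibration" would amount to fitting (or refitting)
# such a regressor on ~20 samples collected from a single user.
print(model.predict(X[:3]))
```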

NawwarAdam commented 4 years ago

1) Where did you find that? The paper says: "The 36 landmarks-based features are then used to train a support vector regressor (SVR) which directly estimates a gaze direction in 3D, (θ, ϕ) representing eyeball pitch and yaw respectively. The SVR can be trained to be person-independent with a large number of images from different people, or from a small set of person-specific images for personalized gaze estimation."

2) The paper states: "We found that employing strong data augmentation during training improves the performance of the model in the context of gaze estimation." UnityEyes is too perfect for real-world scenarios. (See the augmentation sketch at the end of this comment.)

3) I don't know, but I'd also appreciate it if somebody could answer that :D

4) I don't understand your question, sorry.

I hope that helps with your project.
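To illustrate what strong augmentation on synthetic eye images might look like, here is a minimal sketch using OpenCV and NumPy. The specific transforms and parameter ranges are my own guesses, not the authors' exact training recipe:

```python
import cv2
import numpy as np

def augment_eye_image(img, rng=None):
    """Randomly degrade a clean (synthetic) eye image so it looks closer
    to a noisy real-world capture. Transforms and ranges are illustrative."""
    if rng is None:
        rng = np.random.default_rng()
    out = img.astype(np.float32)

    # Random brightness/contrast jitter
    out = out * rng.uniform(0.7, 1.3) + rng.uniform(-20.0, 20.0)

    # Occasional Gaussian blur to mimic defocus or low-quality cameras
    if rng.random() < 0.5:
        k = 2 * int(rng.integers(1, 4)) + 1  # odd kernel size: 3, 5 or 7
        out = cv2.GaussianBlur(out, (k, k), 0)

    # Additive sensor-like noise
    out += rng.normal(0.0, 5.0, size=out.shape)

    return np.clip(out, 0, 255).astype(np.uint8)
```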

Hyrtsi commented 2 years ago

@abnerdesigner1992 are you talking about the ELG or the DPG paper?

I think they trained the model entirely on the simulator data. The MPIIGaze data can be downloaded using the script provided in this repo, so you can check for yourself what the dataset looks like and whether it's applicable to your suggestion.
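If you download the data with the repo's script, a quick way to inspect the resulting HDF5 file is something like the sketch below; the file path shown is an assumption, so adjust it to whatever the script actually writes:

```python
import h5py

# NOTE: the file name below is a placeholder; point it at whatever the
# dataset script in this repo actually produces.
with h5py.File("datasets/MPIIGaze/MPIIGaze.h5", "r") as f:
    def show(name, obj):
        # Print every dataset with its shape and dtype
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)
```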