MSREnable / GazeInference

Code for running inference on images, from webcams, etc.
MIT License

The project is missing the itracker.onnx file. Where do I find the pretrained model that you reference here? #1

Open · scm007 opened this issue 3 years ago

scm007 commented 3 years ago

Any update?

jatinsha commented 3 years ago

The model file is checked in to the repo now.

scm007 commented 3 years ago

Sweet, thanks Jatin.

One quick question: I am attempting to build the gaze inferencing into a mobile app (iOS/Android). When the original MIT team constructed the dataset, they built bounding boxes around the eyes and the face to create the eyeLeft, eyeRight, and face input parameters to the NN. I have replicated this for the most part, but obviously the closer the live images are to the original source material, the better the predictions will be. You guys did this via Dlib/OpenCV; what considerations did you make in terms of the actual construction of the images? Did you just grab 224x224 pixels around the pupil? Is there padding around the bounding box of the face, etc.? In general I'm wondering what kind of things you discovered while actually generating the images.
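
For context, here is roughly how I'm planning to wire the three crops into the released itracker.onnx with ONNX Runtime (Python just for illustration; the input names and layout are read off the model itself, and the dict keys in the commented-out call are placeholders, not the model's actual names):

```python
import numpy as np
import onnxruntime as ort

# Load the released model and ask it what it expects; input names, shapes,
# and types come from the model itself rather than from guesses.
session = ort.InferenceSession("itracker.onnx")
for inp in session.get_inputs():
    print(inp.name, inp.shape, inp.type)

def to_tensor(crop):
    """224x224x3 uint8 crop (HWC) -> 1x3x224x224 float32, assuming NCHW inputs."""
    x = crop.astype(np.float32)
    x = np.transpose(x, (2, 0, 1))
    return x[np.newaxis, ...]

# face_crop, eye_left_crop, eye_right_crop would be the 224x224 images produced
# by whatever cropping scheme matches the training pipeline. The keys below are
# placeholders -- substitute the names printed above.
# outputs = session.run(None, {
#     "face": to_tensor(face_crop),
#     "eyeLeft": to_tensor(eye_left_crop),
#     "eyeRight": to_tensor(eye_right_crop),
# })
```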

Thanks!

jatinsha commented 3 years ago

The original MIT paper used Apple's Circa Face Library. We use Dlib instead because it is open source, community supported, and it produced much better and tighter bounding boxes. After finding the bounding rectangles from Dlib, we convert them into squares, directly extract square-cropped images, and resize them to 224px.
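
Roughly, the flow looks like the following sketch (Python for illustration; the exact square-expansion and clamping details in the repo may differ):

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()

def square_face_crop(bgr_image, size=224):
    """Detect a face with Dlib, expand its rectangle to a square, crop, resize."""
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    faces = detector(gray, 1)  # upsample once to help with small faces
    if not faces:
        return None
    r = faces[0]
    # Convert the Dlib rectangle to a square centered on the same point.
    cx, cy = (r.left() + r.right()) // 2, (r.top() + r.bottom()) // 2
    side = max(r.width(), r.height())
    x0, y0 = cx - side // 2, cy - side // 2
    # Clamp to the image bounds (a real implementation might pad instead).
    h, w = bgr_image.shape[:2]
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(w, x0 + side), min(h, y0 + side)
    return cv2.resize(bgr_image[y0:y1, x0:x1], (size, size))
```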

scm007 commented 3 years ago

I think you mean 224px, no?

On Thu, Aug 5, 2021 at 2:49 PM Jatin Sharma wrote:

The original MIT paper used Apple's Circa Face Library. We utilize Dlib instead because it is open sourced, community supported and resulted in much better and tighter bounding boxes. After finding the bounding rectangles from Dlib, we convert them into squares to directly extract square cropped images and resize them to 244px.

jatinsha commented 3 years ago

Yes 224px. Corrected the typo above.

scm007 commented 3 years ago

Thanks! I talked to Petr Kellnhofer and he mentioned needing to subtract the "average image" from the live images. He called this the bias. Are you guys doing that during live inference?
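
As I understood it, that step looks roughly like this (the .mat file name and key below are placeholders; I'm not sure of the exact scale or channel order the original pipeline used):

```python
import numpy as np
from scipy.io import loadmat

# Placeholder file name/key -- the original GazeCapture pipeline shipped
# per-input mean images as .mat files, but the exact names may differ.
mean_face = loadmat("mean_face_224.mat")["image_mean"].astype(np.float32)

def subtract_mean(crop, mean_image):
    """Subtract the dataset's 'average image' (the bias) from a live 224x224 crop."""
    return crop.astype(np.float32) - mean_image
```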

Right now I am using Apple Vision to get the contour of the eye as a path. The naive thing to do would be to find the center of the contour and then create a square whose x axis goes from left_edge_eye -> right_edge_eye. However, I suspect this will not yield the correct result because the eye images are not "just the eye" and actually contain part of the face as well. How did you guys handle this?

joncamp commented 3 years ago

https://github.com/MSREnable/GazeCapture/blob/bcf125d7ceca625d1fe203e81e783f1789d6ebf3/utility_functions/face_utilities.py#L138 has more specifics on how to draw the bounding box. This code is based on how the original GazeCapture paper did the bounding boxes, but it was not particularly well documented outside of the code itself.
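
The gist for the eye crops is roughly the following (the landmark indices are Dlib's standard 68-point layout; the padding factor here is illustrative, the actual values are in that file):

```python
import numpy as np

# 'shape' is a dlib full_object_detection from
# dlib.shape_predictor("shape_predictor_68_face_landmarks.dat") run on the
# detected face rectangle. In the 68-point layout the two eyes occupy
# landmark indices 36-41 and 42-47.
def eye_square(shape, indices, scale=1.7):
    """Square box around one eye, enlarged so the crop keeps some surrounding
    face context; the scale factor is illustrative, not the repo's value."""
    pts = np.array([(shape.part(i).x, shape.part(i).y) for i in indices])
    cx, cy = pts.mean(axis=0)
    half = int(scale * (pts[:, 0].max() - pts[:, 0].min()) / 2)
    return int(cx) - half, int(cy) - half, int(cx) + half, int(cy) + half

# Boxes for the two eyes:
# eye_a = eye_square(shape, range(36, 42))
# eye_b = eye_square(shape, range(42, 48))
```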

scm007 commented 3 years ago

Awesome, thanks. For the GazeInference project, are you calling this Python code directly from the C++ application, or did you port the algorithm?

joncamp commented 3 years ago

The GazeInference pipeline here is C++ and does not use the Python code. I am referencing the Python code from the GazeCapture pipeline because that is the basis for the model. Any inference implementation (i.e., GazeInference here, or whatever you choose to build) needs to match the way images were processed for the model, so as not to introduce error.

scm007 commented 3 years ago

For sure, that's what I'm attempting to accomplish. The most straightforward approach seems to be using the Python utilities directly, but I am wondering how you approached this from the C++ side of things?

jatinsha commented 3 years ago

Please refer to https://github.com/MSREnable/GazeInference/blob/master/GazeInference_WinCpp/DlibFaceDetector.h for the C++ side of the face detection and cropping logic.

scm007 commented 3 years ago

Did you guys do the "subtraction of the mean image (.mat)" as well?

joncamp commented 3 years ago

We stopped using the .mat files a while ago. We found they didn't significantly improve accuracy. We instead use other normalization techniques, as demonstrated in the code.

scm007 commented 3 years ago

Are the normalization techniques in that file you referenced?
