trishume / eyeLike

A webcam based pupil tracking implementation.
MIT License
917 stars 335 forks source link

Implement Gradient Ascent #3

Open trishume opened 9 years ago

trishume commented 9 years ago

Currently the main tracker algorithm is quite slow, which necessitates the image being scaled down, which reduces accuracy. There is a method proposed by Dr. Timm (creator of the algorithm) in his thesis for using gradient ascent to speed up the algorithm from O(n^2) to O(n) where n is the number of pixels.

The basic idea is that the eye centre-ness field can be sampled at any pixel independently in O(n) time. Currently eyeLike samples all pixels and finds the one with maximum centre-ness. Instead of doing this it is possible to rearrange the formula to be able to compute the gradient (slope) of the centre-ness field at any point. This direction can then be used to "climb" the gradient towards the maximum in a small number of iterations.

The method for doing this is detailed in his thesis, I might be able to email it to anyone who is interested in working on this. The method in his thesis is stated in terms of general circle identification which would need some tweaking to make it work for eyes. It wouldn't require much code to do, but would require a good level of understanding.

marcoxsurf commented 8 years ago

Hi trishume, I'm working on this project and I'm really interested on the thesis you are talking about. I've speed-up the code my way using the eye cascade on the face region. This strategy set the box center near the eyes one so I set the for loop search starting near the box center, you can think about 10px offset from it. It is quite fast now, I think I have some limitations on others strategy I applied

for (int y = (int)(kFastEyeWidth / 2 - radiusEye); y < (int)(kFastEyeWidth / 2 + radiusEye); ++y) {
        const double *Xr = gradientX.ptr<double>(y), *Yr = gradientY.ptr<double>(y);
        for (int x = (int)(kFastEyeWidth / 2 - radiusEye); x < (int)(kFastEyeWidth / 2 + radiusEye); ++x) {
trishume commented 8 years ago

If you have access to a university library you might be able to find it under "Machine Vision for Inspection and Novelty Detection" by Fabian Timm. Otherwise email me at tristan@thume.ca and I may send it to you. It's somewhere in among all those pages, it shouldn't be too hard to find.

The basic idea is you take the partial derivative in the x and y direction of the "centreness" formula you can then use standard gradient ascent formulas to find the maximum.

The derivative in the thesis will need a little tweaking to account for the eye-specific changes to the general circle-finding algorithm. Specifically to deal with only counting black circles rather than white circles by only counting one gradient direction.

abhisuri97 commented 7 years ago

I am going to try and implement this some point down the line: Is the algorithm the one shown on page 27 here? http://www.zhb.uni-luebeck.de/epubs/ediss1030.pdf

trishume commented 7 years ago

@abhisuri97 yup, that's the one. Nice that his thesis is public online, I didn't know that was the case.

That algorithm isn't specialized for eye tracking like in the other paper so it doesn't include things like weighting by darkness (which you could possibly skip, I'm not sure it helps that much), and only counting dark circles on light backgrounds (see my blog post, this is super important to adapt it to do).

I recommend doing the partial derivatives yourself so that you understand the equation he gives and how you might adapt it. I did them once and figured out how to adapt it, but I don't remember, I just remember it wasn't that difficult to adapt or to do the derivatives, and I hadn't even taken high school calculus back when I did them, just looked up the rules for derivatives online, so it doesn't need anything advanced.

abhisuri97 commented 7 years ago

I think I have a good idea of how to go about it. One part I can't quite understand how to do is computing the step size via Armijo's rule. Would you happen to know a good programmatic way to do this?

trishume commented 7 years ago

@abhisuri97 I have no idea. I've never looked into it and haven't studied optimization enough to know what it is. Hope Google helps I guess ¯_(ツ)_/¯

abhisuri97 commented 7 years ago

@trishume Would you happen to have the contact info for Dr. Timm? I couldn't find it online. You can privately email it to me at suria@seas.upenn.edu if you wish to not make the contact information public. Thanks.

progmars commented 4 years ago

Did anybody manage to implement it in some way? It would be great to speed up the process.

Also, it would be nice to have some optimization for real-time processing, such as passing previous pupil coordinates and starting search from there (although not sure if there would be any benefits - we still have to search entire area in case of rapid changes).