elucideye / drishti

Real time eye tracking for embedded and mobile devices.
BSD 3-Clause "New" or "Revised" License

Data used to train classifiers #735

Open SomeUserName1 opened 5 years ago

SomeUserName1 commented 5 years ago

Hi,

what data did you use to train each of the classifiers? Some datasets are mentioned in /src/app/landmarks and /src/app/fddb; is there an overall data listing? Would it be possible to share all the annotations you added?

What other software do you use? I just saw the eos-based landmark fitting; are there any components here under non-permissive licenses?

What landmark layout are you using for the face and the eyes?

Thanks in advance & cheers, Fabian

headupinclouds commented 5 years ago

What other software do you use? I just saw the eos-based landmark fitting; are there any components here under non-permissive licenses?

In general, the dependencies should all be permissive: BSD 2/3 Clause, Apache, Boost, etc. Anything with restrictive GPL/LGPL licensing (or similar) is avoided.

The license info for each selected package dependency used in a particular build will be installed here:

https://github.com/elucideye/drishti/blob/378725771b0b4353e5b3fa93f32175023dfc0e0f/src/lib/drishti/CMakeLists.txt#L238

_install/${TOOLCHAIN}/3rdparty/licenses

You can always run a build and check that. The license files should also be in the tagged binary release assets on the main GitHub page if you want to check quickly. Every dependency has a `find_package()` call in a CMakeLists.txt, so you can do something like: `find src -name CMakeLists.txt -or -name "*.cmake" | xargs grep find_package | sort | uniq`

The latest ACF face detection models and face landmark models were trained on a subset of the data shared by the author of the JDA GitHub project here, with a fair amount of cleaning and editing (68-point model).

I added the C++ ACF + OpenGL acceleration stuff (via Piotr's Toolbox) because it was the only thing that would support the frame rates and search volume I needed on mobile platforms at the time. It works pretty well and is very fast, but we will probably replace it with a mobile-friendly GPGPU CNN eventually. (We are working on adding mxnet/tvm to Hunter to help support this.)

The face landmarks are provided by a modified version of the dlib Kazemi shape_predictor (Boost license). PCA compression is used to support smaller model sizes, and in the case of the eye models it also supports simultaneous regression of the eye contour and the iris/pupil ellipse components in a single model. I was trying to preserve the dense 68-point annotations at this stage for compatibility with other OSS modules, such as eos for head pose estimation, but in practice it will be hard to achieve real-time performance. For simple eye localization, the CPR/XGBoost regression (trained on faces) or the current 5-point model provided by the dlib repo should also work well; it is probably worth adding that.
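To illustrate the idea behind the PCA compression mentioned above, here is a minimal sketch (not drishti's actual implementation; the shape counts, component count, and synthetic data are assumptions for illustration). It shows how 68-point shapes living in a low-dimensional subspace can be stored and regressed as a handful of PCA coefficients instead of 136 raw coordinates:

```python
import numpy as np

rng = np.random.default_rng(0)

n_shapes, n_points, n_modes, k = 500, 68, 8, 16
dim = n_points * 2  # 136 raw (x, y) coordinates per shape

# Synthetic training shapes generated from a few true deformation modes,
# so they genuinely lie in a low-dimensional subspace around the mean.
mean_shape = rng.normal(size=dim)
modes = rng.normal(size=(n_modes, dim))
weights = rng.normal(size=(n_shapes, n_modes))
shapes = mean_shape + 0.05 * (weights @ modes)

# PCA via SVD of the mean-centered shape matrix.
mean = shapes.mean(axis=0)
centered = shapes - mean
_, _, vt = np.linalg.svd(centered, full_matrices=False)
basis = vt[:k]  # (k, 136) compression basis

coeffs = centered @ basis.T       # each shape -> k coefficients
recon = coeffs @ basis + mean     # decompress back to 136-D

err = float(np.abs(recon - shapes).max())
print(f"max reconstruction error with k={k} of {dim} dims: {err:.2e}")
```

A cascaded regressor can then predict the `k` coefficients per stage rather than full coordinate vectors, which is where the model-size savings come from.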

The models are all kept in a separate drishti-assets repo to simplify licensing issues. There is some info about the initial models in that repo.

The eye models use a custom dataset downloaded from the internet, with an annotation scheme described on the README page. It is basically a 64-parameter model with eyelid and crease point contours plus iris and pupil ellipse components.

(image: annotated eye example)

I didn't find anything ready-to-go that was suitable for working with eye data at reasonably high resolution. I plan to make the data available, but I need to find some time to organize it and standardize the formats. Maybe over the winter holidays :) I've also been accumulating data from unsplash.com, which provides even higher-quality images at resolutions suitable for this work; that will also be useful for the face tracking step.