liuliu / ccv

C-based/Cached/Core Computer Vision Library, A Modern Computer Vision Library
http://libccv.org

Inconsistent results in icfdetect under mingw #103

Closed lxd4 closed 10 years ago

lxd4 commented 10 years ago

After modifying ccv so that it compiles under mingw, I get odd behavior when trying out icfdetect. The modifications mostly replace Linux headers with their Windows equivalents: arpa/inet.h becomes Winsock2.h and alloca.h becomes malloc.h. See the diff at the bottom.

Using the model samples/pedestrian.icf, a typical Ubuntu install correctly detects one person (the woman on the left) in samples/street.png, and misses the man on the right. Compiled with mingw and given a converted bitmap image, however, it returns a detection in entirely the wrong place (with 0,0 as the upper-left corner of the bounding box) and takes roughly 700 seconds.

It takes a long time because _ccv_icf_detect_objects_with_classifier_cascade(...) returns 307958 detections, all of which must be handled by ccv_array_group(...) inside ccv_icf_detect_objects. In the ordinary Linux install, there are only 2 rectangles prior to grouping. street.png is 640x425 pixels, so that is an unrealistic number of returned rectangles, but it is not a conveniently diagnostic number like 0 or the theoretical maximum of every window over the whole pyramid being a positive.

I am unsure why there are so many false positives, and why they reduce to that single first window after filtering the rectangles. I haven't put in any of the external dependencies for mingw to use (libpng, libgsl, etc), but hopefully that is not the source of my problems using icfdetect.

Thanks everyone!

git diff: http://pastebin.com/73ERwy44

lxd4 commented 10 years ago

I think I've narrowed down the problem to an issue with reading the model file. Echoing each line of samples/pedestrian.icf as it's read, from the second entry on the third line to the end it is reading all 0's. This might be a problem with fscanf in mingw, I will keep digging.

lxd4 commented 10 years ago

I've found the source of the issue. mingw, at least in the vanilla release, depends on msvcrt.dll for its C runtime. MSVCRT is not C99 compliant, and that includes missing support in functions like fprintf and sscanf for the "%a" format and the associated hex representation of floats. As a result, the MSVCRT version of fscanf ignores the %a formatting and ends up not setting the parameters correctly: wrong model, many false alarms, and so on.
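For anyone who wants to check their own toolchain, here is a minimal sketch of a runtime probe for this problem. The name runtime_parses_hex_float is made up for illustration and is not part of ccv; the probe only assumes that a C99-compliant sscanf accepts hex-float text for "%a".

```c
#include <stdio.h>

/* Hypothetical helper (not part of ccv): returns 1 if this C runtime's
 * sscanf parses C99 hex-float text correctly, 0 otherwise. */
int runtime_parses_hex_float(void)
{
    float v = 0.0f;
    /* "0x1.8p+1" is C99 hex-float notation for 1.5 * 2^1 = 3.0. */
    int assigned = sscanf("0x1.8p+1", "%a", &v);
    /* Require both a successful conversion and the exact expected value,
     * so a runtime that silently stores 0 is also reported as failing. */
    return assigned == 1 && v == 3.0f;
}
```

Calling this once before loading a model and bailing out when it returns 0 would turn the silent all-zero model into an immediate, explainable error.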

I've made a workaround, but I doubt it would be kosher for a pull request, considering what I assume is a firm commitment to using C99 inside of CCV. The workaround was, in brief,

Compiling in mingw
- Change the dependence on arpa/inet.h to Winsock2.h and on alloca.h to malloc.h.
- (Not important) Separately compile zlib and libpng, installing their include/lib/bin files into the base mingw directories (/not/ the msys directories). PM me if you are attempting something similar and want details.

Converting pedestrian.icf to store decimal values instead of hex float values
- In a C99-compliant program (I used a clone of ccv on an Ubuntu machine), read in pedestrian.icf using ccv_icf_read_classifier_cascade(...) as usual.
- Write a method that mirrors the structure of _ccv_icf_write_classifier_cascade_with_fd, but with %f instead of %a, append its declaration to ccv.h, and call it in your program to write the classifier cascade using the decimal representation of the floats instead of the hex representation. It looks like I might have some accuracy loss here, as the read-in values don't match the original values in pedestrian.icf.
- These operations could probably be an easy Ruby script =)
- Take the exported model file and bring it to a location accessible to your mingw configuration for ccv. I have a converted model available here: http://pastebin.com/2UquvtLu

Modifying ccv_icf to read decimal values instead of hex float values
- In your mingw ccv configuration, modify _ccv_icf_read_classifier_cascade_with_fd, or create a new, equivalent method mirroring the original, with "%f" instead of "%a" for the various float parsing.
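On the accuracy loss mentioned above: plain "%f" prints only six digits after the decimal point, so small values get truncated. A rough sketch of the difference follows, assuming IEEE-754 single-precision floats and a runtime whose decimal conversions are correctly rounded. Both function names are made up for illustration and are not part of ccv. Nine significant digits ("%.9g") are enough to round-trip a single-precision float through decimal text.

```c
#include <stdio.h>

/* Hypothetical helper: write v with "%f" (six digits after the decimal
 * point), read the text back, and return the recovered value. */
float roundtrip_fixed(float v)
{
    char buf[64];
    float out = 0.0f;
    snprintf(buf, sizeof buf, "%f", v);
    sscanf(buf, "%f", &out);
    return out;
}

/* Hypothetical helper: same round trip, but written with nine significant
 * digits, which is sufficient for IEEE-754 single precision. */
float roundtrip_g9(float v)
{
    char buf[64];
    float out = 0.0f;
    snprintf(buf, sizeof buf, "%.9g", v);
    sscanf(buf, "%f", &out);
    return out;
}
```

For example, roundtrip_fixed(1.0e-7f) comes back as 0 because the text written is "0.000000", while roundtrip_g9(1.0e-7f) recovers the original value. Using "%.9g" in the writer while still reading with "%f" on the mingw side should avoid the mismatch, though I have not tested that against the actual model file.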

Hopefully this is helpful. Now, I turn to trying to incorporate this build of ccv as an external library into my VS2010 C++ application...

liuliu commented 10 years ago

Thanks, this is very helpful to know. Yes, C99 is a commitment (thus, it is hard to run on half-supported C runtimes). However, I would like to improve feature detection to avoid subtle issues like this. Also, all existing models will likely move to a sqlite3-based model file (after the deprecation of bbf* models), like the most recent deep learning classifier.