sturkmen72 closed this issue 8 years ago.
Sweet :) Feel free to add this to the dlib users wiki.
I've been trying to train a pedestrian detector using the INRIA dataset as well. The training procedure seems to go fine, but I'm not able to detect anything, even on the training data. Does the annotation have anything to do with it? Also, how does merging the pictures help?
Can I see your training.xml?
https://github.com/tej06/dlib-pedestrian-detection/blob/master/training.xml
I'm basically annotating each image separately with a fixed aspect ratio.
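For reference, a dlib training file in the imglab format (the same layout as dlib/examples/faces/training.xml) looks roughly like this; the file name and box coordinates below are made up for illustration:

```xml
<?xml version='1.0' encoding='ISO-8859-1'?>
<dataset>
  <name>INRIA pedestrian training examples</name>
  <images>
    <image file='crop001001.png'>
      <!-- one box per annotated person; top/left/width/height in pixels -->
      <box top='120' left='96' width='64' height='128'/>
    </image>
  </images>
</dataset>
```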
Using INRIA's annotation information to create such a training.xml is possible. (I already did it before; I will share it when I find it.) How much time did you spend on it?
I wrote a Python snippet to extract the information from INRIA's annotation files. It took me somewhere around 15 to 20 minutes. Some of the annotations had issues, so I annotated those images again using imglab. My problem is that after training, my detection rate is 0.
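The extraction step described above can be sketched like this. It assumes each INRIA annotation file contains bounding-box lines of the form shown in the comment; that format is an assumption here, so the regex may need adjusting to match the actual files:

```python
import re

# Assumed line format in an INRIA annotation file:
# Bounding box for object 1 "PASperson" (Xmin, Ymin) - (Xmax, Ymax) : (194, 127) - (413, 496)
BOX_RE = re.compile(
    r'Bounding box for object \d+ .*? : '
    r'\((\d+), (\d+)\) - \((\d+), (\d+)\)'
)

def parse_inria_annotation(text):
    """Return a list of (xmin, ymin, xmax, ymax) boxes found in one
    annotation file's text."""
    boxes = []
    for match in BOX_RE.finditer(text):
        xmin, ymin, xmax, ymax = map(int, match.groups())
        boxes.append((xmin, ymin, xmax, ymax))
    return boxes
```

From here, each box can be written out as a `<box top=... left=... width=... height=.../>` element in the dlib XML format.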
I found the code I used before to generate the annotation file and generated four files. You can find them at https://github.com/sturkmen72/dlib_pedestrian_detection/tree/master/INRIA
I was planning to try creating a detector but haven't found the free time yet.
NOTE: why there are four files.
It was only one file (inria_test2.xml) when I first created it, using dlib/examples/faces/training.xml as a sample; it ignores rectangles when they are out of the image boundaries. I created the other files hoping they would be useful for you.
inria_test2.xml contains all rectangles, even those outside the image boundaries.
inria_test3.xml has labels and ignores rectangles outside the image boundaries.
inria_test4.xml has labels and contains all rectangles, even those outside the image boundaries.
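The difference between the files comes down to how rectangles that extend past the image boundaries are handled. The two policies (dropping versus keeping, plus clipping) can be sketched like this; the helper names are hypothetical, not from the repo above:

```python
def partly_outside(box, img_w, img_h):
    """True if any part of an (xmin, ymin, xmax, ymax) box extends
    past the image boundaries."""
    xmin, ymin, xmax, ymax = box
    return xmin < 0 or ymin < 0 or xmax > img_w or ymax > img_h

def clip_box(box, img_w, img_h):
    """Clip a box to the image, or return None if it lies entirely
    outside the image (a third option besides dropping or keeping)."""
    xmin, ymin, xmax, ymax = box
    xmin, ymin = max(0, xmin), max(0, ymin)
    xmax, ymax = min(img_w, xmax), min(img_h, ymax)
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)
```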
Thanks a lot :) I'll train the detector with this.
How do overlapping annotations affect the training procedure?
As I said, I did not try it yet. I am not too experienced in this field.
What is your purpose in training a pedestrian detector?
It's a part of a project I'm working on.
Did you try OpenCV's HOGDescriptor?
Yes I did. It works well. But it's a bit fickle. Even with NMS it behaves erratically for video streams. I'm trying to achieve a better detection using DLib. If this doesn't work out I'll resort to OpenCV's HOGDescriptor.
As I recall, INRIA doesn't label all the people in each image. That's why it's not working. There is also an issue of excessively tight cropping, which is far from ideal. You want to train on images that look like the real testing images, not ones that are cropped in some obviously target-biased way.

But most importantly, the dlib training algorithm uses all non-labeled parts of the image as negatives, so if there are any people in the background it's not going to work very well and will likely learn that the lowest-error behavior is to never detect anything. This feature is why you don't need to deal with hard negative mining. It does that for you, but it's obviously going to do something bad if the background is full of unlabeled people.
You should get a more modern dataset, like the Caltech Pedestrian Detection Benchmark and take care that you train on images that are fully and consistently labeled.
Also, this is such a common question that I just added a new FAQ for it: http://dlib.net/faq.html#WhydoesnttheobjectdetectorItrainedwork :)
Thanks a lot @davisking. This sure did help a lot :)
No problem. I should have written this FAQ answer a long time ago :/
Hi @davisking, I've found this closed issue and I want to share the work I've done so far with a dataset publicly available on the internet: to train the pedestrian classifier, I'm using this one, which contains some scenarios with partially occluded elements as well. All the data from USC is shipped with ground truth data, which makes it easier to use with dlib, so I've written a parser for that XML data and converted it into the format supported by dlib (the same one used for face detection). If anyone is still interested, you can find the parser (written in Python 3) here: Conversion of Ground Truth data for the USC dataset.
By the way, thank you for all the efforts you put in this work!
I have trained a pedestrian detector with the INRIA dataset. It can be found at https://github.com/sturkmen72/dlib_pedestrian_detection
I will share some more information about how to train it soon on the GitHub repo.
Many thanks to @davisking for the great library.