davisking / dlib

A toolkit for making real world machine learning and data analysis applications in C++
http://dlib.net
Boost Software License 1.0

train_object_detector example shows incorrect testing results #202

Closed: e-fominov closed this issue 8 years ago

e-fominov commented 8 years ago

While training some object detectors I found that when a detector is loaded from a .svm file and used in a real-world scenario, its precision differs from the results shown during training. The precision/recall/AP results are different. The reason is that my training/testing data contains many overlapping objects.

I found that when the object detector is trained in the sample code (line 291): object_detector<image_scanner_type> detector = trainer.train(images, object_locations, ignore); the trainer has no custom test_box_overlap set in its parameters, so it trains a test_box_overlap automatically. When the process finishes, the detector holds the trained test_box_overlap parameters.

But when the sample code tests the object detector (line 297), it calls test_object_detection_function with the default test_box_overlap argument:

cout << "Test detector (precision,recall,AP): " << test_object_detection_function(detector, images, object_locations) << endl;

So test_object_detection_function does not use the trained test_box_overlap instance from the detector; it uses a default-constructed one for making decisions about the detector's precision. In effect, two different overlap testers are used: one when the detector makes detections, and another when precision is calculated.

Is this the correct way to calculate detector's precision?

Maybe we should update the sample code to use the detector's trained overlap tester when calling test_object_detection_function?


davisking commented 8 years ago

The test_object_detection_function() should also include the ignore boxes. But the overlap testing is pretty standard. All the major object detection challenges use the default settings of that object. The trainer does it differently internally because that improves the accuracy of the trained model. One use of the overlap tester is for doing non-max suppression and another is for testing against the standard accuracy measure. So it's as intended.

I just fixed the missing ignore issue though. That wasn't right :)

e-fominov commented 8 years ago

For example, I have a dataset with 100 boxes. I trained an object_detector and got a 1/1/1 result. I then happily run over all these images manually, without calling test_object_detection_function: I simply call detector(img) for each image and compare the results with the training dataset. During this process the detector detects 120 boxes instead of 100. I expected a 1/1/1 detector to be 100% accurate on the training data, but it gives me 20 more boxes than expected.

Yes, everything works correctly, but I didn't find any explanation of this behavior in the documentation or the sample code text.

davisking commented 8 years ago

What code outputs 1/1/1? There is only test_object_detection_function(). Training doesn't output an accuracy metric.

e-fominov commented 8 years ago

Yes, test_object_detection_function() gives me a result of 1.0 precision, 1.0 recall, 1.0 average precision. But in real usage the detector produces 20% more boxes in my case, on the same training data. This is because test_object_detection_function uses the default overlap tester, and there are no comments about it in the sample text.

davisking commented 8 years ago

I don't understand what's different between the two cases. You call test_object_detection_function() and in one case you give it an overlap_tester (the 5th argument) and in a different setting you don't (so you get the default overlap tester)?

But if you ran the same code on the same data you should always get the same outputs. Are you running the same code in both cases?

e-fominov commented 8 years ago

Yes, the code usage is different:

1) test_object_detection_function() with the default 5th argument, as in the example code
2) manually running the trained detector on each image and counting the resulting boxes

Here I expected to get the same number of detections as in the training data, but it gives me 20% more. My expectation was incorrect because I didn't know that the overlap tester inside the detector and the one in test_object_detection_function are different.

I suggest adding some more text to the documentation. I will make a PR for it, and I think that will explain my situation better.

davisking commented 8 years ago

No, something is wrong. What you are describing doesn't make sense to me.

So you run test_object_detection_function() with default overlap testing and it says 1/1/1 right?

Then you run the detector by doing something like dets = detector(img); and you look at the results. And you see that it has many false alarms? That should not happen. If the test said 1/1/1 then the detector should have perfect outputs when you call it like detector(img).

e-fominov commented 8 years ago

Yes, this is what I have. dets = detector(img) gives me a lot of false alarms. When I look into detector.overlap_tester, it has a trained match_thresh = 0.627 (the default is 0.5). So when I run dets = detector(img), some false alarms are not merged, while test_object_detection_function() internally makes a second pass over dets and removes false alarms with match_thresh = 0.5.

BUT I am running test_object_detection_function() with the default 4th argument (the same as the example code): test_object_detection_function(detector, images, object_locations)

If I change this code to test_object_detection_function(detector, images, object_locations, detector.get_overlap_tester())

I get correct results.

davisking commented 8 years ago

I'm not sure that's right. Where in test_object_detection_function() is the second pass that uses the overlap tester? The only thing I see is where it uses the overlap tester to check against the ignore boxes. Do you have ignore boxes in your dataset?

e-fominov commented 8 years ago

test_object_detection_function calls this:

 for (unsigned long i = 0; i < images.size(); ++i)
        {
            std::vector<std::pair<double,rectangle> > hits; 
            detector(images[i], hits, adjust_threshold);
            correct_hits += impl::number_of_truth_hits(truth_dets[i], ignore[i], hits, overlap_tester, all_dets, missing_detections);
            total_true_targets += truth_dets[i].size();
        }

where overlap_tester is the default (4th argument). hits will contain false detections missed by the detector's overlap tester, and impl::number_of_truth_hits will ignore them:

            unsigned long count = 0;
            std::vector<bool> used(boxes.size(),false);
            for (unsigned long i = 0; i < truth_boxes.size(); ++i)
            {
                bool found_match = false;
                // Find the first box that hits truth_boxes[i]
                for (unsigned long j = 0; j < boxes.size(); ++j)
                {
                    if (used[j])
                        continue;

                    if (overlap_tester(truth_boxes[i].get_rect(), boxes[j].second))
                    {
                        used[j] = true;
                        ++count;
                        found_match = true;
                        break;
                    }
                }

                if (!found_match)
                    ++missing_detections;
            }
e-fominov commented 8 years ago

we can also change

const matrix<double,1,3> test_object_detection_function (
        object_detector_type& detector,
        const image_array_type& images,
        const std::vector<std::vector<rectangle> >& truth_dets,
        const test_box_overlap& overlap_tester = test_box_overlap(),
        const double adjust_threshold = 0
    )

into something like

    const matrix<double,1,3> test_object_detection_function (
...
        const test_box_overlap& overlap_tester = detector.get_overlap_tester(),
...
    )
davisking commented 8 years ago

You are saying that the precision is wrong. It said it was 1 but really there are false alarms so it should be something lower. But the code you posted isn't what computes precision. It's here: https://github.com/davisking/dlib/blob/master/dlib/svm/cross_validate_object_detection_trainer.h#L142

and total_true_targets is just the size of all_dets which is populated here: https://github.com/davisking/dlib/blob/master/dlib/svm/cross_validate_object_detection_trainer.h#L79

So the only effect the overlap tester has on the precision is that it can exclude false alarms if they match one of the ignore boxes according to the overlap tester. Does that all sound right?

davisking commented 8 years ago

I definitely don't want to change the default to be detector.get_overlap_tester(). These uses are pretty different.

davisking commented 8 years ago

If anything there should probably be two overlap testers given to the test function, one for matching against the truth boxes and one for matching against the ignore boxes. That might be excessive though.

e-fominov commented 8 years ago

So the only effect the overlap tester has on the precision is that it can exclude false alarms if they match one of the ignore boxes according to the overlap tester. Does that all sound right?

No. If a false box is near a correct box and was not removed by the trained detector's overlap_tester, it is possible that this box will not be counted as false in impl::number_of_truth_hits, because this condition becomes true: if (overlap_tester(truth_boxes[i].get_rect(), boxes[j].second)) (https://github.com/davisking/dlib/blob/master/dlib/svm/cross_validate_object_detection_trainer.h#L61)

davisking commented 8 years ago

But only one box is allowed to match against each truth box. If there are 10 truth boxes and 11 detections then there will be 1 false alarm no matter what the overlap settings are (assuming for the moment that ignore is empty).

e-fominov commented 8 years ago

No. Each box can match a truth box only once, but one truth box can match many boxes. This is completely different. Look at the code:

            unsigned long count = 0;
            std::vector<bool> used(boxes.size(),false);
            for (unsigned long i = 0; i < truth_boxes.size(); ++i)
            {
                bool found_match = false;
                // Find the first box that hits truth_boxes[i]
                for (unsigned long j = 0; j < boxes.size(); ++j)
                {
                    if (used[j])
                        continue;

                    if (overlap_tester(truth_boxes[i].get_rect(), boxes[j].second))
                    {
                        used[j] = true;
                        ++count;
                        found_match = true;
                        break;
                    }
                }

                if (!found_match)
                    ++missing_detections;
            }

The used flags are for detected boxes, not for the truth ones.

davisking commented 8 years ago

How can multiple boxes match a truth box? There is a break; inside the matching if that immediately stops the search once a match is found.

e-fominov commented 8 years ago

Yes, you are right here and I am wrong.

The problem is somewhere deeper. A detected box does not overlap the truth box if I use the detector's trained overlap_tester, but it does overlap if I test with the default test_box_overlap.

I am getting this situation:
1) overlap_tester(truthA, box1) == true (but the box is not where I expect to see it)
2) detector.overlap_tester(truthA, box1) == false

And sometimes this:
1) overlap_tester(truthA, truthB) == true
2) detector.overlap_tester(truthA, truthB) == false
3) overlap_tester(truthA, box1) == true
4) overlap_tester(truthA, box2) == true
5) overlap_tester(truthB, box1) == true (already used)
6) overlap_tester(truthB, box2) == false

Here there is no notion of which box overlaps more; a box either overlaps or it doesn't, and that is enough. But this can be incorrect if one detected box overlaps two truth boxes: they can be used in the wrong order.

My data is a specific set of pedestrian images, and many of the scenes are crowded, i.e. the boxes overlap heavily in the training dataset.

davisking commented 8 years ago

Yeah, the ordering isn't optimal. It would be better if the test function did something like "best match first" rather than taking the first match it found.

e-fominov commented 8 years ago

So, summarizing the problem, these two ways of testing produce different results:
1) test_object_detection_function(detector, images, object_locations)
2) test_object_detection_function(detector, images, object_locations, detector.get_overlap_tester())

The 1st way is used in dlib's examples, while the 2nd way is how the detector makes decisions internally when removing false boxes. The cause of this situation is a large number of overlapping training boxes; if the training data does not overlap much, the end user will not hit this problem. My initial idea was to document this somewhere.

I have solved my situation by changing the overlap_tester parameters manually.

And the ordering is a problem, but not a big one. Do you want to try improving test_box_overlap? We could make it return a double and use it for best matching.

davisking commented 8 years ago

Yeah, it is a bit confusing.

That's probably a good idea to improve the ordering. I'll put it on my TODO list, but I won't get to it for a while. But you can certainly submit a PR for it if you want :)

e-fominov commented 8 years ago

I am closing the issue. If I have time, I will make a PR about the ordering.

davisking commented 8 years ago

Sweet