mmartin56 opened 4 years ago
It starts from the 1st detection (and more precisely, from the last): https://github.com/AlexeyAB/darknet/blob/cd1c7c32af3f40610aa8ed4c0a3d6ebfdd958460/src/detector.c#L1163-L1165
But if there's only one (correct) detection, then detections_count is 1, which implies that the for loop will not run (for (rank = -1; rank >= 0; --rank)), which implies that the line avg_precision += delta_recall * last_precision; will not run, which means avg_precision will remain zero. If the detection is correct (a true positive), then avg_precision should be > 0.
If there's only one detection, does it mean the average precision will be zero?
Yes.
detections_count is a count of all detections - both correct and incorrect - for confidence_threshold 0.005: https://github.com/AlexeyAB/darknet/blob/cd1c7c32af3f40610aa8ed4c0a3d6ebfdd958460/src/detector.c#L875
Try running ./darknet detector test ... -thresh 0.005
How many detections do you see?
Hi Alexey, I now see two problems in the code for the calculation of the area under the curve.
1) As per our discussion above, there is a problem if there's only one detection (and there can be). Before line 1165 (the beginning of the for loop) you need to add the line
avg_precision += last_recall * last_precision;
Otherwise the calculation is erroneous, because it omits the point of lowest precision, which produces too low an estimate of the mAP (though I admit the error will not be large on big test sets).
In response to 3. above, I don't see any mathematical reason why you should need at least 2 points. Assume there's only one detection (even with threshold = 0.005), one object, and an overlap > 0.5. Then precision = 1 and recall = 1 for that point, so the AUC should be 1. By adding the above line you do get avg_precision = 1, as expected.
2) Since the Pareto frontier (the PR curve) is visited from bottom right (last detection: low precision, high recall) to top left (first detection: high precision, low recall), we need to work with delta_precision instead of delta_recall (summing the areas of horizontal blocks, not vertical ones). The code I'm proposing to replace the current for loop is:
for (rank = detections_count - 2; rank >= 0; --rank)
{
    if (pr[i][rank].precision > last_precision) {
        double delta_precision = pr[i][rank].precision - last_precision;
        last_precision = pr[i][rank].precision;
        avg_precision += pr[i][rank].recall * delta_precision;
    }
}
Otherwise the maths is not right.
3) To confirm what I'm saying, we can use the calculation of mAP for COCO, but with more than 101 points. When we increase the number of points (from 101 towards infinity), we are calculating the Riemann integral of the PR curve with progressively increasing accuracy, so the resulting mAP should converge towards the AUC. In particular, with a lot of points (I used -points 100001) it should get really close.
In the current code it doesn't, at least in my test case.
By replacing
double last_recall = pr[i][detections_count - 1].recall;
double last_precision = pr[i][detections_count - 1].precision;
for (rank = detections_count - 2; rank >= 0; --rank)
{
    double delta_recall = last_recall - pr[i][rank].recall;
    last_recall = pr[i][rank].recall;
    if (pr[i][rank].precision > last_precision) {
        last_precision = pr[i][rank].precision;
    }
    avg_precision += delta_recall * last_precision;
}
with the code I'm proposing
double last_recall = pr[i][detections_count - 1].recall;
double last_precision = pr[i][detections_count - 1].precision;
avg_precision += last_recall * last_precision;
for (rank = detections_count - 2; rank >= 0; --rank)
{
    if (pr[i][rank].precision > last_precision) {
        double delta_precision = pr[i][rank].precision - last_precision;
        last_precision = pr[i][rank].precision;
        avg_precision += pr[i][rank].recall * delta_precision;
    }
}
we do get an equal mAP calculation for -points 0 and -points 100001.
@mmartin56 Hi,
Then, precision = 1 and recall = 1 for that point, so AUC should be 1.
Why?
This is just one point, with X = recall and Y = precision.
Please, draw this Precision-Recall-curve.
About 2 and 3, maybe you are right; I will think more.
Assume there's one object in the image, say 0 0.5 0.5 0.25 0.25.
Assume there's one perfect detection on that image: 0 0.5 0.5 0.25 0.25, and all other predictions have a score of 0.
Then TP = 1, FP = 0, FN = 0. Do you agree that precision = 1 and recall = 1 for that point?
I took that from https://classeval.wordpress.com/introduction/introduction-to-the-precision-recall-plot/
The red curve (it should go all the way down to the x-axis; I'm not sure what they mean by "baseline") is the PR curve of a perfect classifier, with only one point, at coordinates (1, 1).
Yes, you are right, the mAP calculation uses extrapolation:
precision = 0 for the curve segment recall > (recall of the lowest-precision point),
precision = highest_precision_point for the curve segment recall < (recall of the highest-precision point).
No worries!
@AlexeyAB Hi AlexeyAB, do you have any plan to change this code in validate_detector_map()? I have tested both versions of the code and got the same mAP on 3 different datasets (from 1k to 3k images), and the results for -points 0 and -points 100001 are also the same. So what do you think about this code? Is it right? And should we change to it?
Hi Alexey,
In detector.c > validate_detector_map, line 1165, the calculation of average precision for 'Area under curve' only starts at the second last detection, and not the last one. If there's only one detection, does it mean the average precision will be zero?
Cheers, Martin