AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

The problem of AP and mAP. #1139

Closed LiXirong closed 6 years ago

LiXirong commented 6 years ago

When I calculate the mAP on my own data set (60 classes), I find that the 'threshold' has no effect on the AP of any single class. I set 'threshold' to 0.1, 0.2 and 0.4, but all the AP and mAP results are the same.

My command: darknet.exe detector map data/xxx.data cfg/xxx.cfg backup/xxx_29500.weights -thresh 0.4

Results:

...
for thresh = 0.10, precision = 0.69, recall = 0.40, F1-score = 0.51
for thresh = 0.10, TP = 42954, FP = 19662, FN = 64007, average IoU = 49.96 %
mean average precision (mAP) = 0.142481, or 14.25 %

...
for thresh = 0.20, precision = 0.77, recall = 0.20, F1-score = 0.32
for thresh = 0.20, TP = 21557, FP = 6444, FN = 85404, average IoU = 57.09 %
mean average precision (mAP) = 0.142481, or 14.25 %

...
for thresh = 0.40, precision = 0.82, recall = 0.04, F1-score = 0.08
for thresh = 0.40, TP = 4576, FP = 999, FN = 102385, average IoU = 61.84 %
mean average precision (mAP) = 0.142481, or 14.25 %
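
The precision, recall and F1 figures in these three runs follow directly from the TP/FP/FN counts, which is why they move with -thresh while mAP stays fixed. Note that TP + FN is 106961 in every run, the total number of ground-truth objects. A minimal sketch of the arithmetic, where the function name `detection_metrics` is a hypothetical helper and not part of Darknet:

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from raw detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# (TP, FP, FN) counts copied from the three runs above, keyed by -thresh.
runs = {
    0.10: (42954, 19662, 64007),
    0.20: (21557, 6444, 85404),
    0.40: (4576, 999, 102385),
}

for thresh, (tp, fp, fn) in runs.items():
    p, r, f1 = detection_metrics(tp, fp, fn)
    print(f"thresh = {thresh:.2f}, precision = {p:.2f}, "
          f"recall = {r:.2f}, F1-score = {f1:.2f}")
```

Raising the threshold discards low-confidence detections, so FP drops faster than TP (precision rises) while FN grows (recall falls).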

AlexeyAB commented 6 years ago

AP and mAP don't depend on the threshold (or on the probability). That is why mAP and AP are used in most competitions and ratings.

LiXirong commented 6 years ago

@AlexeyAB Thx! 👍

PranjalBiswas commented 6 years ago

@AlexeyAB,

I don't know whether I am correct, but if I assume an IoU threshold of 1, the number of TP will be almost negligible, which will lead to a very low AP and hence a low mAP. So this seems to contradict what you said, namely that mAP is not affected by the IoU threshold. I have not gone through the maths behind it, so I can't be sure, but something seems to be wrong. Please clarify this issue.

Kind regards, Pranjal Biswas

AlexeyAB commented 6 years ago

@PranjaLBiswas27 mAP is the mean of the AP values over all classes. AP is the average precision over 11 points on the precision-recall curve, and that curve is built from every threshold value from 0.0 to 1.0.
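
As a rough illustration of the 11-point calculation described here, the sketch below ranks the detections of one class by confidence, walks down the ranking to build the precision-recall curve, and averages the best precision at recall levels 0.0, 0.1, ..., 1.0. The function names and the input layout (a confidence score plus a true-positive flag per detection) are assumptions made for this example; this is not Darknet's actual code.

```python
def eleven_point_ap(scores, is_tp, num_gt):
    """11-point interpolated AP for a single class.

    scores -- confidence of each detection
    is_tp  -- True where the detection matched a ground-truth box
    num_gt -- number of ground-truth boxes of this class
    """
    # Rank detections from most to least confident; every confidence
    # value acts as a threshold once, so no single cutoff is chosen.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for i in order:
        if is_tp[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)

    ap = 0.0
    for k in range(11):
        level = k / 10
        # Interpolated precision: best precision at recall >= level.
        reachable = [p for p, r in zip(precisions, recalls) if r >= level]
        ap += max(reachable, default=0.0) / 11
    return ap


def mean_ap(per_class_ap):
    """mAP is the plain mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


# Toy class with 2 ground-truth boxes and 4 ranked detections.
print(eleven_point_ap([0.9, 0.8, 0.7, 0.6], [True, False, True, False], 2))
```

Because the whole ranking is used, the result does not change when a single confidence cutoff such as -thresh is moved.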

Read more:

PranjalBiswas commented 6 years ago

@AlexeyAB, I went through the above link, and from it I understand that AP is calculated over 11 recall points on the precision-recall curve. But the precision-recall curve is calculated for a specific IoU threshold (30% in the above link). So if the IoU threshold is changed, the precision-recall curve itself will change, which changes the AP and hence the mean AP (mAP). I am sorry if I am being naive here, but I am not yet convinced by your claim. I still believe that changing the IoU threshold should change the mAP score. In fact, increasing the IoU threshold should decrease mAP, while decreasing it should increase mAP.

Also, if we assume that the IoU threshold does not change mAP, then even an IoU threshold of 0% would give a high mAP. That would mean we get a high mAP even though the predicted bounding boxes hardly overlap with the ground-truth boxes, which can't be right.

One more thing that struck me is that the "-thresh" option in your repository is actually the confidence threshold of a predicted bounding box, not the IoU threshold used to calculate the AP. So yes, that won't affect mAP in any way. But the IoU threshold should.

Please let me know your views on this, or whether I am making a mess of all the concepts. :D

AlexeyAB commented 6 years ago

@PranjaLBiswas27

Oh, you are talking about the IoU threshold rather than the probability threshold; I missed that. Yes, mAP depends on the IoU threshold and doesn't depend on the probability threshold.
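
The difference between the two thresholds can be seen on a toy example. In the sketch below, all data and function names are hypothetical, and each detection is assumed to be paired with a distinct ground-truth box, which skips the duplicate-detection handling a real evaluator needs. The confidence cutoff only selects one operating point (precision, recall), while the IoU threshold changes which detections count as true positives and therefore changes AP itself.

```python
# Hypothetical detections for one class:
# (confidence, IoU with the ground-truth box it is paired with).
DETECTIONS = [(0.95, 0.82), (0.80, 0.55), (0.60, 0.30), (0.35, 0.71), (0.15, 0.45)]
NUM_GT = 5  # assumed number of ground-truth boxes


def operating_point(dets, num_gt, conf_thresh, iou_thresh=0.5):
    """Precision and recall of the detections kept at one confidence cutoff."""
    kept = [iou for conf, iou in dets if conf >= conf_thresh]
    tp = sum(1 for iou in kept if iou > iou_thresh)
    precision = tp / len(kept) if kept else 0.0
    return precision, tp / num_gt


def average_precision(dets, num_gt, iou_thresh=0.5):
    """11-point interpolated AP; it ranks all detections, so there is
    no confidence cutoff among its parameters."""
    ranked = sorted(dets, reverse=True)  # highest confidence first
    tp, curve = 0, []
    for rank, (_, iou) in enumerate(ranked, start=1):
        if iou > iou_thresh:
            tp += 1
        curve.append((tp / rank, tp / num_gt))  # (precision, recall)
    total = sum(
        max((p for p, r in curve if r >= k / 10), default=0.0)
        for k in range(11)
    )
    return total / 11


for conf_thresh in (0.1, 0.2, 0.4):
    print("conf", conf_thresh, operating_point(DETECTIONS, NUM_GT, conf_thresh))
for iou_thresh in (0.25, 0.5, 0.75):
    print("IoU", iou_thresh, round(average_precision(DETECTIONS, NUM_GT, iou_thresh), 4))
```

With this toy data, a stricter IoU threshold turns some true positives into false positives, so AP falls as the IoU threshold rises.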


MSCOCO mAP (AP50): http://cocodataset.org/#detection-eval

AP^{IoU=.50}: AP at IoU=.50 (PASCAL VOC metric)


Pascal VOC mAP - Page 11: http://homepages.inf.ed.ac.uk/ckiw/postscript/ijcv_voc09.pdf

To be considered a correct detection, the area of overlap a_o between the predicted bounding box B_p and ground truth bounding box B_gt must exceed 0.5 (50%) by the formula a_o = area(B_p ∩ B_gt) / area(B_p ∪ B_gt)
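
As a minimal sketch of this overlap test (intersection area divided by union area), assuming boxes are given as corner coordinates (x_min, y_min, x_max, y_max): that box layout and the names `iou` and `is_correct_detection` are choices made for this example, not Darknet's internal representation.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width/height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct_detection(pred_box, gt_box, iou_thresh=0.5):
    """PASCAL VOC style test: the overlap must exceed the IoU threshold."""
    return iou(pred_box, gt_box) > iou_thresh


pred = (0.0, 0.0, 10.0, 10.0)
truth = (5.0, 0.0, 15.0, 10.0)
print(iou(pred, truth), is_correct_detection(pred, truth))
```

Here the two boxes share half of their width, so the intersection is 50 and the union is 150; the overlap of one third falls below the 0.5 threshold and the detection would count as a false positive.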


PranjalBiswas commented 6 years ago

@AlexeyAB I too initially confused the "thresh" parameter with the IoU threshold rather than the probability threshold, and only later realized the difference. I feel that naming the parameter just "thresh" is a bit confusing. Anyway, this answers my questions. Thanks a lot.