rbgirshick / py-faster-rcnn

Faster R-CNN (Python implementation) -- see https://github.com/ShaoqingRen/faster_rcnn for the official MATLAB version

Help me understand the mAP calculations #484

Open user1103 opened 7 years ago

user1103 commented 7 years ago

In the code, we have this particular snippet:

```python
if ovmax > ovthresh:
    if not R['difficult'][jmax]:
        if not R['det'][jmax]:
            tp[d] = 1.
            R['det'][jmax] = 1
        else:
            fp[d] = 1.
else:
    fp[d] = 1.
```

From my understanding of the code, average precision is calculated by going down the entire list of proposals for ALL test images and marking each of them as a TP or FP, right?

What is this `R['det'][jmax] = 1` for, then? Does it mean that if, in one image, we have multiple proposals for a particular object that is correctly detected, we only count one proposal as a TP and the others as FPs?

ArturoDeza commented 7 years ago

> Does it mean that if, in one image, we have multiple proposals for a particular object that is correctly detected, we only count one proposal as a TP and the others as FPs?

The above statement is correct. The PASCAL VOC evaluation has that rule (see the paragraphs and section just before 4.2.1 in the following paper: http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf ). They mention that it is the responsibility of each team to use non-maximum suppression to avoid making multiple detections of the same object. A human would not confidently say there are two people if there is actually one.
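Concretely, the matching rule can be sketched for a single image and class like this (a simplified illustration; `voc_match`, `iou`, and the toy boxes below are hypothetical names, not the actual `voc_eval.py` code):

```python
def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes (VOC's +1 convention)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1 + 1), max(0.0, iy2 - iy1 + 1)
    inter = iw * ih
    area = lambda r: (r[2] - r[0] + 1) * (r[3] - r[1] + 1)
    return inter / (area(a) + area(b) - inter)

def voc_match(dets, gts, ovthresh=0.5):
    """dets: list of ([x1, y1, x2, y2], score); gts: list of GT boxes.
    Returns per-detection tp/fp flags: each GT box can yield only one TP."""
    order = sorted(range(len(dets)), key=lambda i: -dets[i][1])  # by confidence
    matched = [False] * len(gts)          # plays the role of R['det']
    tp, fp = [0] * len(dets), [0] * len(dets)
    for d in order:
        box = dets[d][0]
        overlaps = [iou(box, g) for g in gts]
        jmax = max(range(len(gts)), key=overlaps.__getitem__) if gts else -1
        if jmax >= 0 and overlaps[jmax] > ovthresh and not matched[jmax]:
            tp[d] = 1                     # first (highest-scoring) match wins
            matched[jmax] = True          # this GT box is now "used up"
        else:
            fp[d] = 1                     # duplicate or poor-overlap detection
    return tp, fp
```

With one ground-truth box and two well-overlapping detections of it, only the highest-scoring one becomes a TP; the second is counted as an FP even though its overlap exceeds the threshold.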

As for the actual code, I would suggest mentioning what file you copied and pasted it from.

user1103 commented 7 years ago

The code is from here: https://github.com/rbgirshick/py-faster-rcnn/blob/master/lib/datasets/voc_eval.py

@ArturoDeza Will NMS completely remove all duplicate detections? I think what it does is remove overlapping detections whose IoU exceeds a certain threshold, right?

BenjaminTT commented 7 years ago

If you look into py_cpu_nms.py, you can see the idea quite clearly. It removes detections of the same class that share an IoU above a threshold (0.3 by default), keeping the one with the best score.
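The core of that procedure can be sketched like this (a simplified stand-in for py_cpu_nms.py; the function name and sample boxes are illustrative):

```python
import numpy as np

def nms(dets, thresh=0.3):
    """Greedy NMS. dets is an (N, 5) array of [x1, y1, x2, y2, score]."""
    x1, y1, x2, y2, scores = dets[:, 0], dets[:, 1], dets[:, 2], dets[:, 3], dets[:, 4]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]            # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IoU of the top-scoring box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # keep only boxes whose overlap with the kept box is at most thresh
        order = order[np.where(iou <= thresh)[0] + 1]
    return keep
```

Note that this only suppresses near-duplicates above the threshold; a second detection with, say, IoU 0.25 against the kept box survives NMS and can still become an FP in the evaluation.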

I have trouble understanding precisely how the calculation of the AP works too. If we look at the function im_detect, the output is one box and score per region of interest (up to 300 if we consider the ZF model). Then NMS kicks in to reduce the chances of detecting a single object several times (IoU thresh is 0.3 by default), after which the number of detections is capped at a maximum (100 det/image by default). Finally the detections are stored in a file.

When I check the actual number of detections per image, it always reaches the cap (100), and when I look into the calculation of the AP, I did not find anything about discarding any of these detections.

To put my issue in context, I am using an alternative version of the SUNRGBD dataset that I reshaped to match the PASCAL VOC format. When I train on color, I obtain a mAP close to 0.32, but I do not really understand why: if 100 detections are considered per picture, the number of false positives should be overwhelming. Can someone provide an explanation or point me to where to look? Thanks.

My objective is to use two versions of Faster R-CNN, one trained on RGB and one on depth, and to fuse them so that they complement each other.
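One possible explanation: the AP computation ranks all detections by confidence, and false positives near the bottom of the ranked list add no new recall points, so they barely move the AP. A rough sketch of the every-point-interpolated AP (here `voc_ap` is a simplified stand-in, assuming the `tp`/`fp` flags are already sorted by descending confidence):

```python
import numpy as np

def voc_ap(tp, fp, npos):
    """AP with every-point interpolation from 0/1 TP and FP flags,
    already sorted by descending detection confidence."""
    tp, fp = np.cumsum(tp), np.cumsum(fp)
    rec = tp / float(npos)
    prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
    # sentinel values, then take the monotone envelope of precision
    mrec = np.concatenate(([0.0], rec, [1.0]))
    mpre = np.concatenate(([0.0], prec, [0.0]))
    for i in range(mpre.size - 1, 0, -1):
        mpre[i - 1] = max(mpre[i - 1], mpre[i])
    # sum precision over the points where recall changes
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))

# e.g. two TPs followed by many low-confidence FPs: the trailing FPs
# never change recall, so the AP stays at 1.0 despite their number.
```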

ZHANGKEON commented 6 years ago

However, setting R['det'][jmax] = 1 will not change the value in class_recs[image_ids[d]]. Hence, for the next detection from the same image, doing R = class_recs[image_ids[d]] will give a new R which has R['det'][jmax] = 0. Is this understanding right?
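For reference, a quick check of Python's assignment semantics (using a toy dict, not the real `class_recs` structure) shows that `R` is a reference to the stored dict, not a copy, so mutations through `R` are visible on the next lookup:

```python
class_recs = {'img1': {'det': [0, 0]}}   # toy stand-in for the real structure
R = class_recs['img1']                   # R refers to the same dict object
R['det'][1] = 1                          # mimics R['det'][jmax] = 1
# the change is visible through the original container on the next lookup:
assert class_recs['img1']['det'] == [0, 1]
```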