kuangliu / pytorch-retinanet

RetinaNet in PyTorch

Investigate why the AP is not good on the VOC dataset #63

Open xytpai opened 5 years ago

xytpai commented 5 years ago

I've heard many people achieved 60+ mAP using only VOC0712 trainval for training, and I got a similar result in my experiment. The mAP during training is shown in the figure below. With data augmentation (random resized crop) and a 641x641 input, I reached 68 mAP.

[figure: mAP curve during training]

I think the main reason is that the labeling of the VOC dataset is not accurate enough. Since RetinaNet treats every anchor that is not near a target box as background, labeling accuracy is important. But in VOC we found that some objects are not labeled at all.

[figure: example VOC image with unlabeled objects]
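The point about unlabeled objects can be illustrated with a minimal sketch of IoU-based anchor labeling. The `assign_anchor_labels` helper and the 0.5/0.4 thresholds below follow the RetinaNet paper's defaults, not necessarily this repo's exact code: an anchor sitting on an unlabeled object has low IoU with every *labeled* box, so it is wrongly assigned to background.

```python
import torch

def assign_anchor_labels(ious, pos_thresh=0.5, neg_thresh=0.4):
    """Label each anchor from its best IoU with any ground-truth box.

    Returns 1 for foreground (IoU >= pos_thresh), 0 for background
    (IoU < neg_thresh), and -1 for ignored anchors in between.
    ious: [num_anchors, num_gt_boxes] tensor of overlaps.
    """
    best_iou, _ = ious.max(dim=1)           # best overlap per anchor
    labels = torch.full_like(best_iou, -1)  # default: ignore
    labels[best_iou < neg_thresh] = 0       # background
    labels[best_iou >= pos_thresh] = 1      # foreground
    return labels
```

An anchor covering an unlabeled person still gets `best_iou` near 0 against the annotated boxes, so it lands in the background class and the focal loss actively pushes its score down.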

sunshine-zkf commented 5 years ago

I think the problem is not the VOC dataset but the code. I tested the trained model, and it produced detections on pixels that are not near the target box.

xytpai commented 5 years ago

> I think the problem is not the VOC dataset but the code. I tested the trained model, and it produced detections on pixels that are not near the target box.

Thanks for the reply. I examined the code carefully, found several differences from the official code, and finally reached 76 mAP, so it has nothing to do with VOC. The most critical issue is that the author set the score threshold to 0.5, while it should actually be 0.05.
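Why this one constant matters so much for mAP can be shown with a small sketch (the `filter_detections` helper is hypothetical, not this repo's actual decode code): mAP integrates precision over the full recall range, so low-confidence boxes must be kept for evaluation. Filtering at 0.5 discards them and truncates the precision-recall curve, which caps the achievable AP.

```python
import torch

def filter_detections(scores, boxes, score_thresh=0.05):
    """Keep detections whose class score exceeds score_thresh.

    A low threshold (0.05) retains many low-confidence boxes that the
    mAP computation needs for the tail of the PR curve; 0.5 drops them.
    """
    keep = scores > score_thresh
    return scores[keep], boxes[keep]
```

With scores `[0.9, 0.3, 0.04]`, a 0.05 threshold keeps two detections while 0.5 keeps only one; the discarded 0.3 box may still be a true positive that would have raised recall.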

07hyx06 commented 4 years ago

Hi! What's your final loss on the training dataset? Should I set CLS_THRESH=0.5 and LOC-THRESH=0.05? I get about 0.02 train loss, but lots of bounding boxes appear in the test image :(
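The "lots of boxes" symptom is usually a post-processing issue rather than a training one: the 0.05 threshold is meant for mAP evaluation, while visualization normally uses a higher score cutoff followed by NMS. A pure-PyTorch sketch (the greedy `nms` here is illustrative, not this repo's implementation):

```python
import torch

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression on [x1, y1, x2, y2] boxes."""
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        i = order[0].item()
        keep.append(i)
        if order.numel() == 1:
            break
        # IoU of the top-scoring box with the remaining boxes
        top, rest = boxes[i], boxes[order[1:]]
        lt = torch.max(top[:2], rest[:, :2])
        rb = torch.min(top[2:], rest[:, 2:])
        wh = (rb - lt).clamp(min=0)
        inter = wh[:, 0] * wh[:, 1]
        area_top = (top[2] - top[0]) * (top[3] - top[1])
        area_rest = (rest[:, 2] - rest[:, 0]) * (rest[:, 3] - rest[:, 1])
        iou = inter / (area_top + area_rest - inter)
        order = order[1:][iou <= iou_thresh]  # drop heavy overlaps
    return torch.tensor(keep)

def postprocess_for_display(boxes, scores, vis_thresh=0.5, iou_thresh=0.5):
    """Higher score cutoff for display, then NMS on the survivors."""
    keep = scores > vis_thresh
    boxes, scores = boxes[keep], scores[keep]
    idx = nms(boxes, scores, iou_thresh)
    return boxes[idx], scores[idx]
```

So a low training loss with cluttered test images does not necessarily mean the model is wrong; raising the visualization threshold and applying NMS typically cleans up the output.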

RenzhiDaDa commented 4 years ago

I agree with @xytpai. When I checked the code, I found that encode.py differs from the paper: the paper sets CLS_THRESH = 0.05, but this code uses 0.5.