myfavouritekk / T-CNN

ImageNet 2015 Object Detection from Video (VID)
MIT License

mAP higher than the number in your CVPR paper #13

CHUNYUWANG closed this issue 7 years ago

CHUNYUWANG commented 7 years ago

Hi Kang,

I just noticed a big gap between the mAP reported for your code and the one in your CVPR paper (70%-ish vs. 40%-ish). Do you know what causes the difference?

myfavouritekk commented 7 years ago

@CHUNYUWANG Hi, the detector in the CVPR paper is a single model that had not been fine-tuned on the VID dataset (it was fine-tuned on the DET dataset, though), while the results of this work come from an ensemble of several detectors fine-tuned on the VID dataset. Hope this answers your question.

TerryLovesLife commented 7 years ago

@myfavouritekk Could you please share which detectors you used? It will be hard to reproduce your results without the model information needed to compute the scores of 300 boxes for 30 classes. I used just one detector (Faster R-CNN, for example) and repeated it 5 times to stand in for your 5 detectors. The result is poor.
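
A minimal sketch of why repeating one detector five times may not help. It assumes the ensemble averages per-box class scores over the same 300 boxes; the thread does not say how the detectors are actually combined, so this is an assumption. `average_scores` and `random_scores` are hypothetical helpers, and the random matrices only stand in for real detector outputs.

```python
import random

NUM_BOXES, NUM_CLASSES = 300, 30  # sizes mentioned in the thread


def average_scores(score_sets):
    """Element-wise mean over a list of [boxes][classes] score matrices."""
    n = len(score_sets)
    num_boxes = len(score_sets[0])
    num_classes = len(score_sets[0][0])
    return [
        [sum(s[b][c] for s in score_sets) / n for c in range(num_classes)]
        for b in range(num_boxes)
    ]


def random_scores(rng):
    # Hypothetical stand-in for one detector's per-box class scores.
    return [[rng.random() for _ in range(NUM_CLASSES)] for _ in range(NUM_BOXES)]


rng = random.Random(0)
single = random_scores(rng)

# Five copies of the same detector: the average is just the detector itself
# (up to floating-point rounding), so no ensemble gain is possible.
duplicated = average_scores([single] * 5)
max_diff = max(
    abs(duplicated[b][c] - single[b][c])
    for b in range(NUM_BOXES)
    for c in range(NUM_CLASSES)
)

# Five different score matrices (assumed to be distinct models): the average
# is no longer equal to any single detector's scores.
distinct = average_scores([random_scores(rng) for _ in range(5)])
```

Under this averaging assumption, `max_diff` is zero up to rounding, so the duplicated "ensemble" ranks boxes the same way the single detector does. Any gain would have to come from detectors that actually differ.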