Closed CHUNYUWANG closed 7 years ago
@CHUNYUWANG Hi, the detector in the CVPR paper is a single model that was not fine-tuned on the VID dataset (it was fine-tuned on the DET dataset, though), while the results of this work come from an ensemble of several detectors fine-tuned on the VID dataset. Hope this answers your question.
@myfavouritekk Could you please share which detectors you used? Without the model information needed to compute the scores of 300 boxes for 30 classes, it is hard to reproduce your result. I just used one detector (Faster R-CNN, for example), then multiplied by 5 to compare with your 5 detectors. The result is poor.
Hi Kang,
I just noticed that there is a big gap between the mAP from your code and the mAP in the CVPR paper (70%-ish vs. 40%-ish). Do you know what causes the difference?