YoungXIAO13 / FewShotDetection

(ECCV 2020) PyTorch implementation of paper "Few-Shot Object Detection and Viewpoint Estimation for Objects in the Wild"
http://imagine.enpc.fr/~xiaoy/FSDetView/
MIT License

Cannot replicate the result after pretraining on COCO #19

Open Skaldak opened 3 years ago

Skaldak commented 3 years ago

Thanks for your outstanding work and remarkable results! However, after several attempts with different random seeds, we cannot replicate the reported results (10-shot mAP 12.5 and 30-shot mAP 14.7) after pretraining on COCO ourselves. To our surprise, the results look fine when we use your COCO-pretrained model. We also noticed that you use only the 20,000+ base-only images in the pretraining phase, rather than all images with base and novel ground truths. Is this to keep the RPN from learning novel classes as background? Could you explain?

YoungXIAO13 commented 3 years ago

Hi @Skaldak

Thanks for asking! I'm actually curious about the performance gap obtained by your retraining.

Could you please check whether this line is `continue` or `break` in your case? In the first commit, we set it to `break` in order to filter out all images containing novel classes, which results in the 20K+ images used for base-class pre-training. Later, we realized that better performance can be achieved simply by using all images containing base classes while treating novel classes as background (see Issue #5 in TFA). Therefore, we changed it to `continue`.
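The difference between the two settings can be sketched as follows. This is an illustration only, not the repository's loader: the function name `filter_annotations`, the `mode` argument, and the dictionary layout are assumptions made for the example, and the actual loop in the code base is not shown in this thread.

```python
def filter_annotations(images, novel_classes, mode="continue"):
    """Select base-class training data from annotated images.

    images: dict mapping image id -> list of class names, one per box
            (hypothetical layout, chosen for this sketch).
    mode:   "break"    drops every image that contains a novel class;
            "continue" keeps the image and drops only its novel boxes,
                       so those regions end up treated as background.
    """
    if mode not in ("break", "continue"):
        raise ValueError("mode must be 'break' or 'continue'")

    kept = {}
    for image_id, labels in images.items():
        base_labels = []
        discard_image = False
        for label in labels:
            if label in novel_classes:
                if mode == "break":
                    discard_image = True
                    break      # stop scanning: the whole image is rejected
                continue       # skip this box only, keep scanning
            base_labels.append(label)
        # Images without any base-class box are useless for base training.
        if not discard_image and base_labels:
            kept[image_id] = base_labels
    return kept
```

With `mode="break"` the training set shrinks to images free of novel classes, which matches the 20K+ base-only images mentioned above; with `mode="continue"` every image that has at least one base-class box is kept.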

Also, have you compiled all the lib files correctly in the suggested environments and completed the whole training of 20 epochs?

Skaldak commented 3 years ago

Thanks for your explanation. We trained everything using your hyper-parameters, 20 epochs included. For compatibility reasons we ported the code to PyTorch 1.0, but we still use the lib files compiled with PyTorch 0.4. Everything worked fine, and fine-tuning from your pretrained model also produced good results (10-shot mAP 12.9 and 30-shot mAP 13.5 on COCO). The line you referred to is `break` in our testing version, yielding a 10-shot mAP of ~7.3. We also tried `continue`, which gave an mAP of ~10.8. I'm curious about your results in each setting. Thanks again for your quick reply!

YoungXIAO13 commented 3 years ago

The results in the paper were obtained with `break`; using `continue` usually gives us a ~2 point boost on COCO.

In my opinion, training with PyTorch 1.0 while using lib files compiled with PyTorch 0.4 could cause unpredictable issues. I'm not sure you would get the same results even though training runs with those compiled files (see some discussion here).

Another thing to check is the backbone weight initialization. In my case, I simply initialized it with the ImageNet-pretrained ResNet-101 weights, even though the actual backbone is ResNet-50 for COCO.
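When a checkpoint from one architecture is used to initialize another, only parameters that exist in both under the same name and with the same shape can be copied over. A minimal sketch of that filtering step is given below. It is not taken from this repository: `filter_compatible` and the `Param` stand-in are hypothetical names, and the function only assumes that each value exposes a `.shape` attribute, as tensors and arrays do.

```python
from collections import namedtuple

# Stand-in for a tensor in this sketch: only the shape matters here.
Param = namedtuple("Param", ["shape"])


def filter_compatible(checkpoint, model_state):
    """Split checkpoint entries into loadable and skipped ones.

    An entry is loadable when the model has a parameter with the same
    name and the same shape; everything else is reported as skipped.
    """
    kept, skipped = {}, []
    for name, value in checkpoint.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            kept[name] = value
        else:
            skipped.append(name)
    return kept, skipped


# Toy example: the checkpoint has one extra block and one mismatched layer.
checkpoint = {
    "conv1.weight": Param((64, 3, 7, 7)),
    "layer3.6.conv1.weight": Param((256, 1024, 1, 1)),
    "fc.weight": Param((1000, 2048)),
}
model_state = {
    "conv1.weight": Param((64, 3, 7, 7)),
    "fc.weight": Param((21, 2048)),
}
kept, skipped = filter_compatible(checkpoint, model_state)
```

With PyTorch, the same function could be applied to a loaded checkpoint and `model.state_dict()`, followed by `model.load_state_dict(kept, strict=False)`. Printing `skipped` is a quick way to confirm how much of the backbone was actually initialized from the pretrained weights.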

For reproducibility, we recently redid the base-class training using ResNet-101 as the backbone with batch_size=2 and epochs=9, which gives a further boost of ~2 points for both 10 shots and 30 shots. The pre-trained weights will also be released soon.

john2020-210 commented 3 years ago

Has anyone replicated the results on PASCAL VOC with PyTorch >= 1.0? I only get mAP = 10.8.

xiexijun commented 2 years ago

> Thanks for your explanation. We trained everything using your hyper-parameters, 20 epochs included. For compatibility reasons we ported the code to PyTorch 1.0, but we still use the lib files compiled with PyTorch 0.4. Everything worked fine, and fine-tuning from your pretrained model also produced good results (10-shot mAP 12.9 and 30-shot mAP 13.5 on COCO). The line you referred to is `break` in our testing version, yielding a 10-shot mAP of ~7.3. We also tried `continue`, which gave an mAP of ~10.8. I'm curious about your results in each setting. Thanks again for your quick reply!

May I use your code for reference? My device doesn't support CUDA 8.0. The results I get when reproducing with the author's code are very poor: the 10-shot accuracy is only 21.35%.