Megvii-BaseDetection / OTA

Official implementation of our CVPR 2021 paper "OTA: Optimal Transport Assignment for Object Detection" in PyTorch.
Apache License 2.0

Training time for dynamic k estimation vs fixed k #3

Closed. QinghangHong1 closed this issue 3 years ago.

QinghangHong1 commented 3 years ago

Hi, thanks for your great work on OTA. I have tried to incorporate OTA into my model and found a huge difference in training time between the dynamic k estimation and fixed k settings. I am training on four Tesla P100 GPUs. Training takes about 2.5 days with fixed k, but more than 6 days with dynamic k estimation. What training times do you see for the two settings? Have you run into a similar issue? Thanks a lot!

Joker316701882 commented 3 years ago

@QinghangHong1 Hi, thank you for your interest in our work. In our experiments, there is no difference in training time between fixed k and dynamic k. I would suggest testing the training speed with dynamic k only (without OT). If it is OT that slows training down, you can use dynamic k without OT directly. That setup still performs better than most existing label assignment strategies and trains more efficiently than the full version of OTA.
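
For readers who want to try the "dynamic k only" setup suggested above, here is a minimal sketch, not the repository's actual code. It assumes dynamic k is estimated as described in the OTA paper: for each ground-truth box, sum the top-q IoU values over all anchors and truncate to an integer, with a minimum of 1. The function names `estimate_dynamic_k` and `assign_without_ot`, the default `q`, and the cost matrix are hypothetical and chosen only for illustration. Anchors matched to more than one ground truth are not resolved here.

```python
import torch


def estimate_dynamic_k(ious: torch.Tensor, q: int = 20) -> torch.Tensor:
    """Estimate k per ground truth from an IoU matrix of shape (num_gt, num_anchors).

    Assumption: k is the truncated sum of the top-q IoUs, clamped to at least 1.
    """
    q = min(q, ious.size(1))  # cannot take more candidates than anchors
    topq_ious, _ = torch.topk(ious, q, dim=1)  # stays on the device of `ious`
    return torch.clamp(topq_ious.sum(dim=1).int(), min=1)


def assign_without_ot(cost: torch.Tensor, dynamic_k: torch.Tensor) -> torch.Tensor:
    """Pick the k lowest-cost anchors for each ground truth (no optimal transport).

    Returns a boolean matrix of shape (num_gt, num_anchors).
    """
    matching = torch.zeros_like(cost, dtype=torch.bool)
    for gt_idx, k in enumerate(dynamic_k.tolist()):
        _, pos_idx = torch.topk(cost[gt_idx], k, largest=False)
        matching[gt_idx, pos_idx] = True
    return matching


# Toy example: 2 ground truths, 4 anchors.
ious = torch.tensor([[0.9, 0.8, 0.7, 0.1],
                     [0.2, 0.1, 0.0, 0.0]])
cost = 1.0 - ious  # hypothetical cost; a real cost would also include classification
dynamic_k = estimate_dynamic_k(ious, q=3)
matching = assign_without_ot(cost, dynamic_k)
```

Timing this step alone, separately from the OT solver, is one way to tell which part is responsible for a slowdown.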

QinghangHong1 commented 3 years ago

Thanks a lot for your reply. You are absolutely right! Training was slow because I was running the topk operation on the CPU instead of the GPU.
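
A minimal sketch of the pitfall described above, assuming the IoU matrix already lives on the GPU during training. The tensor shape and the value of k are made up for illustration. Moving the tensor to the host before `torch.topk` forces a device-to-host copy and a synchronization on every iteration; calling `torch.topk` directly on the device tensor avoids both.

```python
import torch

# Fall back to the CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
ious = torch.rand(8, 1000, device=device)  # hypothetical (num_gt, num_anchors) IoUs

# Slow pattern: round trip through host memory on every training step.
topk_cpu, _ = torch.topk(ious.cpu(), 10, dim=1)
topk_cpu = topk_cpu.to(device)

# Preferred pattern: run topk where the data already lives.
topk_dev, _ = torch.topk(ious, 10, dim=1)

# Both paths select the same values; only the location of the work differs.
same_values = torch.equal(topk_cpu, topk_dev)
```

Checking `tensor.device` on the inputs to the assignment step is a quick way to spot an accidental `.cpu()` or a tensor that was created on the host.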