Media-Smart / vedadet

A single stage object detection toolbox based on PyTorch
Apache License 2.0
498 stars, 128 forks

The implementation of Box Voting #13

Closed. zehuichen123 closed this issue 3 years ago

zehuichen123 commented 3 years ago

Hi, it seems that box voting is not implemented in this repo. Could you give some details on how you perform box voting over at most 10,000 (2 (flip) * 4 (500, 800, 1100, 1400, 1700) * 4 (shift)) boxes? It seems that batched NMS (lb-nms in this codebase) is not suited to box voting. Thanks!

hxcai commented 3 years ago

@zehuichen123 Box voting is a substitute for NMS, and we follow the official paper to implement it. You can refer to PyramidBox or RetinaFace for details.
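For readers unfamiliar with the idea, here is a minimal pure-Python sketch of greedy box voting. It is not the maintainers' code: the function names, the `iou_thr=0.4` default, and the choice to keep the highest score of each cluster are assumptions, loosely modelled on the voting pattern commonly seen in face detection repos. Scores are assumed to be positive.

```python
def iou(a, b):
    """IoU of two boxes whose first four fields are (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def box_vote(dets, iou_thr=0.4):
    """Greedy box voting over dets = [(x1, y1, x2, y2, score), ...].

    Hypothetical helper: iou_thr and the max-score rule are assumptions.
    """
    remaining = sorted(dets, key=lambda d: d[4], reverse=True)
    voted = []
    while remaining:
        top, rest = remaining[0], remaining[1:]
        # Only one row of IoUs (top vs. the rest) is needed per step.
        overlaps = [iou(top, d) for d in rest]
        cluster = [top] + [d for d, o in zip(rest, overlaps) if o >= iou_thr]
        remaining = [d for d, o in zip(rest, overlaps) if o < iou_thr]
        # Merge the cluster into one box: score-weighted average of coordinates.
        total = sum(d[4] for d in cluster)
        merged = tuple(sum(d[i] * d[4] for d in cluster) / total for i in range(4))
        # Assumption: the merged box keeps the highest score in its cluster.
        voted.append(merged + (top[4],))
    return voted
```

Unlike NMS, which discards the overlapping boxes, this merges them into the surviving box, so detections from different test-time views all contribute to the final coordinates.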

zehuichen123 commented 3 years ago

Thanks for your response! Do you perform NMS on all scales together, or on each scale separately and then NMS again? If you do it together, how can you compute such a large IoU matrix with so many candidate boxes? I notice that you run NMS on every 10,000 boxes and then concatenate the results to avoid a core dump. Do you still use this strategy for box voting?

hxcai commented 3 years ago

@zehuichen123 We perform NMS or box voting for all scales together. The memory problem is caused by computing IoU over all bboxes at once. You can refer to PyramidBox or RetinaFace for implementation details.
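One way to picture the memory point: the full pairwise IoU matrix for N boxes has N * N entries, but it never has to exist all at once. The sketch below is an illustration only, not code from this repo; `iou_blocks` and `chunk_size` are hypothetical names. It yields the matrix a few rows at a time, so peak memory is `chunk_size * N` entries instead of `N * N`.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def iou_blocks(boxes, chunk_size=2):
    """Yield (start, rows), where rows[i][j] = IoU(boxes[start + i], boxes[j]).

    Hypothetical helper: only chunk_size rows are held in memory at a time.
    """
    for start in range(0, len(boxes), chunk_size):
        chunk = boxes[start:start + chunk_size]
        yield start, [[iou(a, b) for b in boxes] for a in chunk]


# Usage: count overlapping pairs without building the full matrix.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
pairs = 0
for start, rows in iou_blocks(boxes, chunk_size=2):
    for i, row in enumerate(rows):
        # Count each unordered pair once (column index above the row index).
        pairs += sum(1 for j, v in enumerate(row) if j > start + i and v >= 0.4)
```

A greedy procedure that compares only the current top-scoring box against the remaining boxes needs even less: a single row per step.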

zehuichen123 commented 3 years ago

I see! I'll give it a try.

zehuichen123 commented 3 years ago

Hi, sorry for bothering you again. I reimplemented TTA yesterday but only reached about 93.2 on hard (without shift). Here are some questions. The title of this issue may be misleading, since I am not asking about the box voting algorithm itself but about how you perform box voting over millions of boxes. Do you still run NMS on every 10_0000 boxes and then concatenate the results to avoid OOM? In my experiments, this may lead to a drop on the hard set. Another question is about shift and resize: which one do you perform first?

hxcai commented 3 years ago

@zehuichen123 If you use box voting, then do not use NMS. And we perform resize first and then shift.
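The order matters when mapping detections back to the original image: the transforms have to be undone in reverse. Here is a minimal sketch of the coordinate bookkeeping. Only the resize-then-shift order comes from this thread; everything else is an assumption: `scale` is a resize ratio, `shift` is a pixel translation `(dx, dy)` applied after resizing, the horizontal flip is applied last, and the function names are hypothetical.

```python
def to_view(box, scale, shift, flip, orig_width):
    """Map (x1, y1, x2, y2) from the original image into one TTA view.

    Order: resize, then shift (per the thread), then flip (an assumption).
    """
    x1, y1, x2, y2 = box
    dx, dy = shift
    x1, y1, x2, y2 = x1 * scale + dx, y1 * scale + dy, x2 * scale + dx, y2 * scale + dy
    if flip:
        w = orig_width * scale  # width of the resized view
        x1, x2 = w - x2, w - x1
    return (x1, y1, x2, y2)


def to_original(box, scale, shift, flip, orig_width):
    """Inverse of to_view: undo flip, then shift, then resize."""
    x1, y1, x2, y2 = box
    dx, dy = shift
    if flip:
        w = orig_width * scale
        x1, x2 = w - x2, w - x1
    return ((x1 - dx) / scale, (y1 - dy) / scale,
            (x2 - dx) / scale, (y2 - dy) / scale)
```

Detections from every view can be passed through `to_original` and pooled before running box voting once over the combined set.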

zehuichen123 commented 3 years ago

Yes, NMS here just refers to box voting. But how do you compute box voting over so many boxes...

hxcai commented 3 years ago

@zehuichen123 TTA does take a lot of time.

noranartv commented 3 years ago

@hxcai Could you consider releasing TTA code?

hxcai commented 3 years ago

@noranartv We currently have no plans to release the TTA code, because it is easy to implement and it really takes time to run.