GOATmessi8 / RFBNet

Receptive Field Block Net for Accurate and Fast Object Detection, ECCV 2018
MIT License

inds = torch.nonzero(scores[:,j]>0.01).view(-1) is super time consuming. #10

foreverYoungGitHub closed this issue 6 years ago

foreverYoungGitHub commented 6 years ago

Hi, when I tested the model, I found that although the other parts are pretty fast, the NMS time cost is pretty high.

So I timed the code step by step and found that inds = torch.nonzero(scores[:,j]>0.01).view(-1) is very time consuming: it takes nearly 50 ms per iteration on a K40c.

Does anyone have any ideas about that?
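One thing worth ruling out when timing a single line like this: CUDA kernels are launched asynchronously, and the PyTorch documentation notes that torch.nonzero causes a host-device synchronization on CUDA inputs. The 50 ms may therefore include GPU work queued earlier, such as the network forward pass, rather than nonzero alone. Below is a minimal sketch of a timing that separates the two. The `timed` helper and the shape of `scores` are assumptions made for illustration, not code from the repository.

```python
import time
import torch

def timed(fn, device):
    """Hypothetical helper: run fn() and return (result, elapsed seconds)."""
    if device.type == "cuda":
        # Drain kernels queued earlier so they are not billed to fn.
        torch.cuda.synchronize(device)
    start = time.perf_counter()
    out = fn()
    if device.type == "cuda":
        # Wait for fn's own kernels before stopping the clock.
        torch.cuda.synchronize(device)
    return out, time.perf_counter() - start

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Stand-in for the detector output; the (10000, 21) shape is arbitrary.
scores = torch.rand(10000, 21, device=device)
j = 1
inds, elapsed = timed(lambda: torch.nonzero(scores[:, j] > 0.01).view(-1), device)
print(f"nonzero for class {j}: {elapsed * 1e3:.3f} ms, {inds.numel()} boxes kept")
```

If the number measured this way is much smaller than 50 ms, the cost probably belongs to the earlier GPU work that nonzero was waiting on.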

foreverYoungGitHub commented 6 years ago

It seems that torch.nonzero itself is what causes the slowdown.
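If the per-class call really is the bottleneck, one option to try is calling torch.nonzero once on the whole score matrix and splitting the result by class on the CPU, so the device is synchronized once instead of once per class. The sketch below is untested on a K40c and is not taken from the repository. The function name `indices_per_class`, the assumption that class 0 is background and skipped, and the use of `as_tuple=True` (which needs a reasonably recent PyTorch) are all choices made for illustration.

```python
import torch

def indices_per_class(scores, thresh=0.01):
    """Hypothetical helper: map class j to the row indices with scores[:, j] > thresh."""
    # One nonzero over the (num_boxes, num_classes) mask instead of one per class.
    rows, cols = torch.nonzero(scores > thresh, as_tuple=True)
    # Move the index lists to the CPU once; the per-class split below runs on the host.
    rows, cols = rows.cpu(), cols.cpu()
    # Assumption: class 0 is background, so the loop starts at 1.
    return {j: rows[cols == j] for j in range(1, scores.size(1))}

scores = torch.rand(10000, 21)  # arbitrary stand-in for the detector output
per_class = indices_per_class(scores)
inds = per_class[1]  # same role as torch.nonzero(scores[:, 1] > 0.01).view(-1)
```

Because torch.nonzero returns indices in row-major order, each per-class list comes out in ascending row order, the same order as the original per-class call. Whether this is actually faster depends on how many boxes pass the threshold, so it is worth measuring before adopting it.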