mrlooi / rotated_maskrcnn

Rotated Mask R-CNN: From Bounding Boxes to Rotated Bounding Boxes
MIT License
347 stars 62 forks source link

Bug: rotate_iou CUDA mode #9

Open WeihongM opened 4 years ago

WeihongM commented 4 years ago

🐛 Bug

from maskrcnn_benchmark.layers.rotate_nms import rotate_iou
a=torch.tensor([[50.7325,50.7325,100.7325,300.7325,-3.4]]).float().to('cuda')
b=torch.tensor([[50.7325,50.7325,100.7325,300.7325,-3.4]]).float().to('cuda')
rotate_iou(a,b)     # get the result 0.3333
a=torch.tensor([[50.7325,50.7325,100.7325,300.7325,-3.4]]).float()
b=torch.tensor([[50.7325,50.7325,100.7325,300.7325,-3.4]]).float()
rotate_iou(a,b)   # get the result 1.0

however, when I test on the cpu mode, It return the result, tensor=1.0 Quite surprised to debug the result, maybe the rotate_iou cuda version is wrong? Whats more, I test the rotated ops file (maskrcnn_benchmark/modeling/rotate_ops.py) the rotated version used in GPU mode is slower than CPU mode. It's because the test box array is small? @mrlooi , hope for your updates, Thanks

mrlooi commented 4 years ago

Hey, that's interesting. I remember I've tested the cuda version and it works fine, maybe I've missed something. Lately I don't have time to make adjustments to this repo, so if you do find a fix, pull requests are very welcome!

As for GPU being slower than CPU, yea it's likely because your array is small, so more time is spent in CPU-to-GPU-to-CPU memory allocation than computation

WeihongM commented 4 years ago

@mrlooi Hello, thanks for your reply. I find the problem is caused mainly by the float precision, maybe you will get inspirations from this point. I am not familiar to cuda code, hope you can renew this file. Maybe you can refer to the detectron2 project. reference link Hope you can update the files. Thanks very much.

mrlooi commented 4 years ago

Hey thanks for sharing the detectron2, it looks good. I remember fixing the code to handle those floating precision errors, and even so the results shouldn't be that different. Will have a look at it when I find the time. Thanks again for pointing it out

WeihongM commented 4 years ago

Hello, Thanks for your reply. As the cpu rotate_iou is real slow speed, @mrlooi Sorry, do you have time to renew this file these days, because I working with this project recently. Looking forward to your updates recently. Thanks.

mrlooi commented 4 years ago

I don't unfortunately, but you can still use the code as it is. I tested the code, and the results are different from cpu only if the numbers are exactly the same, just like in your example. Remove or change any one decimal number and it'll work

You can add a simple if condition to check if the numbers are the same, then return 1.0

WeihongM commented 4 years ago

@mrlooi Thanks for your reply. I have used rotated iou script in detectron2 project. successfully ported to your project.

excuse me, may I ask one more question.

looking forward to your reply. thanks

mrlooi commented 4 years ago

1) augmented gt_proposals - I did this to augment the shape of the rotated boxes. The motivation was to increase segmentation robustness to differences in rotated box width + height, since there is more variability in rotated boxes than bounding boxes 2) cos function - I found a very slight AP improvement using cos (+0.02 AP). But this may vary. Bottom line: I found somewhat negligible impact, and marginal improvement at best 3) Comments: those are just there for reference