Open WeihongM opened 4 years ago
Hey, that's interesting. I remember I've tested the cuda version and it works fine, maybe I've missed something. Lately I don't have time to make adjustments to this repo, so if you do find a fix, pull requests are very welcome!
As for GPU being slower than CPU, yea it's likely because your array is small, so more time is spent in CPU-to-GPU-to-CPU memory allocation than computation
@mrlooi Hello, thanks for your reply. I find the problem is caused mainly by the float precision, maybe you will get inspirations from this point. I am not familiar to cuda code, hope you can renew this file. Maybe you can refer to the detectron2 project. reference link Hope you can update the files. Thanks very much.
Hey thanks for sharing the detectron2, it looks good. I remember fixing the code to handle those floating precision errors, and even so the results shouldn't be that different. Will have a look at it when I find the time. Thanks again for pointing it out
Hello, Thanks for your reply. As the cpu rotate_iou is real slow speed, @mrlooi Sorry, do you have time to renew this file these days, because I working with this project recently. Looking forward to your updates recently. Thanks.
I don't unfortunately, but you can still use the code as it is. I tested the code, and the results are different from cpu only if the numbers are exactly the same, just like in your example. Remove or change any one decimal number and it'll work
You can add a simple if condition to check if the numbers are the same, then return 1.0
@mrlooi Thanks for your reply. I have used rotated iou script in detectron2 project. successfully ported to your project.
excuse me, may I ask one more question.
I have read your code, and it seems you have do some tricks in this project. such as dir maskrcnn_benchmark/rrpn/inference.py function add_gt_proposals, you defind aug_rrect, rather than directly use gt_rrect, do you find this tricks matters?
Also, the way you compute angle loss is cos function, rather than smooth loss, which seems different from RRPN paper. Do you find this also matters?
Also, in the file maskrcnn_benchmark/modeling/rrpn/loss.py, the call function has the comments old and new. Do you forget to uncomment this, or maybe these comments is redundant.
looking forward to your reply. thanks
1) augmented gt_proposals - I did this to augment the shape of the rotated boxes. The motivation was to increase segmentation robustness to differences in rotated box width + height, since there is more variability in rotated boxes than bounding boxes 2) cos function - I found a very slight AP improvement using cos (+0.02 AP). But this may vary. Bottom line: I found somewhat negligible impact, and marginal improvement at best 3) Comments: those are just there for reference
🐛 Bug
however, when I test on the cpu mode, It return the result, tensor=1.0 Quite surprised to debug the result, maybe the rotate_iou cuda version is wrong? Whats more, I test the rotated ops file (maskrcnn_benchmark/modeling/rotate_ops.py) the rotated version used in GPU mode is slower than CPU mode. It's because the test box array is small? @mrlooi , hope for your updates, Thanks