ShirleySyh closed 1 week ago
Hello. First, check whether your CUDA driver and PyTorch versions are compatible. Then print the input tensors of box_iou_rotated to verify that the input bounding boxes are in the correct format. Finally, if both checks look normal, I believe it might be a hardware issue.
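Printing the inputs is easier to act on with a small sanity check next to it. The sketch below is only an illustration: `check_rotated_boxes` is a hypothetical helper, not part of PointOBB or mmcv, and it assumes the (x_center, y_center, width, height, angle) layout with shape (N, 5) that mmcv documents for box_iou_rotated. Whether an empty or malformed input is what triggers the error in this thread is not confirmed.

```python
import math


def check_rotated_boxes(boxes, name="boxes"):
    """Return a list of human-readable problems found in `boxes`.

    Assumed layout: (x_center, y_center, width, height, angle), shape (N, 5).
    Accepts a nested list or any object with .tolist(), for example a
    torch.Tensor that has been moved to the CPU.
    """
    rows = boxes.tolist() if hasattr(boxes, "tolist") else list(boxes)
    problems = []
    if len(rows) == 0:
        problems.append(f"{name}: empty input (0 boxes)")
        return problems
    if not isinstance(rows[0], (list, tuple)):
        problems.append(f"{name}: expected a 2-D input of shape (N, 5)")
        return problems
    for i, row in enumerate(rows):
        if len(row) != 5:
            problems.append(f"{name}[{i}]: expected 5 values, got {len(row)}")
        elif not all(math.isfinite(v) for v in row):
            problems.append(f"{name}[{i}]: contains NaN or Inf")
        elif row[2] <= 0 or row[3] <= 0:
            problems.append(f"{name}[{i}]: non-positive width or height")
    return problems
```

One possible use is to call `check_rotated_boxes(bboxes1.detach().cpu(), "bboxes1")` right before the box_iou_rotated call and print the returned list; `bboxes1` here stands for whatever tensor your code passes in.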
Thank you for your reply! I changed my conda environment to torch 1.13.0 with CUDA 11.7, and I haven't seen the "invalid configuration argument" error since. On my machine, every Torch version lower than 1.13.0 runs into this problem. With that solved, I now see a memory leak during training. I use one RTX 3090 with 24 GB and set the batch size to 2. The allocated memory keeps increasing during training until a CUDA out-of-memory error occurs. I am still trying to locate the code that causes the leak. It would be a great help if you could give some suggestions. Thank you!
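One way to narrow down a leak like this is to log the allocated GPU memory once per iteration and see where it starts climbing. The following is a minimal sketch under stated assumptions: `MemoryTrend` and `cuda_allocated_mib` are hypothetical helper names, and only `torch.cuda.memory_allocated()` is an actual PyTorch call.

```python
class MemoryTrend:
    """Record allocated-memory readings and flag sustained growth."""

    def __init__(self, window=5):
        self.window = window
        self.readings = []

    def record(self, mib):
        self.readings.append(float(mib))

    def is_growing(self):
        """True if each of the last `window` readings exceeded the one before it."""
        recent = self.readings[-(self.window + 1):]
        if len(recent) < self.window + 1:
            return False  # not enough data yet
        return all(b > a for a, b in zip(recent, recent[1:]))


def cuda_allocated_mib():
    """Memory currently held by tensors on the current GPU, in MiB."""
    import torch  # imported lazily so MemoryTrend can be used without PyTorch
    return torch.cuda.memory_allocated() / 2**20
```

In a training loop you could call `trend.record(cuda_allocated_mib())` after each optimizer step and print a warning when `trend.is_growing()` returns True. A common cause of steadily growing memory in PyTorch is keeping references to tensors that still carry their autograd graph, for example accumulating `loss` instead of `loss.item()`; whether that applies here is not known from the thread.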
Hi, you can refer to https://github.com/Luo-Z13/pointobb/blob/main/environment.yml to check your environment.
Hi, thank you for your reply. I finally solved the problem. It turns out that the CUDA path in my system environment was linked to a local CUDA installation (version 11.3) that I had downloaded earlier, which was incompatible with my installed PyTorch. After switching to the correct CUDA version, the code runs successfully. Thanks again for your excellent work!
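A mismatch like the one described here can be spotted by comparing the CUDA version PyTorch was built with against the toolkit that the environment points to. The sketch below assumes `nvcc` lives under `CUDA_HOME/bin`; `major_minor`, `cuda_versions_match`, and `report` are hypothetical helper names, while `torch.version.cuda` and `torch.utils.cpp_extension.CUDA_HOME` are existing PyTorch attributes.

```python
import os
import re


def major_minor(version):
    """Extract (major, minor) from a version string such as '11.7' or '11.3.109'."""
    match = re.search(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"no version number in {version!r}")
    return int(match.group(1)), int(match.group(2))


def cuda_versions_match(torch_cuda, toolkit_cuda):
    """Compare only major.minor, ignoring patch-level differences."""
    return major_minor(torch_cuda) == major_minor(toolkit_cuda)


def report():
    """Print the CUDA version PyTorch was built with and the local toolkit version."""
    import subprocess
    import torch
    from torch.utils.cpp_extension import CUDA_HOME

    print("PyTorch built with CUDA:", torch.version.cuda)
    print("CUDA_HOME seen by PyTorch:", CUDA_HOME)
    if torch.version.cuda is None or CUDA_HOME is None:
        return  # CPU-only build, or no toolkit found
    nvcc = os.path.join(CUDA_HOME, "bin", "nvcc")  # assumed toolkit layout
    out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
    release = re.search(r"release\s+(\d+\.\d+)", out)
    if release is not None:
        print("Toolkit release:", release.group(1))
        print("Versions match:", cuda_versions_match(torch.version.cuda, release.group(1)))
```

Calling `report()` inside the training environment prints both versions side by side. With the setup described in this thread (PyTorch built for CUDA 11.7, system path pointing at 11.3), `cuda_versions_match("11.7", "11.3")` is False.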
Hi, thank you for your excellent work! I ran into an error while running your code. When I train on the DOTA-v1.0 dataset with the config file "configs2/pointobb/pointobb_r50_fpn_2x_dota10.py", "RuntimeError: CUDA error: invalid configuration argument" always occurs during the first epoch of training. I think the problem is related to line 352 of PointOBB/mmdet/models/detectors/utils.py, that is, the box_iou_rotated function in ext_module, but I don't know if I'm right. Have you ever run into this error yourself? The details of the error and my virtual environment are attached below. Looking forward to your reply. Thank you! pointobb_out.txt