open-mmlab / mmrotate

OpenMMLab Rotated Object Detection Toolbox and Benchmark
https://mmrotate.readthedocs.io/en/latest/
Apache License 2.0

oriented_rcnn_r50_fpn_1x_dota_le90.py: inference with this model works on GPU 0 and on CPU, but fails on GPU 1 or any other GPU; other models are unaffected #961

Open sdnusqy-art opened 10 months ago

sdnusqy-art commented 10 months ago

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

Python: 3.8.17 (default, Jul 5 2023, 21:04:15) [GCC 11.2.0]
CUDA available: True
GPU 0,1,2,3,4,5,6,7: NVIDIA GeForce RTX 2080 Ti
CUDA_HOME: /usr/local/cuda-11.3
NVCC: Cuda compilation tools, release 11.3, V11.3.109
GCC: gcc (Ubuntu 8.4.0-3ubuntu2) 8.4.0
PyTorch: 1.12.1
PyTorch compiling details: PyTorch built with:

TorchVision: 0.13.1
OpenCV: 4.8.0
MMCV: 1.7.0
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: 11.3
MMRotate: 0.3.4+

Reproduces the problem - code sample

python demo/image_demo.py demo/dota_demo.jpg configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py oriented_rcnn_r50_fpn_1x_dota_le90-6d2b2ce0.pth --output-file ./output/1.jpg --device 'cuda:1'

With cuda:0 or cpu everything works. With cuda:1, cuda:2, and so on, inference produces no detection results.

Reproduces the problem - command or script

python demo/image_demo.py demo/dota_demo.jpg configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py oriented_rcnn_r50_fpn_1x_dota_le90-6d2b2ce0.pth --output-file ./output/1.jpg --device 'cuda:1'

With cuda:0 or cpu everything works. With cuda:1, cuda:2, and so on, inference produces no detection results.
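Since the failure seems tied to a non-zero device index, one thing that may be worth trying is to expose only the target GPU through `CUDA_VISIBLE_DEVICES`, so that inside the process it appears as `cuda:0`. The sketch below is an untested workaround idea, not a confirmed fix for this issue. The helper names `build_env`, `build_command`, and `run_demo_on_gpu` are hypothetical; the script path, config, and checkpoint are copied from the command above.

```python
import os
import subprocess
import sys


def build_env(physical_gpu, base_env=None):
    """Return a copy of the environment that exposes only one physical GPU."""
    env = dict(os.environ if base_env is None else base_env)
    # With a single visible GPU, that GPU is addressed as cuda:0 in the process.
    env["CUDA_VISIBLE_DEVICES"] = str(physical_gpu)
    return env


def build_command(output_file):
    """Rebuild the demo command from the report, but target cuda:0."""
    return [
        sys.executable,
        "demo/image_demo.py",
        "demo/dota_demo.jpg",
        "configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py",
        "oriented_rcnn_r50_fpn_1x_dota_le90-6d2b2ce0.pth",
        "--output-file", output_file,
        "--device", "cuda:0",
    ]


def run_demo_on_gpu(physical_gpu, output_file):
    """Run the demo on the given physical GPU, remapped to cuda:0."""
    return subprocess.run(
        build_command(output_file),
        env=build_env(physical_gpu),
        check=True,  # raise CalledProcessError if the demo exits non-zero
    )
```

Calling `run_demo_on_gpu(1, "./output/1.jpg")` from the mmrotate repository root would run the same demo on physical GPU 1. If detections appear this way, that would suggest the problem is related to the device index rather than to the hardware itself.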

Reproduces the problem - error message

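When running inference on cuda:1, I traced the problem to mmcv/ops/roi_align_rotated.py:64, where ext_module.roi_align_rotated_forward returns an incorrect output even though its input arguments are identical to those on cuda:0. This module calls into python3.8/site-packages/mmcv/_ext.cpython-38-x86_64-linux-gnu.so, which I cannot step into.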
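To make the comparison described above reproducible outside the demo script, a small helper along the following lines could run one operator on identical inputs on several devices and report how far each result is from the first. This is only a sketch: `compare_across_devices` is a hypothetical name, it uses plain PyTorch calls only, and it assumes the operator takes tensors as positional arguments and returns a single tensor.

```python
def compare_across_devices(op, inputs, devices=("cuda:0", "cuda:1")):
    """Run `op` on the same inputs on each device.

    Returns a dict mapping every device after the first to the maximum
    absolute difference between its output and the first device's output.
    """
    import torch  # imported here so the helper can be defined without PyTorch

    outputs = []
    for name in devices:
        device = torch.device(name)
        moved = [t.to(device) for t in inputs]  # copy the same inputs over
        with torch.no_grad():
            out = op(*moved)
        if device.type == "cuda":
            # Wait for the kernel to finish before reading the result back.
            torch.cuda.synchronize(device)
        outputs.append(out.detach().cpu())

    reference = outputs[0]
    return {
        name: (out - reference).abs().max().item()
        for name, out in zip(devices[1:], outputs[1:])
    }
```

Passing the rotated RoI align operator together with the feature map and RoIs captured at the call site should give a value near zero for a healthy device. A large difference on cuda:1 would confirm that the compiled kernel itself, and not the Python code around it, is producing the wrong output.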

Additional information

No response

lzJune commented 6 months ago

Hi, I ran into the same problem. Has it been solved? Does it affect normal training afterwards?