Closed amobiny closed 3 years ago
@amobiny Hi, did you modify the code? Can you test it on a single machine first? Thank you!
Hi @hhaAndroid The same error pops up even when running on 1 GPU. I was able to find the problem and resolve it. The cause is that sometimes, when no object is detected in an image, the mask_results in /mmdet/apis/test (in both single_gpu_test and multi_gpu_test) becomes [[[], [], []]] (a list of length 1), in a 3-class problem, for example. However, it should be [[], [], []], which is a list of length 3! This causes the error in the encode_mask_results function in /mmdet/core/mask/utils.py. I modified the encode_mask_results code and it all works now.
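The exact change to encode_mask_results was not posted in the thread. As a rough sketch of the idea only, the extra nesting described above could be detected and removed before encoding. The helper name unwrap_mask_results and its num_classes argument are made up for illustration and are not part of mmdet:

```python
def unwrap_mask_results(cls_segms, num_classes):
    """Return per-class mask lists, dropping one spurious outer list if present.

    Hypothetical helper (not mmdet API). Assumes a well-formed result is a
    list with one entry per class, and that real masks are not Python lists
    (e.g. they are numpy arrays).
    """
    wrapped = (
        len(cls_segms) == 1
        and isinstance(cls_segms[0], list)
        and len(cls_segms[0]) == num_classes
        and all(isinstance(c, list) for c in cls_segms[0])
    )
    return cls_segms[0] if wrapped else cls_segms


# The malformed case from the comment above: 3 classes, no detections.
fixed = unwrap_mask_results([[[], [], []]], num_classes=3)
print(len(fixed))  # 3
```

A well-formed result such as [[], [], []] is returned unchanged, so a check like this could run on every image before the per-class loop that encodes masks.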
OK, Thank you, I will check it and fix.
Thanks for your error report and we appreciate it a lot.
Checklist
Describe the bug
I've successfully trained a Cascade-Mask RCNN model on my custom data. For testing, it works fine if I perform inference at only one scale, but it fails when performing the multi-scale test. The weird part is that it starts running properly and then breaks partway through, as you can see in the traceback below (not on the same image each time; the failing image changes randomly on every run, and the test never reaches the end).
@yhcao6 I found https://github.com/open-mmlab/mmdetection/issues/2308 and https://github.com/open-mmlab/mmdetection/pull/2349/files which are closely related to this issue, but I think the error I'm getting is a completely different issue and happens when trying to convert the bitmap mask into RLE code in /mmdet/core/mask/utils.py
Reproduction
I ran
and here is a copy of my full config.py file
I've made no major change to the default Cascade-Mask RCNN model.
Environment
Output of python mmdet/utils/collect_env.py:
fatal: Not a git repository (or any parent up to mount point /lhome1)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
sys.platform: linux
Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
CUDA available: True
GPU 0,1,2,3: Quadro RTX 8000
CUDA_HOME: /usr/local/cuda-10.1
NVCC: Cuda compilation tools, release 10.1, V10.1.243
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
PyTorch: 1.7.0
PyTorch compiling details: PyTorch built with:
TorchVision: 0.8.0
OpenCV: 4.5.1
MMCV: 1.3.3
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.1
MMDetection: 2.11.0+
Error traceback