Helmholtz-AI-Energy / TBBRDet

Thermal Bridges on Building Rooftops Detection (TBBRDet)
BSD 3-Clause "New" or "Revised" License

Error when testing ` assert channels in [1, 3] `? #3

Open aliwaqas333 opened 1 year ago

aliwaqas333 commented 1 year ago

The root cause seems to be that images in the TBBR dataset have 5 channels, rather than the 3 of an RGB image. How did the original authors make this work?

load checkpoint from local path: /app/work_dirs/mask_rcnn_swin-t-p4-w7_fpn_ms-crop-3x_coco/epoch_23.pth
[                                                  ] 0/203, elapsed: 0s, ETA:Traceback (most recent call last):
  File "/app/scripts/mmdet/test.py", line 261, in <module>
    main()
  File "/app/scripts/mmdet/test.py", line 227, in main
    args.show_score_thr)
  File "/app/mmdetection/mmdet/apis/test.py", line 38, in single_gpu_test
    imgs = tensor2imgs(img_tensor, **img_metas[0]['img_norm_cfg'])
  File "/usr/local/lib/python3.7/dist-packages/mmcv/image/misc.py", line 36, in tensor2imgs
    assert channels in [1, 3]
AssertionError

test command: python /app/scripts/mmdet/test.py /app/configs/mmdet/swin/mask_rcnn_swin-t-p4-w7_fpn_ms-crop-3x_coco.py /app/work_dirs/mask_rcnn_swin-t-p4-w7_fpn_ms-crop-3x_coco/epoch_23.pth --work-dir /app/work_dirs/mask_rcnn_swin-t-p4-w7_fpn_ms-crop-3x_coco --show --show-dir /app/work_dirs/mask_rcnn_swin-t-p4-w7_fpn_ms-crop-3x_coco/results --show-score-thr 0.1

Can you provide more information please? @kahn-jms

kahn-jms commented 1 year ago

To do this we made a custom file loading module, you can find it here: https://github.com/Helmholtz-AI-Energy/TBBRDet/blob/main/scripts/mmdet/numpy_loader.py

Then in the config file, you call that module in your pipeline instead of the usual LoadImageFromFile one, here's how we did it in the train pipeline for the swin for example: https://github.com/Helmholtz-AI-Energy/TBBRDet/blob/main/configs/mmdet/swin/mask_rcnn_swin-t-p4-w7_fpn_ms-crop-3x_coco.py#L56
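For readers who cannot open the link, a loading step of this kind can be sketched roughly as below. This is only an illustration under assumptions, not the repository's numpy_loader.py: the class name LoadNumpyImage is hypothetical, the files are assumed to be .npy arrays of shape (H, W, 5), and the results keys follow the mmdet 2.x pipeline convention as I understand it. A real module would also be registered with mmdet's pipeline registry, which is omitted here to keep the sketch self-contained.

```python
import os
import tempfile

import numpy as np


class LoadNumpyImage:
    """Hypothetical pipeline step: read an (H, W, C) array from a .npy file."""

    def __call__(self, results):
        # Build the file path from the dataset entry (keys are assumptions).
        filename = results["img_info"]["filename"]
        prefix = results.get("img_prefix")
        if prefix is not None:
            filename = os.path.join(prefix, filename)

        # Load every channel as-is; nothing here assumes 1 or 3 channels.
        img = np.load(filename).astype(np.float32)

        results["filename"] = filename
        results["img"] = img
        results["img_shape"] = img.shape
        results["ori_shape"] = img.shape
        results["img_fields"] = ["img"]
        return results


def demo_load():
    """Write a dummy 5-channel array to disk and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        np.save(os.path.join(tmp, "sample.npy"), np.zeros((4, 6, 5), dtype=np.uint8))
        entry = {"img_info": {"filename": "sample.npy"}, "img_prefix": tmp}
        out = LoadNumpyImage()(entry)
        return out["img"].shape, out["img"].dtype
```

The point of the sketch is only that the loader passes all 5 channels through unchanged, so later pipeline steps and the model have to be configured for 5 channels as well.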

Aside from that you need to set in_channels=5 in your model config: https://github.com/Helmholtz-AI-Energy/TBBRDet/blob/main/configs/mmdet/swin/mask_rcnn_swin-t-p4-w7_fpn_ms-crop-3x_coco.py#L32

And if you call Normalize as part of your training/testing pipeline, you need to make sure there are 5 values in the normalization args. In our case you'll see we used the img_norm_cfg, which is defined in a common file here: https://github.com/Helmholtz-AI-Energy/TBBRDet/blob/main/scripts/mmdet/common_vars.py#L10
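To illustrate why the normalization args need one value per channel, here is a minimal NumPy sketch of per-channel normalization. The function name normalize_channels and the mean/std numbers are made up for the example; they are not the values from common_vars.py, and this is not mmdet's Normalize implementation.

```python
import numpy as np


def normalize_channels(img, mean, std):
    """Normalize an (H, W, C) image with one mean/std value per channel."""
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    n_channels = img.shape[-1]
    if mean.shape != (n_channels,) or std.shape != (n_channels,):
        raise ValueError(
            f"expected {n_channels} mean/std values, "
            f"got {mean.size} and {std.size}"
        )
    # Broadcasting applies mean[c] and std[c] to channel c.
    return (img.astype(np.float32) - mean) / std


# Placeholder statistics for a 5-channel image (illustrative values only).
img = np.full((2, 2, 5), 10.0)
out = normalize_channels(img, mean=[10.0] * 5, std=[2.0] * 5)
```

Passing only 3 values for a 5-channel image raises the ValueError in this sketch, which is the kind of mismatch to check for in a config.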

aliwaqas333 commented 1 year ago

Thank you for the reply. Since I am using the code from this GitHub repository, all the settings you mention are already in place. Training completed successfully, but I run into this issue when testing.

I am using the same code as referenced in your response.

Is there anything I can share so that you can help me better? Would it be possible for you to share the command you used for testing?

kahn-jms commented 1 year ago

Hi Ali, thanks for clarifying. I think the issue might be that we always called the test.py script with the flag --launcher slurm, which sets distributed = True. See https://github.com/Helmholtz-AI-Energy/TBBRDet/blob/main/scripts/mmdet/test.py#L185

In your case, it's calling single_gpu_test, which tries to convert the input tensor to image format for showing/saving annotations (via tensor2imgs). With distributed = True it calls multi_gpu_test instead, which does not.
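The difference between the two paths can be mimicked with a small stand-in. This is a simplified sketch, not the mmcv or mmdet source: to_display_images and run_test are hypothetical names, and the only behaviour copied from the traceback above is the `assert channels in [1, 3]` check.

```python
import numpy as np


def to_display_images(batch):
    """Stand-in for a tensor-to-image conversion on an (N, C, H, W) batch."""
    assert batch.ndim == 4
    channels = batch.shape[1]
    # Same check as in the traceback: only grayscale or RGB can be displayed.
    assert channels in [1, 3]
    # Convert each (C, H, W) image to (H, W, C) for display.
    return [np.ascontiguousarray(img.transpose(1, 2, 0)) for img in batch]


def run_test(batch, distributed):
    """Hypothetical dispatch: only the non-distributed path converts images."""
    if not distributed:
        return to_display_images(batch)
    return None


rgb_batch = np.zeros((1, 3, 4, 4), dtype=np.float32)
tbbr_batch = np.zeros((1, 5, 4, 4), dtype=np.float32)

rgb_images = run_test(rgb_batch, distributed=False)
skipped = run_test(tbbr_batch, distributed=True)
```

In this sketch, calling run_test(tbbr_batch, distributed=False) raises an AssertionError, while the distributed path never reaches the check.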

Can you try running your test command again adding the flag --launcher pytorch and see if that solves it?

aliwaqas333 commented 1 year ago

Thank you kahn-jms,

This finally worked when I added --launcher pytorch to my command. I also had to set a few environment variables at the start of the code, as follows.

import os

# Describe a single-process "distributed" run on the local machine.
os.environ['RANK'] = '0'
os.environ['WORLD_SIZE'] = '1'
os.environ['MASTER_ADDR'] = 'localhost'
os.environ['MASTER_PORT'] = '29500'
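For anyone landing here later: these four variables are what PyTorch's environment-variable initialization reads when a process group is set up, so a single-process run needs them defined. A slightly more defensive variant, sketched below, only fills in values that are not already set, so it should not clobber a real launcher. The helper name set_single_process_env is hypothetical, and the port is just the value used above.

```python
import os

# Values for a one-process run on the local machine (port as used above).
SINGLE_PROCESS_DEFAULTS = {
    "RANK": "0",
    "WORLD_SIZE": "1",
    "MASTER_ADDR": "localhost",
    "MASTER_PORT": "29500",
}


def set_single_process_env(env=os.environ):
    """Fill in missing distributed variables without overriding existing ones."""
    for key, value in SINGLE_PROCESS_DEFAULTS.items():
        env.setdefault(key, value)
    return {key: env[key] for key in SINGLE_PROCESS_DEFAULTS}
```

Calling set_single_process_env() once near the top of the test script, before the launcher is initialized, would have the same effect as the four assignments when none of the variables are set.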