Jensen-Su closed this issue 3 years ago.
Are you implementing the multi-scale test on top of the Detectron2 code? Please follow the resizing implementation in Detectron2 to properly resize images. Detectron2 does not use OpenCV to process images, and I'm sure cv2.resize is the problem that causes the degradation.
Yes, I am using Detectron2, which uses PIL.Image.resize with mode Image.BILINEAR to resize images, and my model was multi-scale trained with Detectron2 under the default configs. Following your suggestion, I compared the resize operations of PIL (Image.BILINEAR) and cv2 (cv2.INTER_LINEAR), and found only small differences:

| scale 0.5 | PQ | AP | IoU |
|---|---|---|---|
| cv2.resize | 57.2 | 27.9 | 77.2 |
| Image.resize | 57.0 | 27.4 | 77.0 |
The images were resized to shape [512, 1024] for inference, and the results were interpolated back to [1024, 2048] using F.interpolate with mode='bilinear', align_corners=True for evaluation.
Are degradations of 3 points in IoU and 8 points in AP relative to scale 1 reasonable?
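For reference, here is a minimal sketch of the two resize paths being compared and of the upsampling step used for evaluation; the dummy image, the channel count, and the variable names are illustrative, not the actual evaluation code:

```python
import cv2
import numpy as np
import torch
import torch.nn.functional as F
from PIL import Image

# Illustrative input: an HxWx3 uint8 image at the original 1024x2048 resolution.
img = np.zeros((1024, 2048, 3), dtype=np.uint8)
scaled_h, scaled_w = 512, 1024

# Path 1: OpenCV bilinear resize (note cv2.resize takes (width, height)).
img_cv2 = cv2.resize(img, (scaled_w, scaled_h), interpolation=cv2.INTER_LINEAR)

# Path 2: PIL bilinear resize, the path Detectron2 uses per the discussion above.
img_pil = np.asarray(Image.fromarray(img).resize((scaled_w, scaled_h), Image.BILINEAR))

# After inference at 512x1024, outputs are upsampled back to 1024x2048 for evaluation.
logits = torch.randn(1, 19, scaled_h, scaled_w)  # dummy network output; 19 = #classes (assumed)
logits_full = F.interpolate(logits, size=(1024, 2048), mode="bilinear", align_corners=True)
```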
What is the PQ you got by running the following command?
python train_net.py --config-file configs/Cityscapes-PanopticSegmentation/panoptic_deeplab_R_52_os16_mg124_poly_90k_bs32_crop_512_1024_dsconv.yaml --eval-only MODEL.WEIGHTS /path/to/model_checkpoint INPUT.MIN_SIZE_TEST 512 INPUT.MAX_SIZE_TEST 1024
Here are the results I got by varying the INPUT config:

| MIN_SIZE_TEST | MAX_SIZE_TEST | PQ | IoU | AP |
|---|---|---|---|---|
| 512 | 1024 | 55.1 | 76.6 | 26.3 |
| 1024 | 2048 | 61.5 | 79.8 | 36.3 |
| 1536 | 3072 | 60.9 | 79.1 | 34.6 |

With 512x1024 I got even lower performance: PQ=55.1, IoU=76.6, AP=26.3.
(The MIN_SIZE_TRAIN setting is (512, 640, 704, 832, 896, 1024, 1152, 1216, 1344, 1408, 1536, 1664, 1728, 1856, 1920, 2048).)
It seems that the dramatic performance degradation at scale 0.5 is inevitable? Does that mean we shouldn't include the 0.5 scale in the ensemble?
Yes, if you run inference with only the 0.5 scale, the performance will drop for sure. But adding it to multi-scale testing can still help (i.e., averaging predictions from the 0.5, 1.0, 2.0 scales, etc.).
The training scales are for data augmentation; the purpose is different.
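For concreteness, here is a minimal sketch of what averaging predictions over scales (and horizontal flips) could look like. The function name, the scale set, and the assumption that `model(x)` returns per-pixel logits of shape [1, C, h, w] are all illustrative; this is not the Panoptic-DeepLab inference code:

```python
import torch
import torch.nn.functional as F

def multi_scale_logits(model, image, scales=(0.5, 1.0, 2.0), flip=True):
    """Average softmax predictions over several input scales and optional flips.

    Assumes model(x) returns per-pixel logits of shape [1, C, h, w]; illustrative
    sketch of multi-scale testing, not the actual Panoptic-DeepLab code.
    """
    _, _, h, w = image.shape
    avg, n = 0.0, 0
    for s in scales:
        scaled = F.interpolate(image, scale_factor=s, mode="bilinear", align_corners=True)
        inputs = [scaled, torch.flip(scaled, dims=[3])] if flip else [scaled]
        for k, x in enumerate(inputs):
            logits = model(x)
            if k == 1:
                logits = torch.flip(logits, dims=[3])  # undo the horizontal flip
            # Resize back to the original resolution before averaging.
            logits = F.interpolate(logits, size=(h, w), mode="bilinear", align_corners=True)
            avg = avg + logits.softmax(dim=1)
            n += 1
    return avg / n
```

Whether to average softmax scores or raw logits, and which scales to include, are design choices; the point above is that the 0.5 scale can still contribute to the ensemble even though it performs poorly on its own.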
I performed evaluation at the different scales respectively and got the following results, where scale 1 corresponds to 1024x2048 and flipping is always added. The table shows that the performance at scales 0.5 and 2 is much worse. As expected, it also shows that adding scale 0.5, scale 2, or both leads to worse performance. All interpolations of the network outputs use mode='bilinear', align_corners=True, and the input image is resized using cv2.resize(img, (scaled_w, scaled_h), interpolation=cv2.INTER_LINEAR). Here are my questions: