[Contribution welcome] Reproduce COCO results

bowenc0221 commented 3 years ago

The COCO dataset is now supported as of commit bda4f64; however, the reproduced result with a ResNet-50 backbone is still lower than the number in the Panoptic-DeepLab paper (34.2 PQ vs. 35.1 PQ).

I hypothesize this is due to the following reasons (mainly related to data pre-processing):

Random padding is used in the original paper when the image is smaller than the crop size after scale augmentation, but the current implementation only pads the bottom and right boundaries of the image.
The original implementation ignores crowd regions in the semantic segmentation branch, but the current implementation does not.
The training parameters need further tuning; e.g., I found that a learning rate of 0.0005 is slightly better than 0.001 in this implementation.
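
For anyone who wants to experiment with the first two points, here is a minimal NumPy sketch of what they could look like. It is not taken from this repository or from the original implementation: the function names `pad_to_crop` and `ignore_crowd`, the ignore label 255, and the reading of "random padding" as a random split of the padding between the two sides of each axis are all assumptions made for illustration.

```python
import numpy as np


def pad_to_crop(image, crop_h, crop_w, pad_value=0, rng=None):
    """Pad an (H, W) or (H, W, C) array so it is at least (crop_h, crop_w).

    Hypothetical helper. With rng=None all padding goes to the bottom and
    right (the behaviour described for the current implementation). With a
    NumPy Generator the padding is split at random between top/bottom and
    left/right (one plausible reading of "random padding").
    """
    h, w = image.shape[:2]
    pad_h = max(crop_h - h, 0)
    pad_w = max(crop_w - w, 0)
    if rng is None:
        top, left = 0, 0
    else:
        # integers() excludes the upper bound, hence the +1.
        top = int(rng.integers(0, pad_h + 1))
        left = int(rng.integers(0, pad_w + 1))
    pad_width = [(top, pad_h - top), (left, pad_w - left)]
    pad_width += [(0, 0)] * (image.ndim - 2)  # never pad channels
    return np.pad(image, pad_width, mode="constant", constant_values=pad_value)


def ignore_crowd(semantic, crowd_mask, ignore_label=255):
    """Return a copy of the semantic label map with crowd pixels set to
    ignore_label (255 is an assumed value) so the loss can skip them."""
    out = semantic.copy()
    out[crowd_mask] = ignore_label
    return out
```

In a real pipeline the image and its label maps would need the same offsets, so the random offsets should be drawn once and reused, with the label map padded using the ignore label rather than 0.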

Please reply to this issue if you are interested in reproducing COCO results :)

moodlife commented 3 years ago

I wonder if you could upload the trained R50 model to the website. Thanks!

bowenc0221 commented 3 years ago

It is in the README now.

Cityscapes: https://dl.fbaipublicfiles.com/detectron2/PanopticDeepLab/Cityscapes-PanopticSegmentation/panoptic_deeplab_R_52_os16_mg124_poly_90k_bs32_crop_512_1024_dsconv/model_final_23d03a.pkl
COCO: https://dl.fbaipublicfiles.com/detectron2/PanopticDeepLab/COCO-PanopticSegmentation/panoptic_deeplab_R_52_os16_mg124_poly_200k_bs64_crop_640_640_coco_dsconv/model_final_dee2af.pkl

bowenc0221 commented 3 years ago

I found the bug that made the COCO result 1 PQ lower than in our paper. You only need to change MAX_SIZE_TRAIN from 640 to 960 for COCO, and we can now get 35.5 PQ on COCO!
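
To see why this cap could matter, here is a small sketch that assumes training images are resized with the common "shortest edge, capped by a maximum long edge" rule. The helper `resized_shape` and the example sizes are hypothetical and not copied from the training code; the comment above only states that changing the value fixes the gap.

```python
def resized_shape(h, w, short_edge, max_size):
    """Hypothetical helper: scale so the shorter side equals short_edge,
    then shrink again if the longer side would exceed max_size."""
    scale = short_edge / min(h, w)
    new_h, new_w = h * scale, w * scale
    if max(new_h, new_w) > max_size:
        cap = max_size / max(new_h, new_w)
        new_h, new_w = new_h * cap, new_w * cap
    # Round to the nearest integer pixel size.
    return int(new_h + 0.5), int(new_w + 0.5)


# A 480x640 image with a sampled short edge of 960 (an assumed value):
capped = resized_shape(480, 640, short_edge=960, max_size=640)
uncapped = resized_shape(480, 640, short_edge=960, max_size=960)
print(capped, uncapped)
```

Under these assumptions a maximum size of 640 keeps the image at its original 480x640, so it is never enlarged during scale augmentation, whereas a maximum of 960 lets it grow to 720x960.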

chaoyan1037 commented 2 years ago

Hi @bowenc0221, do you have configs for MNV3 (MobileNetV3) on COCO? It is reported as 30.0 PQ [val] in the paper. If not, do you know which MNV3 version was used?

bowenc0221 / panoptic-deeplab, issue #60