ADE20K prepare pan_seg: two stuff classes have the same color, possibly resulting in miscalculated results

zbwxp commented 3 years ago

Hi, in MaskFormer/datasets/prepare_ade20k_pan_seg.py, the 7th PALETTE entry [140,140,140] (road:route, stuff) and the 49th entry [140,140,140] (skyscraper, stuff) have the same color. Both are stuff classes, so IdGenerator() gives them the same segment_id. As a result, both regions are stored with the same color in the panoptic annotation PNG and with the same "id" in the JSON file (line 437):

            segm_info.append(
                {
                    "id": int(segment_id),   # same for both skyscrapers and road
                    "category_id": int(semantic_cat_id),  # different categories
                    "area": int(area),
                    "bbox": bbox,
                    "iscrowd": 0,
                }
            )
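The collision described above can be reproduced with a minimal sketch. It assumes the usual panoptic encoding id = R + 256*G + 256^2*B and that stuff segments reuse their palette color unchanged, as described above. The helper names and the five-entry palette excerpt are hypothetical and are not taken from the script.

```python
from collections import defaultdict


def rgb_to_id(color):
    """Encode an RGB triple as one integer: R + 256*G + 256^2*B (assumed encoding)."""
    r, g, b = color
    return r + 256 * g + 256 * 256 * b


def find_duplicate_colors(palette):
    """Return {color: [indices]} for every color that occurs more than once."""
    indices_by_color = defaultdict(list)
    for index, color in enumerate(palette):
        indices_by_color[tuple(color)].append(index)
    return {c: idxs for c, idxs in indices_by_color.items() if len(idxs) > 1}


# Hypothetical excerpt, not the real 150-entry ADE20K palette.
palette_excerpt = [
    [120, 120, 120],
    [180, 120, 120],
    [140, 140, 140],  # stands in for road:route
    [6, 230, 230],
    [140, 140, 140],  # stands in for skyscraper
]

duplicates = find_duplicate_colors(palette_excerpt)
for color, indices in duplicates.items():
    print(f"color {color} is shared by entries {indices}, segment id {rgb_to_id(color)}")
```

Running the same check over the full PALETTE before generating annotations would flag any repeated stuff color.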

In the pan_seg dataset mapper, the annotation is converted back with rgb2id, and the resulting "id" values are then used to build the training masks. This will probably merge the road and skyscraper regions, so that during training each of the two categories has both regions as its ground truth. I am not sure what happens during evaluation (line 106):

    pan_seg_gt = rgb2id(pan_seg_gt)
    # some lines later ...
    for segment_info in segments_info:
        class_id = segment_info["category_id"]   # different categories
        if not segment_info["iscrowd"]:
            classes.append(class_id)
            masks.append(pan_seg_gt == segment_info["id"])  # same for both skyscrapers and road
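To illustrate the effect, here is a small pure-Python sketch of the same loop on a toy 2x3 id map. The function name build_masks, the category ids 6 and 48, and the pixel layout are assumptions made for the illustration. The shared id is the encoding of [140,140,140] under id = R + 256*G + 256^2*B.

```python
def build_masks(pan_seg_ids, segments_info):
    """Build one boolean mask per non-crowd segment by comparing pixels to the segment id."""
    classes, masks = [], []
    for segment_info in segments_info:
        if not segment_info["iscrowd"]:
            classes.append(segment_info["category_id"])
            masks.append(
                [[pixel == segment_info["id"] for pixel in row] for row in pan_seg_ids]
            )
    return classes, masks


SHARED_ID = 140 + 256 * 140 + 256 * 256 * 140  # id of color [140, 140, 140]
OTHER_ID = 1  # arbitrary id for an unrelated segment

# Toy id map: the top-right pixel is skyscraper and the bottom row is road,
# but both carry the same id because their colors are identical.
pan_seg_ids = [
    [OTHER_ID, OTHER_ID, SHARED_ID],
    [SHARED_ID, SHARED_ID, SHARED_ID],
]

segments_info = [
    {"id": SHARED_ID, "category_id": 6, "iscrowd": 0},   # road (illustrative id)
    {"id": SHARED_ID, "category_id": 48, "iscrowd": 0},  # skyscraper (illustrative id)
]

classes, masks = build_masks(pan_seg_ids, segments_info)
print(classes)
print(masks[0] == masks[1])
```

The two categories differ, but the two masks are identical and each covers the union of the road and skyscraper pixels.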

I assume this may cause miscalculation in training and evaluation.

Could you please take a look?

bowenc0221 commented 3 years ago

Thanks for bringing up this point! I followed the ADE20K PALETTE from mmsegmentation without checking whether it contains repeated colors.

This will indeed merge road:route and skyscraper into the same category for panoptic segmentation. I will re-run experiments on ADE20K panoptic segmentation and get back to you when I get new results.

bowenc0221 commented 3 years ago

@zbwxp It looks like this issue does not affect the evaluation numbers: I regenerated the ADE20K panoptic annotations with the duplicate colors fixed, and evaluating the old checkpoint gave the same PQ.

This is because skyscraper appears fewer than 10 times in the val set, so luckily the bug does not affect performance. I will update the data preprocessing script ASAP.

Thanks again for finding this issue!

bowenc0221 commented 3 years ago

@zbwxp Fixed in 909e8a8. It does not affect the performance of released models. Please let me know if you find any problem.

facebookresearch / MaskFormer

ADE20K prepare pan_seg: two stuff classes have the same color, possibly resulting in miscalculated results #32