dvlab-research / PanopticFCN

Fully Convolutional Networks for Panoptic Segmentation (CVPR2021 Oral)
Apache License 2.0

reproducing results #31

Closed Yifeifr closed 2 years ago

Yifeifr commented 2 years ago

I have some questions about your response to the closed issue #21, quoted below. The model you provided there is about 290 MB, while the model provided in this repo is about 140 MB. What is the difference between the two models? I also trained the model myself, which produced a 140 MB checkpoint. In short, I cannot obtain the same results using your provided model on the Cityscapes dataset. Am I missing something?


Hi, I have implemented with_instance in Detectron2 as given in the attachment. However, I found that the performance on Detectron2 is 59.4 PQ [model] [metrics] (CLIP_VALUE=5.0). I'm not sure whether the platform difference is the reason. In any case, I'll change the reported performance with R50 to 59.4 PQ in the revision.
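For reference, CLIP_VALUE=5.0 maps to detectron2's gradient-clipping options; below is a minimal sketch of that setting (an assumption about how it was configured, not the released config file). The attached augmentation code follows.

from detectron2.config import get_cfg

cfg = get_cfg()
cfg.SOLVER.CLIP_GRADIENTS.ENABLED = True       # turn on gradient clipping
cfg.SOLVER.CLIP_GRADIENTS.CLIP_TYPE = "value"  # clip by value rather than by norm
cfg.SOLVER.CLIP_GRADIENTS.CLIP_VALUE = 5.0     # the CLIP_VALUE mentioned above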

# Please add and declare this class in detectron2/data/transforms/augmentation_impl.py
# (RandomCrop, CropTransform, and numpy-as-np are already available in that module;
#  add `import torch` at the top of the file if it is not already imported).
# Use this augmentation in your own dataset_mapper.
class RandomCropWithInstance(RandomCrop):
    """
    Make sure the cropping region contains the center of a random instance from annotations.
    """
    def get_transform(self, image, boxes=None):
        # boxes: list of boxes with mode BoxMode.XYXY_ABS
        h, w = image.shape[:2]
        croph, cropw = self.get_crop_size((h, w))

        assert h >= croph and w >= cropw, "Shape computation in {} has bugs.".format(self)
        offset_range_h = max(h - croph, 0)
        offset_range_w = max(w - cropw, 0)
        # Make sure there is always at least one instance in the cropped image
        assert boxes is not None, "Can not get annotation infos."
        if len(boxes) == 0:
            # No instances available: fall back to a fully random crop
            h0 = np.random.randint(h - croph + 1)
            w0 = np.random.randint(w - cropw + 1)
        else:
            # Pick a random instance and restrict the crop offsets so that
            # its box center lies inside the cropped region
            rand_idx = np.random.randint(0, high=len(boxes))
            bbox = torch.tensor(boxes[rand_idx])
            center_xy = (bbox[:2] + bbox[2:]) / 2.0
            offset_range_h_min = max(center_xy[1] - croph, 0)
            offset_range_w_min = max(center_xy[0] - cropw, 0)
            offset_range_h_max = max(min(offset_range_h, center_xy[1] - 1), offset_range_h_min)
            offset_range_w_max = max(min(offset_range_w, center_xy[0] - 1), offset_range_w_min)

            h0 = np.random.randint(offset_range_h_min, offset_range_h_max + 1)
            w0 = np.random.randint(offset_range_w_min, offset_range_w_max + 1)
        return CropTransform(w0, h0, cropw, croph)
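
For completeness, here is a minimal sketch of how the class above could be wired into a custom dataset_mapper. The function name, crop size, and box handling are illustrative and assume the class has been added to augmentation_impl.py as described; this is not the repo's actual mapper.

import numpy as np
from detectron2.data import detection_utils as utils
from detectron2.data.transforms.augmentation_impl import RandomCropWithInstance
from detectron2.structures import BoxMode

def crop_with_instance(dataset_dict, crop_size=(512, 1024)):
    # Load the image and collect instance boxes in XYXY_ABS, as expected by get_transform
    image = utils.read_image(dataset_dict["file_name"], format="BGR")
    boxes = np.asarray([
        BoxMode.convert(obj["bbox"], obj["bbox_mode"], BoxMode.XYXY_ABS)
        for obj in dataset_dict.get("annotations", [])
        if not obj.get("iscrowd", 0)
    ])
    crop_aug = RandomCropWithInstance("absolute", crop_size)
    transform = crop_aug.get_transform(image, boxes=boxes)
    # The returned CropTransform is then applied to the image and its annotations,
    # just as the crop augmentation is handled in the default DatasetMapper
    return transform.apply_image(image), transform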

Originally posted by @yanwei-li in https://github.com/dvlab-research/PanopticFCN/issues/21#issuecomment-887167971

yanwei-li commented 2 years ago

Hi, the difference in model size comes from the fact that in this repo we keep only the model parameters and pop the other entries (like 'optimizer', 'scheduler', and 'iteration'), whereas that model keeps all of them. You can see this by loading the .pth file. BTW, I'm not sure what you mean by "obtain the same results using your provided model". Do you mean you cannot get 59.4 PQ with this model on the Cityscapes dataset?
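
For reference, a minimal way to check this (file names are illustrative):

import torch

full_ckpt = torch.load("model_with_optimizer.pth", map_location="cpu")
print(full_ckpt.keys())  # expected to include 'model', 'optimizer', 'scheduler', 'iteration'

# Keeping only the weights reproduces the smaller, weights-only checkpoint in this repo
torch.save({"model": full_ckpt["model"]}, "model_weights_only.pth")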

Yifeifr commented 2 years ago

Thanks for your reply. I evaluated the model you provided (model_final.pth) with the code mentioned in issue #21 by DdeGeus, but oddly I got 58.2/50.1/64.0 PQ for All/Things/Stuff on the Cityscapes dataset. Am I missing something?

yanwei-li commented 2 years ago

Hi, can you share your config parameters? I have tried this file and got the same result:

         PQ      SQ      RQ    #categories
All      59.363  80.056  72.975   19
Things   51.422  78.983  64.838    8
Stuff    65.139  80.836  78.893   11

Yifeifr commented 2 years ago

Thanks for the reply.

We tried your repo on the COCO dataset and found that PQ_th is slightly better without kernel fusion. Using your pretrained model with PanopticFCN-R50-1x.yaml, the results are:

         PQ      SQ      RQ    #categories
All      40.250  79.595  48.913  133
Things   46.867  81.829  56.399   80
Stuff    30.261  76.222  37.613   53

whereas PQ_th is 46.812 with kernel fusion using the same pretrained model. We noticed that this does not match the results in your paper (Tab. 4).

yanwei-li commented 2 years ago

You are right. It seems Detectron2 achieves better results in PQ_th without kernel fusion. But the AP without kernel fusion is much worse than with it, which could be attributed to duplicate instances.
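
For readers running into the same question: kernel fusion in the paper merges candidate kernels that describe the same object, so duplicate instance masks (which hurt AP) are suppressed. A rough schematic of the idea follows; the threshold is illustrative and the class-aware grouping of the actual implementation is simplified away.

import torch
import torch.nn.functional as F

def fuse_kernels(kernels, scores, sim_thres=0.9):
    # kernels: (N, C) candidate kernels sorted by descending score; scores: (N,)
    fused, kept_scores = [], []
    used = torch.zeros(len(kernels), dtype=torch.bool)
    normed = F.normalize(kernels, dim=1)
    for i in range(len(kernels)):
        if used[i]:
            continue
        # Group all not-yet-used kernels that are similar to kernel i
        sim = (normed[i:i + 1] @ normed.t()).squeeze(0)
        group = (sim >= sim_thres) & (~used)
        used |= group
        # Average the grouped kernels into a single instance kernel
        fused.append(kernels[group].mean(dim=0))
        kept_scores.append(scores[i])
    return torch.stack(fused), torch.stack(kept_scores)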