facebookresearch / VLPart

[ICCV2023] VLPart: Going Denser with Open-Vocabulary Part Segmentation
MIT License

Need some explanation on a data preparation code. #2

Closed by neouyghur 1 year ago

neouyghur commented 1 year ago

Hi, I found this part of the code confusing. Could you explain it? I added comments on the lines I didn't understand. Thanks.

    for ann in data['annotations']:
        segs = ann['segmentation']
        new_segs = []
        for seg in segs:
            assert len(seg) > 0 and len(seg) % 2 == 0
            if len(seg) < 4:
                new_segs.append(seg + [0, 0, seg[0], seg[1]]) # why do you add these?
            if len(seg) == 4:
                new_segs.append(seg + [seg[0], seg[1]])
            else:
                new_segs.append(seg)
            new_segs.append(seg) # Why do we need this line?
        ann['segmentation'] = new_segs

iranroman commented 1 year ago

I am also interested in learning more about this. Can someone help us?

PeizeSun commented 1 year ago

Hi, @neouyghur @iranroman Sorry for the late reply.

For if len(seg) < 4: new_segs.append(seg + [0, 0, seg[0], seg[1]]), this is because a polygon should have at least 3 points, i.e. 6 coordinates. Here is another reference.

For new_segs.append(seg), this is a bug. Thanks for pointing it out. Luckily, it has no effect, since the duplicated polygon doesn't change the mask.
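
For readers following along, here is a minimal sketch of how the loop could look with the trailing append removed. It is not the repository's actual fix. Note that in the snippet as quoted, the second if/else runs independently of the first if, so a 2-coordinate segment would also reach the else branch; the sketch uses elif to keep the three cases exclusive. The function name pad_short_polygons and the toy annotation are hypothetical.

```python
def pad_short_polygons(annotations):
    """Pad degenerate polygons so each has at least 3 points (6 coordinates).

    Hypothetical helper: assumes COCO-style annotations whose
    'segmentation' field is a list of flat [x0, y0, x1, y1, ...] lists.
    """
    for ann in annotations:
        new_segs = []
        for seg in ann["segmentation"]:
            assert len(seg) > 0 and len(seg) % 2 == 0
            if len(seg) < 4:
                # One point: pad with the origin and the point again,
                # as in the quoted code.
                new_segs.append(seg + [0, 0, seg[0], seg[1]])
            elif len(seg) == 4:
                # Two points: repeat the first point to reach 3 points.
                new_segs.append(seg + [seg[0], seg[1]])
            else:
                # Already a valid polygon: keep it once, unchanged.
                new_segs.append(seg)
        ann["segmentation"] = new_segs
    return annotations


# Toy annotation (made up for illustration).
anns = [{"segmentation": [[5, 7], [1, 2, 3, 4], [0, 0, 4, 0, 4, 4]]}]
print(pad_short_polygons(anns)[0]["segmentation"])
```

Every segment ends up in the output exactly once, padded only when it has fewer than 6 coordinates.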

neouyghur commented 1 year ago

@PeizeSun Could you briefly explain the purpose of pascal_part_one_json.py? thanks.

PeizeSun commented 1 year ago

When we build the dense semantic correspondence between a novel object and a base object, we expect only one object in the image; otherwise the correspondence may get mixed up. So we use pascal_part_one_json.py to select the images that contain only one object.
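
The thread does not show pascal_part_one_json.py itself, so the following is only a rough sketch of the idea of keeping single-object images. It assumes a COCO-style dict with one annotation per object and an image_id field on each annotation; the real script may count objects differently (for example, Pascal Part annotations may describe parts rather than whole objects). The function name keep_single_object_images is hypothetical.

```python
from collections import Counter


def keep_single_object_images(data):
    """Return a copy of a COCO-style dict restricted to single-object images.

    Assumption: each entry in data['annotations'] is one object.
    """
    counts = Counter(ann["image_id"] for ann in data["annotations"])
    keep = {img_id for img_id, n in counts.items() if n == 1}
    return {
        **data,
        "images": [img for img in data["images"] if img["id"] in keep],
        "annotations": [a for a in data["annotations"] if a["image_id"] in keep],
    }


# Toy data (made up): image 1 has one object, image 2 has two.
toy = {
    "images": [{"id": 1}, {"id": 2}],
    "annotations": [{"image_id": 1}, {"image_id": 2}, {"image_id": 2}],
}
print([img["id"] for img in keep_single_object_images(toy)["images"]])
```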

neouyghur commented 1 year ago

@PeizeSun

Thanks for your reply. However, I am still a bit confused by the (0, 0) that is added when len(seg) < 4. In that case the segment is only a single point, so adding (0, 0) turns the mask into a line from that point to the origin. This might introduce some incorrect GT.

PeizeSun commented 1 year ago

Yes, it will introduce incorrect GT. The error comes from the annotation itself. With an incorrect annotation, we can't know what the correct GT is unless we re-annotate. So this code is only there to keep the downstream code from raising errors.

neouyghur commented 1 year ago

@PeizeSun In that case, would if len(seg) < 4: new_segs.append(seg + [seg[0], seg[1], seg[0], seg[1]]) be better?

PeizeSun commented 1 year ago

Probably. Maybe one point is better than one line.
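
To make the "line versus point" difference concrete, here is a small sketch comparing the two paddings for a single-point segment. It uses the shoelace formula for area and a simple min/max extent; the helper names polygon_area and extent and the example point are made up for illustration. Both padded polygons have zero area, but the origin-padded one stretches from the origin to the point, while the repeated-point one stays at the point. How a given rasterizer draws such zero-area polygons depends on the library and is not covered here.

```python
def polygon_area(seg):
    """Shoelace area of a flat [x0, y0, x1, y1, ...] polygon."""
    xs, ys = seg[0::2], seg[1::2]
    n = len(xs)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    return abs(twice_area) / 2


def extent(seg):
    """Axis-aligned extent (x_min, y_min, x_max, y_max) of the polygon."""
    xs, ys = seg[0::2], seg[1::2]
    return (min(xs), min(ys), max(xs), max(ys))


point = [5, 7]  # a degenerate one-point segment (made-up values)
origin_padded = point + [0, 0, point[0], point[1]]
point_padded = point + [point[0], point[1], point[0], point[1]]

for name, seg in [("origin", origin_padded), ("repeat", point_padded)]:
    print(name, seg, "area:", polygon_area(seg), "extent:", extent(seg))
```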

neouyghur commented 1 year ago

@PeizeSun Thanks for your help. I'll close this issue.