pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

references/segmentation/coco_utils might require merging rles? #8661

Open crazyboy9103 opened 1 month ago

crazyboy9103 commented 1 month ago

https://github.com/pytorch/vision/blob/6d7851bd5e2bedc294e40e90532f0e375fcfee04/references/segmentation/coco_utils.py#L27-L41 The code above seems to assume that objects are not occluded, since it does not merge the RLEs returned by frPyObjects. If an object can be occluded and therefore split into several polygons, I think it should be changed to:

rles = coco_mask.frPyObjects(polygons, height, width) 
rle = coco_mask.merge(rles)
mask = coco_mask.decode(rle)

Is there any specific reason for this, or am I wrong?
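
To make the question concrete, here is a rough NumPy-only sketch of the union that merging is meant to produce. It assumes that decoding a list of RLEs yields an (H, W, N) array with one channel per polygon, and that merge takes the union by default. The arrays below are hand-made stand-ins rather than real pycocotools output, and union_of_polygon_masks is a hypothetical helper, not something from the repository.

```python
import numpy as np


def union_of_polygon_masks(per_polygon_masks: np.ndarray) -> np.ndarray:
    """Collapse an (H, W, N) stack of binary masks into one (H, W) mask.

    Hypothetical helper: a pixel is set in the result if it is set in
    any of the N per-polygon channels (the union of the polygons).
    """
    if per_polygon_masks.ndim < 3:
        # A single mask has no channel axis yet; add one so the
        # reduction below works the same way in both cases.
        per_polygon_masks = per_polygon_masks[..., None]
    return per_polygon_masks.any(axis=2).astype(np.uint8)


# Stand-in for one object split into two polygons by an occluder.
height, width = 4, 6
left = np.zeros((height, width), dtype=np.uint8)
left[1:3, 0:2] = 1
right = np.zeros((height, width), dtype=np.uint8)
right[1:3, 4:6] = 1

# One channel per polygon, mimicking the assumed decode output shape.
stack = np.stack([left, right], axis=2)
merged = union_of_polygon_masks(stack)
print(merged)
```

If the existing code already reduces over the polygon axis after decoding, the result should match what an explicit merge gives, so the two approaches would differ only in where the union is taken.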

NicolasHug commented 1 month ago

Hi @crazyboy9103, thanks for the report. I'm not very familiar with that part of the codebase, so I could be way off, but I suspect the logic you're looking for is implemented later, in https://github.com/pytorch/vision/blob/6d7851bd5e2bedc294e40e90532f0e375fcfee04/references/segmentation/coco_utils.py#L50-L56 ?