I have noticed that you just resize the pred_masks to the original image size:
    if results.has("pred_masks"):
        if results.pred_masks.shape[0]:
            results.pred_masks = F.interpolate(input=results.pred_masks, size=results.image_size, mode="bilinear", align_corners=False).gt(0.5).squeeze(1)
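However, the input image has been padded to the maximum size in the batch, so I think the post-processing step needs to crop pred_masks to the unpadded region first, just like sem_seg_postprocess does. Otherwise the padded border is stretched into the output along with the real image content, and the masks end up misaligned.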
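
Roughly what I have in mind is sketched below. This is only an illustration, not the repository's code: the function name `crop_then_resize_masks` and the parameters `padded_input_size`, `unpadded_input_size`, and `output_size` are hypothetical, and it assumes the padding was added on the bottom and right of each image and that `pred_masks` has shape (N, 1, h, w).

```python
import torch
import torch.nn.functional as F


def crop_then_resize_masks(pred_masks, padded_input_size, unpadded_input_size,
                           output_size, threshold=0.5):
    """Hypothetical helper: crop away batch padding before resizing masks.

    pred_masks:          (N, 1, h, w) mask probabilities predicted on the padded input
    padded_input_size:   (H_pad, W_pad) size of the padded batch tensor
    unpadded_input_size: (H_img, W_img) size of this image before padding
    output_size:         (H_out, W_out) size of the original image
    Assumes padding was added on the bottom and right only.
    """
    if pred_masks.shape[0] == 0:
        # No instances: return an empty boolean mask tensor of the right size.
        return pred_masks.new_zeros((0, *output_size), dtype=torch.bool)

    # 1. Bring the masks to the padded input resolution so the crop is exact.
    masks = F.interpolate(pred_masks, size=padded_input_size,
                          mode="bilinear", align_corners=False)
    # 2. Drop the padded rows and columns.
    masks = masks[:, :, :unpadded_input_size[0], :unpadded_input_size[1]]
    # 3. Resize the valid region to the original image size and binarize.
    masks = F.interpolate(masks, size=output_size,
                          mode="bilinear", align_corners=False)
    return masks.gt(threshold).squeeze(1)


# Toy check: an 8x8 image padded to 8x12, with a mask covering the whole image.
pred = torch.zeros(1, 1, 8, 12)
pred[:, :, :, :8] = 1.0

fixed = crop_then_resize_masks(pred, (8, 12), (8, 8), (16, 16))
naive = F.interpolate(pred, size=(16, 16), mode="bilinear",
                      align_corners=False).gt(0.5).squeeze(1)
```

In this toy case `fixed` covers the entire 16x16 output, while `naive` leaves the right-hand part empty because the padded columns were resized along with the image.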