Hi, I was reading the following code in mmdet/models/dense_heads/yolact_head.py:
    @HEADS.register_module()
    class YOLACTSegmHead(nn.Module):
        def get_targets(self, segm_pred, gt_masks, gt_labels):
            """Compute semantic segmentation targets for each image.
            Args:
                segm_pred (Tensor): Predicted semantic segmentation map
                    with shape (num_classes, H, W).
                gt_masks (Tensor): Ground truth masks for each image with
                    the same shape of the input image.
                gt_labels (Tensor): Class indices corresponding to each box.
            Returns:
                Tensor: Semantic segmentation targets with shape
                    (num_classes, H, W).
            """
            if gt_masks.size(0) == 0:
                return None
            num_classes, mask_h, mask_w = segm_pred.size()
            with torch.no_grad():
                downsampled_masks = F.interpolate(
                    gt_masks.unsqueeze(0), (mask_h, mask_w),
                    mode='bilinear',
                    align_corners=False).squeeze(0)
                downsampled_masks = downsampled_masks.gt(0.5).float()
                segm_targets = torch.zeros_like(segm_pred, requires_grad=False)
                for obj_idx in range(downsampled_masks.size(0)):
                    segm_targets[gt_labels[obj_idx] - 1] = torch.max(
                        segm_targets[gt_labels[obj_idx] - 1],
                        downsampled_masks[obj_idx])
                return segm_targets
Why is gt_labels[obj_idx] - 1 used as the channel index? With the COCO dataset provided by mmdet, the category labels have already been converted to the range [0, 79], so I think there is no need to subtract 1 here. Can someone explain it?
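To make the concern concrete, here is a minimal sketch of which channel each label would end up writing to. It assumes the labels really are 0-based in [0, 79] and that num_classes is 80. The helper name `target_channel` is hypothetical and not part of mmdet, and plain integers stand in for tensors, relying on the fact that negative indices wrap around to the end in both Python lists and PyTorch tensors.

```python
def target_channel(label, num_classes, offset=1):
    """Return the channel that `segm_targets[label - offset]` resolves to.

    Hypothetical helper for illustration only (not part of mmdet).
    Negative indices wrap around to the end, as they do for Python
    lists and PyTorch tensors.
    """
    idx = label - offset
    if not -num_classes <= idx < num_classes:
        raise IndexError(f"index {idx} is out of range for {num_classes} channels")
    return idx % num_classes


num_classes = 80          # assumption: 80 COCO classes
labels = (0, 1, 79)       # assumption: 0-based labels in [0, 79]

# Channel chosen with the "- 1" used in get_targets.
with_offset = [target_channel(l, num_classes, offset=1) for l in labels]
# Channel chosen if the label were used directly.
without_offset = [target_channel(l, num_classes, offset=0) for l in labels]

print(with_offset)     # [79, 0, 78]
print(without_offset)  # [0, 1, 79]
```

Under these assumptions, label 0 would be written to the last channel (index 79) and every other label would be shifted down by one, which is why the `- 1` looks unnecessary to me for 0-based labels.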