lkeab / gaussian-grouping

[ECCV'2024] Gaussian Grouping for open-world Anything reconstruction, segmentation and editing.
https://arxiv.org/abs/2312.00732
Apache License 2.0

How to calculate mIoU for the original Lerf model? #29

Open wb014 opened 3 months ago

wb014 commented 3 months ago

Hi, thanks for the work. I saw the metric table in the paper, but I can't work out how to render the relevancy map for Lerf, because Lerf doesn't document how to do it. Could you share the corresponding script?

Pine-sha commented 1 month ago

script/eval_lerf_mask.py:

    def boundary_iou(gt, dt, dilation_ratio=0.02):
        """
        Compute boundary iou between two binary masks.
        :param gt (numpy array, uint8): binary mask
        :param dt (numpy array, uint8): binary mask
        :param dilation_ratio (float): ratio to calculate dilation = dilation_ratio * image_diagonal
        :return: boundary iou (float)
        """
        dt = (dt>128).astype('uint8')
        gt = (gt>128).astype('uint8')
        gt_boundary = mask_to_boundary(gt, dilation_ratio)
        dt_boundary = mask_to_boundary(dt, dilation_ratio)
        intersection = ((gt_boundary * dt_boundary) > 0).sum()
        union = ((gt_boundary + dt_boundary) > 0).sum()
        boundary_iou = intersection / union
        return boundary_iou

and

    def calculate_iou(mask1, mask2):
        """Calculate IoU between two boolean masks."""
        mask1_bool = mask1 > 128
        mask2_bool = mask2 > 128
        intersection = np.logical_and(mask1_bool, mask2_bool)
        union = np.logical_or(mask1_bool, mask2_bool)
        iou = np.sum(intersection) / np.sum(union)
        return iou

When calculating IoU and boundary IoU between two masks, why is the mask binarized with a threshold of >128?

Wouldn't thresholding at >0 be more reasonable when converting the masks to boolean?

lkeab commented 1 month ago

Hi, thanks for the question. GT masks contain only two values, 0 and 255, so >0 and >128 are equivalent for them. The predicted mask ('pred'), however, is interpolated to the original resolution, so its values vary continuously between 0 and 255; in that case 128 is the usual threshold. It's the same as in instance segmentation: when mask values lie between 0 and 1, >0.5 is typically taken as positive.
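To illustrate the point, here is a minimal sketch on a single 1D row of pixels. It is not taken from the repository: the `iou` helper, the toy `gt` row, and the low-resolution `pred_low` are made up for illustration, and `np.interp` stands in for whatever interpolation the actual pipeline uses. It shows that the two thresholds agree on a strictly binary GT mask but give different IoU values on an interpolated prediction.

```python
import numpy as np


def iou(a, b):
    """IoU of two boolean arrays; returns 0.0 when both are empty (assumed convention)."""
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 0.0


# Toy ground-truth row: strictly binary, values are 0 or 255.
gt = np.array([0, 0, 255, 255, 255, 255, 0, 0], dtype=np.uint8)

# Hypothetical 4-pixel prediction, linearly upsampled to 8 pixels.
# Interpolation creates intermediate values at the mask boundary.
pred_low = np.array([0, 255, 255, 0], dtype=np.float64)
x_low = np.linspace(0, 7, num=4)
pred = np.interp(np.arange(8), x_low, pred_low)

# For the binary GT mask both thresholds select the same pixels.
same_for_gt = np.array_equal(gt > 0, gt > 128)

# For the interpolated prediction, >0 also keeps the faint boundary pixels.
iou_at_128 = iou(gt > 128, pred > 128)
iou_at_0 = iou(gt > 128, pred > 0)

print(same_for_gt, iou_at_128, round(iou_at_0, 3))
```

In this toy case, thresholding the prediction at >0 enlarges the mask by one pixel on each side and lowers the IoU, whereas >128 recovers the original extent. How large the effect is on real data depends on the interpolation method and the mask shapes.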