mli0603 / stereo-transformer

Revisiting Stereo Depth Estimation From a Sequence-to-Sequence Perspective with Transformers. (ICCV 2021 Oral)
Apache License 2.0
659 stars 107 forks

About rr loss #86

Closed ZhileiChen99 closed 1 year ago

ZhileiChen99 commented 1 year ago

Hi, thanks for your wonderful work. I have some questions about the rr loss. As explained in the main paper, the rr loss is divided into two parts: matched pixels and unmatched pixels. However, I find that the loss for matched pixels also includes unmatched pixels:

    def _compute_gt_location(self, scale: int, sampled_cols: Tensor, sampled_rows: Tensor,
                             attn_weight: Tensor, disp: Tensor):
        """
        Find target locations using ground truth disparity.
        Find ground truth response at those locations using attention weight.
        :param scale: high-res to low-res disparity scale
        :param sampled_cols: index to downsample columns
        :param sampled_rows: index to downsample rows
        :param attn_weight: attention weight (output from _optimal_transport), [N,H,W,W]
        :param disp: ground truth disparity
        :return: response at ground truth location [N,H,W,1] and target ground truth locations [N,H,W,1]
        """
        # compute target location at full res
        _, _, w = disp.size()
        pos_l = torch.linspace(0, w - 1, w)[None,].to(disp.device)  # 1 x 1 x W (left)
        target = (pos_l - disp)[..., None]  # N x H x W (left) x 1

        if sampled_cols is not None:
            target = batched_index_select(target, 2, sampled_cols)
        if sampled_rows is not None:
            target = batched_index_select(target, 1, sampled_rows)
        target = target / scale  # scale target location

        # compute ground truth response location for rr loss
        gt_response = torch_1d_sample(attn_weight, target, 'linear')  # NxHxW_left

        return gt_response, target

The gt_response does not exclude unmatched pixels via occ_mask. Is this a bug? I think this may affect the prediction of occluded pixels.

mli0603 commented 1 year ago

Hi @ZhileiChen99

Thank you for your interest in the project. Even though the computation of gt_response that you referred to does not take occlusion into account, the loss computation (see below) does exclude the occluded region. I hope this helps!

https://github.com/mli0603/stereo-transformer/blob/d0aa1ad9c84f3dab15a2f2a9ead2ca6cf9fe8971/module/loss.py#L88-L125
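For readers following along, here is a minimal sketch of how the two parts of the rr loss described in the paper can be separated with an occlusion mask. It is not the repository's exact code (see the linked loss.py for that); the names rr_loss_sketch and gt_response_occ are illustrative assumptions, with gt_response_occ standing for the attention mass assigned to the "no-match" position.

    import torch

    def rr_loss_sketch(gt_response, gt_response_occ, occ_mask, eps=1e-6):
        """Illustrative relative-response loss, not the repository's exact implementation.

        gt_response:     [N,H,W] attention mass sampled at the ground-truth disparity location
        gt_response_occ: [N,H,W] attention mass assigned to the 'no-match' position (assumed name)
        occ_mask:        [N,H,W] bool, True where the left-image pixel is occluded
        """
        # matched pixels: penalize low response at the true matching location
        loss_matched = -torch.log(gt_response[~occ_mask] + eps)
        # unmatched (occluded) pixels: penalize low response at the no-match position
        loss_unmatched = -torch.log(gt_response_occ[occ_mask] + eps)
        return torch.cat([loss_matched, loss_unmatched]).mean()

The point is that even if gt_response is computed for every pixel, masking with occ_mask at loss time means occluded pixels never contribute to the matched-pixel term.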