aim-uofa / AdelaiDepth

This repo contains the projects 'Virtual Normal', 'DiverseDepth', and '3D Scene Shape', which address monocular depth estimation and 3D scene reconstruction from a single image.

Question about the multi-scale gradient loss #15

Closed · ewrfcas closed this 3 years ago

ewrfcas commented 3 years ago

Thanks for the good work! I have some questions about the multi-scale gradient loss used in the paper. Below is the gradient loss code from MegaDepth [24], but I think it differs from the grad_loss used in this method.

def GradientLoss(self, log_prediction_d, mask, log_gt):
    # Number of valid pixels, used for normalization.
    N = torch.sum(mask)
    # Per-pixel error in log depth, zeroed at invalid pixels.
    log_d_diff = log_prediction_d - log_gt
    log_d_diff = torch.mul(log_d_diff, mask)

    # Vertical gradients of the error (stride-2 differences), kept only
    # where both contributing pixels are valid.
    v_gradient = torch.abs(log_d_diff[0:-2, :] - log_d_diff[2:, :])
    v_mask = torch.mul(mask[0:-2, :], mask[2:, :])
    v_gradient = torch.mul(v_gradient, v_mask)

    # Horizontal gradients of the error, masked the same way.
    h_gradient = torch.abs(log_d_diff[:, 0:-2] - log_d_diff[:, 2:])
    h_mask = torch.mul(mask[:, 0:-2], mask[:, 2:])
    h_gradient = torch.mul(h_gradient, h_mask)

    gradient_loss = torch.sum(h_gradient) + torch.sum(v_gradient)
    gradient_loss = gradient_loss / N

    return gradient_loss

Should I use log depth here, without any constraint to keep 'prediction_d' > 0?
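
For concreteness, the two alternatives in question might look like the sketch below (not from the paper or MegaDepth; network_output and eps are placeholders for the raw prediction and a small constant):

import torch

eps = 1e-8  # assumed small constant to avoid log(0)
network_output = torch.randn(1, 1, 64, 64)  # placeholder raw prediction

# Option 1: treat the network output directly as log-depth, so no
# positivity constraint is needed on the prediction itself.
log_prediction_d = network_output

# Option 2: treat the output as metric depth and clamp it before the log,
# which enforces prediction_d > 0 explicitly.
log_prediction_d = torch.log(network_output.clamp(min=eps))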

YvanYin commented 3 years ago
class MSGIL_NORM_Loss(nn.Module):
    """
    Our proposed GT normalized Multi-scale Gradient Loss Function.
    """
    def __init__(self, scale=4, valid_threshold=-1e-8, max_threshold=1e8):
        super(MSGIL_NORM_Loss, self).__init__()
        self.scales_num = scale
        self.valid_threshold = valid_threshold
        self.max_threshold = max_threshold
        self.EPSILON = 1e-6

    def one_scale_gradient_loss(self, pred_scale, gt, mask):
        mask_float = mask.to(dtype=pred_scale.dtype, device=pred_scale.device)

        d_diff = pred_scale - gt

        v_mask = torch.mul(mask_float[:, :, :-2, :], mask_float[:, :, 2:, :])
        v_gradient = torch.abs(d_diff[:, :, :-2, :] - d_diff[:, :, 2:, :])
        v_gradient = torch.mul(v_gradient, v_mask)

        h_gradient = torch.abs(d_diff[:, :, :, :-2] - d_diff[:, :, :, 2:])
        h_mask = torch.mul(mask_float[:, :, :, :-2], mask_float[:, :, :, 2:])
        h_gradient = torch.mul(h_gradient, h_mask)

        valid_num = torch.sum(h_mask) + torch.sum(v_mask)

        gradient_loss = torch.sum(h_gradient) + torch.sum(v_gradient)
        gradient_loss = gradient_loss / (valid_num + 1e-8)

        return gradient_loss

    def forward(self, pred, gt, minmax_meanstd):
        mask = gt > self.valid_threshold
        grad_term = 0.0
        gt_mean = minmax_meanstd[:, 2]
        gt_std = minmax_meanstd[:, 3]
        gt_trans = (gt - gt_mean[:, None, None, None]) / (gt_std[:, None, None, None] + 1e-8)
        for i in range(self.scales_num):
            d_gt = gt_trans[:, :, ::2, ::2]
            d_pred = pred[:, :, ::2, ::2]
            d_mask = mask[:, :, ::2, ::2]
            grad_term += self.one_scale_gradient_loss(d_pred, d_gt, d_mask)
        return grad_term

This is the multi-scale gradient loss that I used. I have revised the original formulation.
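
As a usage note, the layout of minmax_meanstd is only implied by the forward() above: columns 2 and 3 appear to hold each image's ground-truth mean and std. A minimal sketch of how the loss might be called (shapes and the min/max columns are assumptions, not an official snippet):

import torch

# Sketch only: B, H, W and the min/max columns are assumptions; only columns
# 2 (mean) and 3 (std) are actually read by the forward() posted above.
B, H, W = 2, 64, 64
pred = torch.rand(B, 1, H, W)        # network prediction
gt = torch.rand(B, 1, H, W)          # ground-truth depth; invalid pixels would be <= 0

gt_flat = gt.view(B, -1)
minmax_meanstd = torch.stack(
    [gt_flat.min(dim=1).values,      # column 0: per-image min (assumed)
     gt_flat.max(dim=1).values,      # column 1: per-image max (assumed)
     gt_flat.mean(dim=1),            # column 2: per-image mean (used above)
     gt_flat.std(dim=1)],            # column 3: per-image std (used above)
    dim=1)                           # shape [B, 4]

criterion = MSGIL_NORM_Loss(scale=4)
loss = criterion(pred, gt, minmax_meanstd)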

frischzenger commented 3 years ago
def forward(self, pred, gt, minmax_meanstd):
    mask = gt > self.valid_threshold
    grad_term = 0.0
    gt_mean = minmax_meanstd[:, 2]
    gt_std = minmax_meanstd[:, 3]
    gt_trans = (gt - gt_mean[:, None, None, None]) / (gt_std[:, None, None, None] + 1e-8)
    for i in range(self.scales_num):
        d_gt = gt_trans[:, :, ::2, ::2]  # this value does not change during the loop
        d_pred = pred[:, :, ::2, ::2]    # this value does not change during the loop
        d_mask = mask[:, :, ::2, ::2]    # this value does not change during the loop
        grad_term += self.one_scale_gradient_loss(d_pred, d_gt, d_mask)
    return grad_term

Since d_gt, d_pred, and d_mask do not change during the for loop, what is the point of computing one_scale_gradient_loss repeatedly? Every iteration behaves the same.

phamdat09 commented 2 years ago

Same question

YvanYin commented 2 years ago

Hi, sorry for the late response. @frischzenger, you are right; the problem is exactly as shown in the code you posted. This is a bug in my code.
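
For reference, a possible fix along the lines @frischzenger pointed out (a sketch, assuming the intent was to halve the resolution at every scale) moves the downsampling inside the loop:

def forward(self, pred, gt, minmax_meanstd):
    # Sketch of a possible fix: downsample progressively so each iteration
    # evaluates the gradient loss at a coarser scale.
    mask = gt > self.valid_threshold
    grad_term = 0.0
    gt_mean = minmax_meanstd[:, 2]
    gt_std = minmax_meanstd[:, 3]
    gt_trans = (gt - gt_mean[:, None, None, None]) / (gt_std[:, None, None, None] + 1e-8)
    d_gt, d_pred, d_mask = gt_trans, pred, mask
    for i in range(self.scales_num):
        grad_term += self.one_scale_gradient_loss(d_pred, d_gt, d_mask)
        # Halve the resolution for the next (coarser) scale.
        d_gt = d_gt[:, :, ::2, ::2]
        d_pred = d_pred[:, :, ::2, ::2]
        d_mask = d_mask[:, :, ::2, ::2]
    return grad_term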

phamdat09 commented 2 years ago

@YvanYin This bug is still present in your released code.