KexianHust / Structure-Guided-Ranking-Loss

Structure-Guided Ranking Loss for Single Image Depth Prediction

multi-scale scale-invariant gradient matching loss in inverse depth space #13

Closed ewrfcas closed 3 years ago

ewrfcas commented 3 years ago

Thanks for the good work! I have some questions about the multi-scale scale-invariant gradient matching loss in inverse depth space. Here is the code from ref[22], but I think it differs from the grad_loss used in this method.

def GradientLoss(self, log_prediction_d, mask, log_gt):
        # Number of valid pixels, used for normalization.
        N = torch.sum(mask)

        # Difference between prediction and ground truth (in log space),
        # zeroed out at invalid pixels.
        log_d_diff = log_prediction_d - log_gt
        log_d_diff = torch.mul(log_d_diff, mask)

        # Vertical gradients of the difference map with an interval of 2 pixels;
        # a pair contributes only if both pixels are valid.
        v_gradient = torch.abs(log_d_diff[0:-2, :] - log_d_diff[2:, :])
        v_mask = torch.mul(mask[0:-2, :], mask[2:, :])
        v_gradient = torch.mul(v_gradient, v_mask)

        # Horizontal gradients, same interval of 2 pixels.
        h_gradient = torch.abs(log_d_diff[:, 0:-2] - log_d_diff[:, 2:])
        h_mask = torch.mul(mask[:, 0:-2], mask[:, 2:])
        h_gradient = torch.mul(h_gradient, h_mask)

        gradient_loss = torch.sum(h_gradient) + torch.sum(v_gradient)
        gradient_loss = gradient_loss / N

        return gradient_loss

I think the log is not actually used here, and what is the 'inverse depth space'? Besides, what is the gradient interval? In the code above, the interval is 2 at all scales. Thanks!

Update: this loss does not work well in my re-implementation; it reduces the performance.

KexianHust commented 3 years ago

In my opinion, the input (log_prediction_d) to GradientLoss is the log depth. In our method, we compute the losses in inverse depth space. The gradient interval is 1 at each scale.
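For readers of this thread, here is a minimal sketch of a multi-scale gradient matching loss computed in inverse depth space with a gradient interval of 1 at each scale. The function name, the number of scales, the striding scheme, and the (N, C, H, W) tensor layout are assumptions for illustration, not taken from the paper or this repo.

```python
import torch

def gradient_matching_loss(pred_inv_depth, gt_inv_depth, mask, num_scales=4):
    # Difference between prediction and ground truth in inverse depth space,
    # zeroed at invalid pixels. Tensors are assumed to be (N, C, H, W).
    diff = (pred_inv_depth - gt_inv_depth) * mask
    total = 0.0
    for scale in range(num_scales):
        step = 2 ** scale  # downsample by striding at each scale (assumption)
        d = diff[:, :, ::step, ::step]
        m = mask[:, :, ::step, ::step]

        # Horizontal and vertical gradients with an interval of 1 pixel;
        # a pair contributes only if both of its pixels are valid.
        grad_x = torch.abs(d[:, :, :, 1:] - d[:, :, :, :-1]) * m[:, :, :, 1:] * m[:, :, :, :-1]
        grad_y = torch.abs(d[:, :, 1:, :] - d[:, :, :-1, :]) * m[:, :, 1:, :] * m[:, :, :-1, :]

        # Normalize by the number of valid pixel pairs at this scale.
        n_valid = torch.sum(m[:, :, :, 1:] * m[:, :, :, :-1]) + torch.sum(m[:, :, 1:, :] * m[:, :, :-1, :])
        total = total + (torch.sum(grad_x) + torch.sum(grad_y)) / torch.clamp(n_valid, min=1.0)
    return total
```

Whether this striding scheme matches ref[22] or the authors' actual implementation is not confirmed in this thread.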

ewrfcas commented 3 years ago

Thanks for the reply! I still have a few questions.

  1. The GT depth ranges from 0 to 1, where 0 is the farthest (many 0-depth pixels are outside the mask) and 1 is the nearest, so log(0) will cause errors.
  2. Besides, the model output (preds) is unconstrained, so when preds <= 0, log(preds) will cause errors.
  3. If I understand correctly, inverse depth space means inv_depth = 1/depth, but 1/0 also causes errors.

Should all depths be clipped to be larger than a minimum depth such as 1e-3? (But 1/1e-3 = 1000, which I think is still a bad value for model training.)

ewrfcas commented 3 years ago

I would appreciate it if you could provide or explain some necessary data preprocessing.

KexianHust commented 3 years ago

Sorry for the late reply. As mentioned in our paper, we only compute the losses over valid pixels (gt_depth > 0). Note that we compute the losses in inverse depth space, NOT in log space.
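To make this concrete, here is a minimal preprocessing sketch under the assumptions stated in this thread: losses are computed only on valid pixels (gt_depth > 0), the ground truth is converted to inverse depth, and the network output is treated as disparity (already inverse depth, see the exchange below), so it is not inverted again. The function name is hypothetical, not from the repo.

```python
import torch

def prepare_inverse_depth(pred_disparity, gt_depth):
    # Valid pixels are those with a positive ground-truth depth.
    valid_mask = (gt_depth > 0).float()

    # Convert ground-truth depth to inverse depth only where valid,
    # leaving invalid pixels at zero so they never contribute to the loss.
    gt_inv_depth = torch.zeros_like(gt_depth)
    gt_inv_depth[gt_depth > 0] = 1.0 / gt_depth[gt_depth > 0]

    # The prediction is assumed to already be a disparity (inverse depth) map,
    # so it is passed through unchanged.
    return pred_disparity, gt_inv_depth, valid_mask
```

These tensors could then be fed to a gradient loss like the sketch earlier in this thread; the mask ensures the undefined inverse depth at gt_depth == 0 is never used.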

ewrfcas commented 3 years ago

Thanks. Since the disparity is already in inverse depth space, should I invert the disparity again for the gradient loss?

KexianHust commented 3 years ago

No, you don't have to.

ewrfcas commented 3 years ago

Thanks for your kind advice!