Megvii-BaseDetection / BEVDepth

Official code for BEVDepth.

Why is the computed depth prediction result different from your paper's ablation experiment? #58

Open Tony-Hou opened 2 years ago

Tony-Hou commented 2 years ago

```python
import logging

import numpy as np

logger = logging.getLogger(__name__)


def safe_log10(x, eps=1e-10):
    result = np.where(x > eps, x, -10)
    np.log10(result, out=result, where=result > 0)
    return result


def safe_log(x, eps=1e-5):
    return np.log(x + eps)


def calculate(gt, pred):
    if gt.shape[0] == 0:
        return np.nan, np.nan, np.nan, np.nan, np.nan, np.nan

    # thresh = np.maximum((gt / pred), (pred / gt))
    # a1 = (thresh < 1.25).mean()
    # a2 = (thresh < 1.25 ** 2).mean()
    # a3 = (thresh < 1.25 ** 3).mean()

    # abs_rel: absolute relative error
    abs_rel = np.mean(np.divide(np.abs(gt - pred), gt, out=np.zeros_like(pred), where=gt != 0))
    sq_rel = np.mean(np.divide((gt - pred) ** 2, gt, out=np.zeros_like(pred), where=gt != 0))

    rmse = (gt - pred) ** 2
    rmse = np.sqrt(rmse.mean())

    rmse_log = (safe_log(gt) - safe_log(pred)) ** 2
    rmse_log = np.sqrt(rmse_log.mean())

    # SILog (scale-invariant log) metric
    err = safe_log(pred) - safe_log(gt)
    silog = np.sqrt(np.mean(err ** 2) - np.mean(err) ** 2) * 100
    if np.isnan(silog):
        silog = 0

    log_10 = (np.abs(safe_log10(gt) - safe_log10(pred))).mean()
    logger.info('abs_rel: {}\t rmse: {}\t log_10: {}\t rmse_log: {}\t silog: {}\t sq_rel: {}\t'.format(
        abs_rel, rmse, log_10, rmse_log, silog, sq_rel))
    return [abs_rel, rmse, log_10, rmse_log, silog, sq_rel]
```
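For reference, a minimal way to exercise `calculate` directly (toy, hypothetical values; `gt` and `pred` are assumed to already be metric depths in meters, flattened to 1-D):

```python
import numpy as np

# Hypothetical per-pixel ground-truth and predicted depths in meters,
# flattened into 1-D arrays as calculate() expects.
gt = np.array([10.0, 25.5, 40.0, 3.2], dtype=np.float32)
pred = np.array([9.7, 26.0, 42.1, 3.0], dtype=np.float32)

abs_rel, rmse, log_10, rmse_log, silog, sq_rel = calculate(gt, pred)
print(abs_rel, rmse, silog)
```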

```python
def eval_step(self, batch, batch_idx, prefix: str):
    (sweep_imgs, mats, _, img_metas, _, gt_labels, depth_labels) = batch

    if torch.cuda.is_available():
        for key, value in mats.items():
            mats[key] = value.cuda()
        sweep_imgs = sweep_imgs.cuda()
        gt_labels = [gt_label.cuda() for gt_label in gt_labels]
    preds, depth_preds = self.model(sweep_imgs, mats)
    if len(depth_labels.shape) == 5:
        depth_labels = depth_labels[:, 0, ...]

    depth_labels = self.get_downsampled_gt_depth(depth_labels.cuda())
    depth_preds = depth_preds.permute(0, 2, 3, 1).contiguous().view(-1, self.depth_channels)
    fg_mask = torch.max(depth_labels, dim=1).values > 0.0
    depth_result = calculate(depth_labels[fg_mask].cpu().numpy(),
                             np.round(depth_preds[fg_mask].cpu().numpy(), 2))
    if isinstance(self.model, torch.nn.parallel.DistributedDataParallel):
        results = self.model.module.get_bboxes(preds, img_metas)
    else:
        results = self.model.get_bboxes(preds, img_metas)
    for i in range(len(results)):
        results[i][0] = results[i][0].tensor.detach().cpu().numpy()
        results[i][1] = results[i][1].detach().cpu().numpy()
        results[i][2] = results[i][2].detach().cpu().numpy()
        results[i].append(img_metas[i])
    return results
```

The following result differs from the ablation experiment in your paper, using your pretrained bev_depth_lss_r50_256x704_128x128_20e_cbgs_2key_da.pth weights: [screenshot of depth metric results]

yinchimaoliang commented 2 years ago

I think the problem is how you used the depth. What you get from the model is a depth distribution, and you are using that distribution directly to compute the depth metrics.

Tony-Hou commented 2 years ago

> I think the problem is how you used the depth. What you get from the model is a depth distribution, and you are using that distribution directly to compute the depth metrics.

What is the correct method to calculate the depth metrics?

yinchimaoliang commented 2 years ago

This will get you the depth prediction values:

```python
depth_preds = (depth_preds * torch.arange(0, self.depth_channels, device='cuda')[None, None, :]).sum(2)
depth_preds = (depth_preds + 0.5) * self.dbound[2] + self.dbound[0]
```
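Not something from the repo, but wrapping that conversion in a small helper shows how both the predicted distribution and the one-hot GT from `get_downsampled_gt_depth` could be turned into metric depth before calling `calculate`. This is a sketch under the assumption that `dbound` is `[d_min, d_max, d_step]` as in the LSS-style config and the input is an `(N, D)` tensor:

```python
import torch


def distribution_to_depth(depth_dist, dbound):
    """Map an (N, D) per-pixel depth distribution to metric depth (N,) by taking
    the expectation over bin indices and applying the bin -> meters formula above.

    For a one-hot ground-truth tensor this reduces to the depth of the labelled bin.
    """
    depth_channels = depth_dist.shape[1]
    bins = torch.arange(depth_channels, device=depth_dist.device, dtype=depth_dist.dtype)
    expected_bin = (depth_dist * bins[None, :]).sum(dim=1)
    return (expected_bin + 0.5) * dbound[2] + dbound[0]


# Hypothetical usage inside eval_step, reusing names from the snippet above:
# pred_depth = distribution_to_depth(depth_preds[fg_mask], self.dbound)
# gt_depth = distribution_to_depth(depth_labels[fg_mask], self.dbound)
# depth_result = calculate(gt_depth.cpu().numpy(), pred_depth.detach().cpu().numpy())
```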