GANWANSHUI / SimpleOccupancy

(IEEE TIV) A Comprehensive Framework for 3D Occupancy Estimation in Autonomous Driving
203 stars · 11 forks

Questions about depth map visualization and depth rendering #17

Open Lazyangel opened 5 months ago

Lazyangel commented 5 months ago

Hello, and thank you for your great work. I have run into some problems and hope to discuss them with you. Forgive me, I need some space to explain my problem.

  1. First of all, the depth rendered by your method achieves a lower Abs Rel than traditional depth estimation methods, but the visualized depth map looks worse. So my first question is: what causes this? [image]

  2. To explore this question, I ran some experiments on KITTI (I don't have sufficient GPU resources, so I chose a smaller dataset). I expected to achieve better results than traditional self-supervised depth estimation methods (like Monodepth2), but the Abs Rel was only close to Monodepth2's, and training took more time. At the same time, the visualized depth map still looks worse than Monodepth2's.

  3. Further, I wanted to know whether using the GT occupancy labels can render a better depth map, so I used the semantic GT labels to render some depth maps and got the following results. [image]

This looks strange. I found it is a problem with the depth visualization function: I need to pass `direct=True` to `visualize_depth`, i.e. `pred_depth_color = visualize_depth(pred_depth.copy(), direct=True)`, and then I get a normal result. [image] So my second question is: does volume rendering produce depth or disparity? Do I need depth or disparity to visualize depth maps? I think visualizing depth maps requires disparity rather than depth, so your method needs to first convert depth to disparity during visualization; this is also what Monodepth2 does.

```python
import cv2
import numpy as np

def visualize_depth(depth, mask=None, depth_min=None, depth_max=None, direct=False):
    """Visualize a depth map with a colormap.
       Rescales the values so that depth_min and depth_max map to 0 and 1,
       respectively. If direct is False, the input is first inverted
       (i.e. converted to disparity) before normalization.
    """
    if not direct:
        depth = 1.0 / (depth + 1e-6)
    invalid_mask = np.logical_or(np.isnan(depth), np.logical_not(np.isfinite(depth)))
    if mask is not None:
        invalid_mask |= np.logical_not(mask)
    if depth_min is None:
        depth_min = np.percentile(depth[np.logical_not(invalid_mask)], 5)
    if depth_max is None:
        depth_max = np.percentile(depth[np.logical_not(invalid_mask)], 95)

    depth = np.clip(depth, depth_min, depth_max)
    depth[invalid_mask] = depth_max

    depth_scaled = (depth - depth_min) / (depth_max - depth_min)
    depth_scaled_uint8 = np.uint8(depth_scaled * 255)
    depth_color = cv2.applyColorMap(depth_scaled_uint8, cv2.COLORMAP_MAGMA)
    depth_color[invalid_mask, :] = 0

    return depth_color
```
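As a side note on the `direct` flag, here is a minimal numpy-only sketch (with hypothetical depth values) of how the percentile normalization above behaves on metric depth versus its inverse (disparity): near-range structure is compressed when depth is visualized directly and spread out when disparity is.

```python
import numpy as np

def normalize_percentile(x, lo=5, hi=95):
    """Clip to the [lo, hi] percentile range and rescale to [0, 1],
    mirroring the normalization in visualize_depth."""
    x = x.astype(np.float64)
    x_min = np.percentile(x, lo)
    x_max = np.percentile(x, hi)
    x = np.clip(x, x_min, x_max)
    return (x - x_min) / (x_max - x_min)

# Hypothetical metric depths along a road: 2 m (near) to 80 m (far).
depth = np.linspace(2.0, 80.0, 100)

direct = normalize_percentile(depth)      # visualize depth directly
disp = normalize_percentile(1.0 / depth)  # visualize disparity instead

# Contrast over the nearest 10% of samples: disparity spreads it out far more.
print(direct[9] - direct[0], disp[0] - disp[9])
```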

But what confuses me is that when I use the GT occ labels instead of the predicted occ probabilities, the visualization is incorrect. Have I misunderstood anything? Looking forward to your reply!

GANWANSHUI commented 5 months ago

Hi, thanks for the question.

1, 2. We have also observed this. The visualization may not truly reflect the performance, and the encoder-decoder fashion (Monodepth2) usually produces a sharper depth map than rendering does. Maybe you can visualize the depth error map for a deeper investigation.

  3. The depth map obtained by rendering is always metric depth, not disparity. I think there is no need to adjust the rendered depth map before the visualization function.
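To make the metric-depth point concrete, here is a generic volume-rendering depth sketch (plain alpha compositing with made-up per-ray values, not necessarily the repo's exact implementation): the rendered value is an expected metric depth along the ray, not a disparity.

```python
import numpy as np

def render_depth(sample_depths, occ_prob):
    """Expected metric depth along one ray via alpha compositing:
    depth = sum_i w_i * t_i, where w_i = alpha_i * prod_{j<i}(1 - alpha_j).
    A generic volume-rendering sketch, not necessarily the repo's code."""
    alpha = np.asarray(occ_prob, dtype=np.float64)
    # Transmittance reaching each sample: the probability of not having
    # hit anything before it.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans
    return float(np.sum(weights * sample_depths))

# Samples at 1..10 m along a ray; a fully occupied voxel at 4 m.
t = np.arange(1.0, 11.0)
prob = np.zeros(10)
prob[3] = 1.0
print(render_depth(t, prob))  # 4.0: a metric depth, not a disparity
```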

Lazyangel commented 5 months ago

Thank you very much for your reply! (I'd rather not keep using clumsy machine-translated English.)

Lazyangel commented 5 months ago

> Hi, thanks for the question.
>
> 1, 2. We have also observed this. The visualization may not truly reflect the performance, and the encoder-decoder fashion (Monodepth2) usually produces a sharper depth map than rendering does. Maybe you can visualize the depth error map for a deeper investigation. 3. The depth map obtained by rendering is always metric depth, not disparity. I think there is no need to adjust the rendered depth map before the visualization function.

Hello, sorry to bother you again. I have a few follow-up questions:

1. Regarding the third point: indeed the visualization function should not be modified, but the GT occ labels contain no occupancy in the sky regions, so the rendered depth there is 0. As a result, the visualization shows the sky in yellow while the road surface is completely black. Do you have any suggestions for getting a better depth-map visualization from the GT occ labels?

  2. When visualizing the GT occ depth map, I replaced the output of the 3D CNN directly with the GT:

```python
Voxel_feat_list = self._3DCNN(Voxel_feat)
if self.opt.use_gt_occ:
    Voxel_feat_list = [inputs['occ']]
```

    But the depth metrics I compute for the depth rendered from the GT occ show large errors, and I don't understand where the mistake is:

    | abs_rel | sq_rel | rmse  | rmse_log | a1    | a2    | a3    |
    |---------|--------|-------|----------|-------|-------|-------|
    | 0.120   | 1.988  | 9.422 | 0.943    | 0.875 | 0.923 | 0.944 |

    I suspect this may be related to the order in which the sigmoid is applied to the 3D CNN output. The original code first interpolates at the sample points and then applies the sigmoid to obtain probabilities, but with the GT I can only obtain probabilities directly (in the GT, unoccupied prob = 0 and occupied prob = 1), so the interpolation is performed on probabilities instead. Could you explain why your code interpolates at the sample points first and applies the sigmoid afterwards? I tried swapping the order in this part, i.e. applying the sigmoid first to get probabilities and then interpolating, but training failed with that change.

3. I computed the error map between the predicted depth and the GT depth and visualized it, but I cannot seem to get useful information from it, because I cannot distinguish the error magnitudes of different regions. [image] Do you have a better way to visualize the error map? My visualization code is as follows:

```python
import cv2
import numpy as np

def visualize_error(pred_depth, gt_depth):
    mask = gt_depth != 0
    masked_pred_depth = np.where(mask, pred_depth, 0)
    error = np.abs(masked_pred_depth - gt_depth)
    # Note: this inverts the error, so SMALL errors map to LARGE values.
    error = 1.0 / (error + 1e-6)
    invalid_mask = np.logical_or(np.isnan(error), np.logical_not(np.isfinite(error)))
    valid_errors = error[np.nonzero(error)]
    error_min = np.min(valid_errors)
    error_max = np.max(error)

    error = np.clip(error, error_min, error_max)
    error[invalid_mask] = error_max

    error_scaled = (error - error_min) / (error_max - error_min)
    error_scaled_uint8 = np.uint8(error_scaled * 255)
    error_color = cv2.applyColorMap(error_scaled_uint8, cv2.COLORMAP_MAGMA)
    error_color[invalid_mask, :] = 0

    return error_color
```

    I would greatly appreciate any advice and help. Looking forward to your reply!
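A small illustrative sketch of the sigmoid-ordering question above (1-D toy logits, not the repo's code): because the sigmoid is nonlinear, interpolating logits and then applying the sigmoid generally differs from applying the sigmoid first and then interpolating probabilities. Also note that pushing hard 0/1 GT occupancy through a sigmoid would map it to 0.5 and about 0.73, so GT labels should skip the sigmoid entirely.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Two neighboring voxel logits and a sample point midway between them.
logit_a, logit_b = -4.0, 2.0
w = 0.5  # trilinear interpolation weight, reduced to 1-D

# Interpolate logits first, sigmoid after (the order in the original code).
p_interp_then_sig = sigmoid(w * logit_a + (1 - w) * logit_b)  # sigmoid(-1)

# Sigmoid first, interpolate probabilities after (the swapped order).
p_sig_then_interp = w * sigmoid(logit_a) + (1 - w) * sigmoid(logit_b)

print(p_interp_then_sig, p_sig_then_interp)  # the two orders disagree

# Hard GT occupancy is already a probability; a sigmoid would corrupt it:
print(sigmoid(0.0), sigmoid(1.0))  # 0.5 and ~0.73 instead of 0 and 1
```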

GANWANSHUI commented 5 months ago

1, 2) You can set the value at the last sample point of each ray to a fixed number, e.g. 100; this guarantees that the sky region gets the maximum depth value. As for the large errors: part of the cause may be the previously missing depth values in the sky region, and the sampling of the discretized space also introduces some error.
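The fixed-background suggestion can be sketched as follows (standard alpha compositing; `FAR_DEPTH = 100.0` is the example value mentioned above, and the function is illustrative, not the repo's implementation):

```python
import numpy as np

FAR_DEPTH = 100.0  # fixed background depth; 100 is the example value above

def render_depth_with_background(sample_depths, occ_prob):
    """Alpha-composite a ray, but append a fully occupied sample at
    FAR_DEPTH so rays that hit nothing (e.g. sky rays, where GT occupancy
    is empty) render the maximum depth instead of 0. A sketch of the
    suggestion above, not the repo's exact implementation."""
    t = np.append(np.asarray(sample_depths, dtype=np.float64), FAR_DEPTH)
    alpha = np.append(np.asarray(occ_prob, dtype=np.float64), 1.0)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans
    return float(np.sum(weights * t))

# A sky ray: no occupied voxels, yet the rendered depth is FAR_DEPTH.
print(render_depth_with_background(np.arange(1.0, 11.0), np.zeros(10)))  # 100.0
```

Rays that do hit an occupied voxel are unaffected, since the appended sample receives zero transmittance.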

3) I have not studied error maps in detail; ideally you would get something like the error maps commonly shown in the literature. You could compare a few different approaches.
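One common alternative to the inverted-error visualization earlier in the thread is to clip the absolute relative error to a fixed cap, so that brighter always means larger error and the maps are comparable across frames. A numpy-only sketch (the `max_abs_rel` cap is a hypothetical choice, not a value from the paper):

```python
import numpy as np

def error_map(pred_depth, gt_depth, max_abs_rel=0.3):
    """Per-pixel absolute relative error, clipped to a fixed cap so that
    brighter always means larger error. Pixels with gt == 0 are treated
    as invalid and set to 0. max_abs_rel is a hypothetical choice."""
    valid = gt_depth > 0
    abs_rel = np.zeros_like(gt_depth, dtype=np.float64)
    abs_rel[valid] = np.abs(pred_depth[valid] - gt_depth[valid]) / gt_depth[valid]
    scaled = np.clip(abs_rel / max_abs_rel, 0.0, 1.0)
    out = (scaled * 255).astype(np.uint8)
    out[~valid] = 0
    return out  # pass to cv2.applyColorMap(out, cv2.COLORMAP_MAGMA) for color

# Toy 2x2 example: one invalid pixel (gt == 0), one exact prediction,
# one moderate error, one error at the cap.
gt = np.array([[10.0, 0.0], [5.0, 20.0]])
pred = np.array([[11.0, 3.0], [5.0, 14.0]])
print(error_map(pred_depth=pred, gt_depth=gt))
```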