bmild / nerf

Code release for NeRF (Neural Radiance Fields)
http://tancik.com/nerf
MIT License

How to read the depth map? #77

Closed · SYSUGrain closed this issue 3 years ago

SYSUGrain commented 3 years ago

I read the depth maps in the synthetic test set as d[u][v] = (255 - depth[u][v]) / 255, but the resulting depths are not consistent across views. How can I get an accurate depth map?

markomih commented 3 years ago

I believe you need to multiply it by the far bound (6 for the blender dataset): d[u][v] = far*(255-depth[u][v])/255
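For example, a minimal sketch of that conversion, assuming an 8-bit depth PNG and the far bound of 6 used for the synthetic scenes (the file name below is just a placeholder):

import cv2

far = 6.0  # far bound of the synthetic (blender) scenes
depth_png = cv2.imread('r_0_depth_0002.png', cv2.IMREAD_GRAYSCALE).astype(float)
depth = far * (255.0 - depth_png) / 255.0  # metric depth per the formula above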

tancik commented 3 years ago

We never used the depth data in the project so I can't confirm that it is correct. These files were mistakenly included in the dataset. We decided not to delete them just in case someone found them useful.

NagabhushanSN95 commented 1 year ago

@markomih thanks! It works :)

LangHiKi commented 1 year ago

> @markomih thanks! It works :)

Hello! I'm facing the same problem. May I ask for the script you used to check the multi-view consistency of the depth maps?

NagabhushanSN95 commented 1 year ago

Hi @LangHiKi, unfortunately I don't think I saved the exact code I used to verify. But what I did was as follows:

  1. Converted the depth images to actual depth using the above formula.
  2. Using the warping code from here, I warped one of the frames to the view of the other (a rough sketch of such a check is included after this list).
  3. Ideally the warped image should overlap exactly with the true image. However, due to some inconsistencies in the depth map around the edges, there will be some faint lines. As long as most of the scene matches, it should be fine.

PS: Also, when I tried this, there was some shift between the warped and the true images. I didn't think much of it at the time since the pose alignment was correct, but in hindsight there is probably a residual scale factor in the depth conversion. It should be easy to find by trial and error.
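For reference, here is a minimal sketch of such a check, not the original verification code: the function and its arguments are my own, and it assumes metric z-depth, a shared intrinsic matrix K, 4x4 world-to-camera poses, and an OpenCV-style camera convention (the Blender cameras look down -z, so an axis flip may be needed before using it on this dataset).

import cv2
import numpy as np

def warp_to_view1(img2, depth1, K, w2c1, w2c2):
    # Warp image 2 into view 1 using view 1's depth map.
    # depth1: (H, W) metric z-depth, K: (3, 3), w2c*: (4, 4) world-to-camera poses.
    h, w = depth1.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], -1).reshape(-1, 3).T.astype(float)  # 3 x N

    # Back-project every view-1 pixel to a 3D point in camera-1 coordinates.
    cam1 = np.linalg.inv(K) @ pix * depth1.reshape(1, -1)

    # Camera 1 -> world -> camera 2 (homogeneous coordinates).
    cam1_h = np.vstack([cam1, np.ones((1, cam1.shape[1]))])
    cam2 = (w2c2 @ np.linalg.inv(w2c1) @ cam1_h)[:3]

    # Project into view 2 and sample image 2 at those locations.
    proj = K @ cam2
    uv2 = (proj[:2] / proj[2:]).T.reshape(h, w, 2).astype(np.float32)
    return cv2.remap(img2, uv2[..., 0], uv2[..., 1], cv2.INTER_LINEAR)

If the depth maps are consistent, the returned image should line up with the true view-1 image except around depth discontinuities; a simple absolute difference makes any residual shift or scale error easy to spot.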

LangHiKi commented 1 year ago

@NagabhushanSN95 Thank you for your reply! I ran into a problem similar to the one you mention in the "PS". After fusing the depth maps, the transformed objects from different views end up in the same place but do not overlap perfectly; they seem to be off by a rotation. I'll try your approach later, thanks again!

NagabhushanSN95 commented 1 year ago

If you used code similar to mine, the camera poses have to be in world-to-camera format. I think (I don't remember for sure) the dataset poses are in camera-to-world format, so you might have to invert the poses before any processing.
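For example, a short sketch assuming c2w is a 4x4 homogeneous camera-to-world matrix from the dataset; numpy's generic inverse is enough, but the closed form is shown as well:

import numpy as np

# c2w: 4x4 camera-to-world pose (assumed homogeneous).
w2c = np.linalg.inv(c2w)

# Equivalent closed form, inverting rotation and translation explicitly.
R, t = c2w[:3, :3], c2w[:3, 3]
w2c_closed = np.eye(4)
w2c_closed[:3, :3] = R.T
w2c_closed[:3, 3] = -R.T @ t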

zParquet commented 6 months ago

Thanks for the hints! However, I found that the transformation provided by @markomih didn't actually work for me. I tried some variants and found that the depth should be depth = 8*(255-depth)/255/norm(rays_d) (though I'm not sure whether the 8 comes from near+far). Here is my code to visualize point clouds from multiple views:

import os
import cv2
import torch
import trimesh

# H, W, K, poses and get_rays(H, W, K, c2w) come from the NeRF codebase and are assumed to be in scope.
for idx in [1, 10, 100]:
    rays_o, rays_d = get_rays(H, W, K, torch.Tensor(poses[idx]))  # (H, W, 3), (H, W, 3)
    norm = rays_d.norm(p=2, dim=2, keepdim=True)  # NeRF ray directions are not unit length
    depth_im = cv2.imread(os.path.join('data/nerf_syn/ship/test', 'r_{}_depth_0002.png'.format(idx))).astype(float)[:, :, 0]
    transformed = 8 * (255 - depth_im) / 255  # map 8-bit PNG values to metric depth
    depth_im[depth_im != 0] = transformed[depth_im != 0]  # pixels stored as 0 are kept at depth 0
    depth_im = torch.from_numpy(depth_im).float()
    points = rays_o + rays_d * depth_im[:, :, None] / norm  # back-project each pixel along its ray
    trimesh.Trimesh(points.reshape(-1, 3)[::2].numpy()).export('points{}.ply'.format(idx))

As shown below, the point clouds from the three views (green, red, yellow) are accurately fused together.

[image: fused point clouds from three views]

TSDF fusion based on these depth maps further demonstrates their correctness.

[image: TSDF fusion result]