apple / ml-neuman

Official repository of NeuMan: Neural Human Radiance Field from a Single Video (ECCV 2022)

Question about near far computation #58

Open buaacyw opened 1 year ago

buaacyw commented 1 year ago

Hi, thanks for your work! I have the following questions about the near-far computation in NeuMan:

  1. In the geometry_guided_near_far_torch method, why is delta z computed in this way? I also found that geometry_threshold is calculated in this way. Why did you choose the distance between those two joints?
  2. In the ray_to_samples method, samples are computed as pts = rays_o + rays_d * z_vals, where rays_d holds normalized directions and z_vals are sampled between near and far. Here near and far are depths (perpendicular distance to the image plane, not distance to the principal point). But if points are sampled as pts = rays_o + rays_d * z_vals, the perpendicular distance between these points and the image plane does not lie between near and far. Why is this? Is this wrong?
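
To make the second question concrete, here is a minimal sketch in plain Python. It is not code from the repository: the camera sitting at the origin and looking along +Z, and the helper names `normalize` and `sample_point`, are assumptions for illustration. It shows that with a unit-length `rays_d`, the sample at `z_vals = t` has a perpendicular depth of `t * cos(theta)`, where `theta` is the angle between the ray and the camera Z-axis, so the depth equals `t` only for the central ray.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def sample_point(rays_o, rays_d, z_val):
    """pts = rays_o + rays_d * z_vals, for a single ray and a single sample."""
    return tuple(o + d * z_val for o, d in zip(rays_o, rays_d))

# Assumed setup: camera at the origin, looking along +Z in its own frame.
rays_o = (0.0, 0.0, 0.0)
central = normalize((0.0, 0.0, 1.0))   # ray through the image center
oblique = normalize((1.0, 0.0, 1.0))   # ray 45 degrees off the Z-axis

near = 2.0  # intended as a depth along the camera Z-axis

# Perpendicular depth of each sample is its Z coordinate in this frame.
depth_central = sample_point(rays_o, central, near)[2]
depth_oblique = sample_point(rays_o, oblique, near)[2]

print(depth_central)  # equals near
print(depth_oblique)  # equals near * cos(45 degrees), i.e. less than near
```

So for off-center rays the samples start and end closer to the camera, in depth terms, than near and far suggest.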

Thanks!

jiangwei221 commented 1 year ago

Hi,

  1. This is inspired by NeuralActor; see Appendix A.2 of the NeuralActor paper. As for the distance threshold, we want it to adapt to each individual, so we use a bone length. In our experience, the selected bone gives a reasonably good length.
  2. I see. I think you are right: we need to account for the angle between the ray and the camera Z-axis to compensate. We can add a fix to the dataloader: https://github.com/apple/ml-neuman/blob/84d4685074327227457276f1b60d467f3dab1211/datasets/background_rays.py#L86-L87

buaacyw commented 1 year ago

Thanks, I understand! I think we only need to fix the background rays. The near and far of the human rays are correct, since they are calculated in this way: https://github.com/apple/ml-neuman/blob/84d4685074327227457276f1b60d467f3dab1211/utils/ray_utils.py#L204-L219

There, z0 is the distance to the principal point. I found that most NeRF works, including the original NeRF, do not take this into account, but they still get good-looking depth maps.
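
For later readers, the compensation for the background rays could look roughly like the sketch below. This is an assumption about the shape of such a fix, not the change actually made in the repository: the helper `depth_to_ray_distance` is hypothetical, and the camera Z-axis is assumed to point forward (in world coordinates it would be the forward axis taken from the camera-to-world rotation). The idea is to divide a depth-based near or far by the cosine of the angle between the unit ray direction and the camera Z-axis, so that pts = rays_o + rays_d * z_vals lands at the intended depth.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def depth_to_ray_distance(depth, ray_d, cam_z_axis):
    """Convert a depth along the camera Z-axis into a distance along the ray.

    Hypothetical helper: assumes cam_z_axis is the camera's forward direction.
    """
    cos_theta = dot(normalize(ray_d), normalize(cam_z_axis))
    if cos_theta <= 0.0:
        raise ValueError("ray does not point in front of the camera")
    return depth / cos_theta

# Assumed setup: camera looking along +Z, one ray 45 degrees off axis.
cam_z_axis = (0.0, 0.0, 1.0)
ray_d = normalize((1.0, 0.0, 1.0))
near_depth, far_depth = 2.0, 6.0

near_dist = depth_to_ray_distance(near_depth, ray_d, cam_z_axis)
far_dist = depth_to_ray_distance(far_depth, ray_d, cam_z_axis)

# Sanity check: a sample at near_dist along the unit ray has depth near_depth.
recovered_near = near_dist * dot(ray_d, cam_z_axis)
recovered_far = far_dist * dot(ray_d, cam_z_axis)
```

With this conversion the per-ray near and far grow toward the image corners, while the sampled points stay between the intended depth planes.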