google / mipnerf

Apache License 2.0

Is it safe to use unnormalized ray directions while sampling points? #36

Closed nitchith closed 1 year ago

nitchith commented 2 years ago

Hi,

While sampling points along rays, the code uses rays.directions for the direction vectors instead of rays.viewdirs. https://github.com/google/mipnerf/blob/84c969e0a623edd183b75693aed72a7e7c22902d/internal/models.py#L70-L81

Original NeRF uses normalized direction vectors for the sampling points. Can you clarify whether we need to replace rays.directions with rays.viewdirs?
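For context, the sampling step in question can be sketched as follows. This is a hypothetical, simplified numpy version of stratified sampling along rays (the function name and uniform spacing are assumptions, not the actual mipnerf code):

```python
import numpy as np

def sample_along_rays(origins, directions, near, far, num_samples):
    """Sample points r(t) = o + t * d along each ray.

    origins:    (N, 3) ray origins
    directions: (N, 3) ray directions -- unnormalized, as in rays.directions
    """
    t_vals = np.linspace(near, far, num_samples)  # (S,)
    # points[i, j] = origins[i] + t_vals[j] * directions[i]  -> (N, S, 3)
    points = origins[:, None, :] + t_vals[None, :, None] * directions[:, None, :]
    return t_vals, points
```

Note that the meaning of t depends entirely on how directions is scaled: if the z-component is -1 in camera space, t is depth; if directions is normalized, t is Euclidean distance along the ray.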

jonbarron commented 2 years ago

Why do you want to do this? The only thing I can say with confidence is that the code works as-is. If you make that change I'm not sure what will happen.

nitchith commented 1 year ago

I just figured out the problem with using normalized directions here. Using normalized directions makes some of the sampled points fall before the near plane and leaves very few sampled points near the far plane.
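This is easy to verify numerically. In the sketch below (an assumed camera-space setup, not code from the repo), the unnormalized direction of an off-axis pixel has z = -1, so t directly equals depth along -z; after normalizing, the same t values map to smaller depths:

```python
import numpy as np

# Ray through an off-axis pixel; unnormalized direction has z = -1.
d_unnorm = np.array([0.5, 0.5, -1.0])
d_norm = d_unnorm / np.linalg.norm(d_unnorm)

near, far = 2.0, 6.0
t = np.linspace(near, far, 5)

depth_unnorm = t * -d_unnorm[2]  # == t: samples span [near, far] in depth
depth_norm = t * -d_norm[2]      # scaled by cos(theta) < 1: every depth < t
```

With the normalized direction, the first sample sits at depth near * cos(theta) < near (in front of the near plane), and the last sample sits at depth far * cos(theta) < far, so no sample ever reaches the far plane in depth.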

elenacliu commented 1 year ago

@nitchith hi nitchith, I don't know if you have seen the latest discussions about this; I'm copying my comment from that issue:

Yes, it is true that r = o + td is shown in the paper, but here t is the distance (or travel time) along the ray, not the depth along the -z axis (as shown in the picture in sillsill77's first reply). If you want to use that equation with d as a normalized direction vector, then t cannot simply be a linear interpolation between the near-plane and far-plane distances. If you do compute t that way, the near and far surfaces of the sampling volume are actually parts of two spheres, not planes.
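The sphere-versus-plane distinction can be checked directly. In this assumed camera-space sketch, many pixel rays all have z = -1 before normalization; at a fixed t, the unnormalized samples share a constant depth (a plane), while the normalized samples share a constant radius (a sphere):

```python
import numpy as np

# Many pixel rays in camera space, all with z = -1 before normalization.
rng = np.random.default_rng(0)
dirs = np.stack([rng.uniform(-1, 1, 100),
                 rng.uniform(-1, 1, 100),
                 -np.ones(100)], axis=-1)              # unnormalized, z = -1
dirs_n = dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)

t = 2.0  # e.g. the near distance
pts_unnorm = t * dirs    # z-coordinate is exactly -t  -> points on a plane
pts_norm = t * dirs_n    # Euclidean norm is exactly t -> points on a sphere
```

So interpolating t between near and far only sweeps out planar slices when the direction's z-component is held at -1; with normalized directions it sweeps out spherical shells.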

If you want you can refer to nerfstudio's implementation:

https://github.com/nerfstudio-project/nerfstudio/blob/6d51af1a24d692d073a9e0cda06278b6a9d44818/nerfstudio/model_components/ray_samplers.py#L386

but the code in the repo:

https://github.com/yenchenlin/nerf-pytorch/blob/a15fd7cb363e93f933012fd1f1ad5395302f63a4/run_nerf.py#L381

actually shows us r=o+z_vals*rays_d.

and rays_d is:

https://github.com/yenchenlin/nerf-pytorch/blob/a15fd7cb363e93f933012fd1f1ad5395302f63a4/run_nerf_helpers.py#L153-L162

you can see that dirs contains the view direction vectors in camera space, and rays_d is the corresponding value in world space. If we think of the problem in camera space, the 3rd component of dirs is -1, which means the equation r = o + z_vals * rays_d samples points linearly along the -z axis, from the near plane to the far plane.
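Putting the pieces together, here is a minimal numpy sketch of get_rays-style camera-space directions (the intrinsics H, W, focal are assumed for illustration) showing that r = o + z_vals * rays_d samples at the same depth for every pixel:

```python
import numpy as np

# Pixel-grid directions in camera space, 3rd component fixed at -1
# (same convention as nerf-pytorch's get_rays, values assumed).
H, W, focal = 4, 4, 2.0
i, j = np.meshgrid(np.arange(W), np.arange(H), indexing='xy')
dirs = np.stack([(i - W * 0.5) / focal,
                 -(j - H * 0.5) / focal,
                 -np.ones((H, W))], axis=-1)           # (H, W, 3)

z_vals = np.linspace(2.0, 6.0, 3)                      # near=2, far=6
# Identity pose: rays_d == dirs, origin at 0, so r = z_vals * dirs.
pts = z_vals[None, None, :, None] * dirs[:, :, None, :]  # (H, W, S, 3)

# Sample k of every ray has depth z_vals[k], regardless of the pixel:
depths = -pts[..., 2]                                  # (H, W, S)
```

Because dirs[..., 2] is -1 everywhere, the depth of sample k is z_vals[k] for every pixel, i.e. the samples form planar slices between the near and far planes, exactly as described above.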