yenchenlin / nerf-pytorch

A PyTorch implementation of NeRF (Neural Radiance Fields) that reproduces the results.
MIT License

Normalize rays_d #76

Open jjparkcv opened 2 years ago

jjparkcv commented 2 years ago

Hi, thank you for sharing a good codebase.

In https://github.com/yenchenlin/nerf-pytorch/blob/a15fd7cb363e93f933012fd1f1ad5395302f63a4/run_nerf_helpers.py#L159 you are using an unnormalized rays_d.

Shouldn't this be normalized so that it indicates a direction vector?

Thanks in advance.

nschoen commented 2 years ago

I noticed this as well. Later, viewdirs is used as input to the model, and viewdirs is essentially rays_d normalized; see https://github.com/yenchenlin/nerf-pytorch/blob/a15fd7cb363e93f933012fd1f1ad5395302f63a4/run_nerf.py#L108

However, at the same time, the unnormalized rays_d is used to compute the points along the ray with the help of z_vals: https://github.com/yenchenlin/nerf-pytorch/blob/a15fd7cb363e93f933012fd1f1ad5395302f63a4/run_nerf.py#L381
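For reference, the two uses look roughly like this (a paraphrase of the linked lines with toy tensors, not an exact copy):

```python
import torch

# toy batch of 4 rays (placeholder values, just to make the snippet runnable)
rays_o = torch.zeros(4, 3)                            # ray origins
rays_d = torch.randn(4, 3)                            # unnormalized ray directions
z_vals = torch.linspace(2.0, 6.0, 64).expand(4, 64)   # near=2, far=6 (arbitrary)

# Normalized copy fed to the network as the viewing direction
# (mirrors what run_nerf.py does around the first linked line):
viewdirs = rays_d / torch.norm(rays_d, dim=-1, keepdim=True)

# Unnormalized rays_d used to place sample points along each ray
# (mirrors what run_nerf.py does around the second linked line):
pts = rays_o[..., None, :] + rays_d[..., None, :] * z_vals[..., :, None]
print(pts.shape)  # torch.Size([4, 64, 3])
```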

Wouldn't it be better to just normalize rays_d at the beginning and remove viewdirs, or is the separation of viewdirs and the unnormalized rays_d required for some reason?

taehoon-yoon commented 2 years ago

In my opinion, you have to use the unnormalized rays_d to properly represent the view frustum. Look at the figure below.

Since the z component of rays_d is -1 and z_vals range between near and far, multiplying them places the sampled points inside the proper view frustum. In contrast, if you use the normalized viewdirs to sample points, the points will lie in a frustum whose near and far surfaces are portions of spheres with radius near and far, respectively.
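Here is a small numerical illustration of that difference, in camera space with a toy 3x3 image (all values are arbitrary; a simple pinhole camera with focal length f is assumed):

```python
import torch

H, W, f = 3, 3, 2.0
near, far = 2.0, 6.0
i, j = torch.meshgrid(torch.arange(W, dtype=torch.float32),
                      torch.arange(H, dtype=torch.float32), indexing='xy')
# camera-space directions with the z component fixed to -1, as discussed above
dirs = torch.stack([(i - W * .5) / f, -(j - H * .5) / f, -torch.ones_like(i)], -1)

z_vals = torch.tensor([near, far])

# Unnormalized directions: the sample z-coordinates are exactly -near and -far
# for every pixel, i.e. the samples fill the planar view frustum.
pts = dirs[..., None, :] * z_vals[..., :, None]
print(pts[..., 2])          # every entry is -2 or -6

# Normalized directions: every sample sits at Euclidean distance near/far from
# the camera center, i.e. on spherical shells rather than on the near/far planes.
ndirs = dirs / dirs.norm(dim=-1, keepdim=True)
pts_n = ndirs[..., None, :] * z_vals[..., :, None]
print(pts_n.norm(dim=-1))   # every entry is 2 or 6
print(pts_n[..., 2])        # varies per pixel, no longer a plane
```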

JunjieLl commented 1 year ago

What about this, then? Could someone give me some advice on the ray direction? https://github.com/yenchenlin/nerf-pytorch/issues/99#issuecomment-1502734614

JunjieLl commented 1 year ago

> In my opinion, you have to use the unnormalized rays_d to properly represent the view frustum. Look at the figure below. Since the z component of rays_d is -1 and z_vals range between near and far, multiplying them places the sampled points inside the proper view frustum. In contrast, if you use the normalized viewdirs to sample points, the points will lie in a frustum whose near and far surfaces are portions of spheres with radius near and far, respectively.

I can't agree with you: in r = o + t*d, t is the distance traveled along the ray, so the length of d must be equal to 1.

taehoon-yoon commented 1 year ago

@JunjieLl In terms of the definition, you have a point. But in the implementation they use a rays_d whose z component is -1, and regarding your objection, that is exactly why they wrote this line: https://github.com/yenchenlin/nerf-pytorch/blob/63a5a630c9abd62b0f21c08703d0ac2ea7d4b9dd/run_nerf.py#L280 If the length of d had to be 1, why would they need to multiply dists by the norm of rays_d? Since rays_d is not normalized (which is what produces the proper view frustum), they multiply by the length of rays_d to correctly compute delta_i (see equation (3) in the NeRF paper).
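In simplified form (a paraphrase of the linked lines with toy values, ignoring the padding of the last interval), the correction looks like this:

```python
import torch

# toy setup (arbitrary values, just to illustrate the delta_i computation)
rays_d = torch.tensor([[0.3, -0.2, -1.0]])           # unnormalized, z component -1
z_vals = torch.linspace(2.0, 6.0, 5).unsqueeze(0)    # samples between near and far

# Differences of consecutive z_vals alone are only the spacing along the -z axis...
dists = z_vals[..., 1:] - z_vals[..., :-1]

# ...so they are rescaled by |rays_d| to become the true Euclidean spacing
# delta_i between consecutive sample points (cf. equation (3) in the NeRF paper).
dists = dists * torch.norm(rays_d[..., None, :], dim=-1)

# sanity check: this matches the actual distance between consecutive 3D samples
pts = rays_d[..., None, :] * z_vals[..., :, None]
print(dists)
print((pts[..., 1:, :] - pts[..., :-1, :]).norm(dim=-1))  # same numbers
```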

elenacliu commented 1 year ago

@JunjieLl

I think @sillsill777 is correct.

Yes, it is true that r = o + t*d appears in the paper, but there t is the distance traveled along the ray, not the depth along the -z axis (as shown in the picture in @sillsill777's first reply). If you want to use the equation in that form (where d is a normalized direction vector and t interpolates between a start and an end distance), then t cannot simply be obtained by interpolating between the near plane and the far plane. If you still compute t by linearly interpolating between near and far, the near and far surfaces of the sampling volume are actually parts of two spheres.

If you want you can refer to nerfstudio's implementation:

https://github.com/nerfstudio-project/nerfstudio/blob/6d51af1a24d692d073a9e0cda06278b6a9d44818/nerfstudio/model_components/ray_samplers.py#L386

but the code in the repo:

https://github.com/yenchenlin/nerf-pytorch/blob/a15fd7cb363e93f933012fd1f1ad5395302f63a4/run_nerf.py#L381

actually shows us r=o+z_vals*rays_d.

and rays_d is:

https://github.com/yenchenlin/nerf-pytorch/blob/a15fd7cb363e93f933012fd1f1ad5395302f63a4/run_nerf_helpers.py#L153-L162

You can see that dirs holds the view direction vectors in camera space and rays_d is the corresponding quantity in world space. If we think of the problem in camera space, the third component of dirs is -1, which means the equation r = o + z_vals*rays_d samples points linearly along the -z axis, from the near plane to the far plane.
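A condensed sketch of what those linked lines do (simple pinhole intrinsics assumed; the helper name here is mine, not the repo's):

```python
import torch

def get_rays_sketch(H, W, focal, c2w):
    i, j = torch.meshgrid(torch.arange(W, dtype=torch.float32),
                          torch.arange(H, dtype=torch.float32), indexing='xy')
    # camera-space directions: the third component is fixed to -1
    dirs = torch.stack([(i - W * .5) / focal, -(j - H * .5) / focal,
                        -torch.ones_like(i)], -1)
    # rotate into world space (the rotation preserves each sample's camera-space depth)
    rays_d = torch.sum(dirs[..., None, :] * c2w[:3, :3], -1)
    rays_o = c2w[:3, -1].expand(rays_d.shape)
    return rays_o, rays_d

# With an identity pose, r = o + t * rays_d has camera-space z exactly -t,
# so z_vals running from near to far sweep the slab between the near and far planes.
rays_o, rays_d = get_rays_sketch(4, 4, focal=3.0, c2w=torch.eye(4)[:3, :4])
print((rays_o + 2.0 * rays_d)[..., 2])   # every entry is -2.0
```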

wztdream commented 7 months ago

The explanation is actually quite simple:

  1. r = o + t*d is definitely correct, where t is the traveled length and d is a unit direction.
  2. In the code they use r = o + t*D, where D is the unnormalized d; this equals r = o + (t*|D|)*d, where |D| is the norm of D. D is rays_d in the code. https://github.com/yenchenlin/nerf-pytorch/blob/63a5a630c9abd62b0f21c08703d0ac2ea7d4b9dd/run_nerf.py#L381
  3. When calculating the dists they first use t only (t is z_vals in the code), so they then multiply by |D| to make it correct, since t*|D| is the true traveled length; see the quick numerical check below. https://github.com/yenchenlin/nerf-pytorch/blob/63a5a630c9abd62b0f21c08703d0ac2ea7d4b9dd/run_nerf.py#L277-L280
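A quick numerical check of points 2 and 3 (toy values; the variable names are mine, not the repo's):

```python
import torch

o = torch.tensor([0.0, 0.0, 0.0])
D = torch.tensor([0.4, -0.3, -1.0])          # unnormalized rays_d, z component -1
d = D / D.norm()                              # unit direction
t = torch.linspace(2.0, 6.0, 5)               # z_vals between near and far

# r = o + t*D gives the same points as r = o + (t*|D|)*d, i.e. the code's
# parameterization is the unit-direction ray equation with true length t*|D|.
pts_code = o + t[:, None] * D
pts_unit = o + (t * D.norm())[:, None] * d
print(torch.allclose(pts_code, pts_unit))     # True

# which is why the differences of t (dists) get multiplied by |D| in the code
dists = (t[1:] - t[:-1]) * D.norm()
print(torch.allclose(dists, (pts_code[1:] - pts_code[:-1]).norm(dim=-1)))  # True
```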