Loss not tracking gradient

a-lemus96 / fs-nerf

PyTorch implementation for experimenting with frequency regularized Neural Radiance Fields.

Loss not tracking gradient #69

Closed a-lemus96 closed 8 months ago

a-lemus96 commented 8 months ago

Encountered the following error:

Traceback (most recent call last):
  File "/home/lemus/projects/fs-nerf/src/run-nerf.py", line 535, in <module>
    main()
  File "/home/lemus/projects/fs-nerf/src/run-nerf.py", line 435, in main
    train(
  File "/home/lemus/projects/fs-nerf/src/run-nerf.py", line 268, in train
    loss.backward()
  File "/home/lemus/miniconda3/envs/nerf/lib/python3.11/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/home/lemus/miniconda3/envs/nerf/lib/python3.11/site-packages/torch/autograd/__init__.py", line 200, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Originally posted by @a-lemus96 in #61
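
For context, here is a minimal sketch of how this error arises, independent of the fs-nerf code. The helper try_backward and the toy tensors are made up for illustration; the point is only that autograd refuses to backpropagate from a loss that was computed without touching any tensor that requires grad.

```python
import torch


def try_backward(loss):
    """Return None on success, or the error message if autograd refuses."""
    try:
        loss.backward()
    except RuntimeError as err:
        return str(err)
    return None


# A loss computed only from constants has no grad_fn, so backward() fails.
detached_loss = torch.zeros(3).mean()
message = try_backward(detached_loss)

# A loss that depends on a tensor requiring grad backpropagates normally.
weight = torch.ones(3, requires_grad=True)
tracked_loss = (2.0 * weight).mean()
ok = try_backward(tracked_loss)
```

Here message holds the same RuntimeError text as the traceback above, while ok is None and weight.grad is populated.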

a-lemus96 commented 8 months ago

The rgb tensor does not require grad, according to pdb:

> /home/lemus/projects/fs-nerf/src/run-nerf.py(241)train()
-> rgb_gt = rgb_gt.to(device)
(Pdb) rgb.requires_grad
False

a-lemus96 commented 8 months ago

The problem might be due to edge cases in which the sampling step returns no samples. In the first iteration, the tensors returned by the sampling method are empty, as pdb shows:

> /home/lemus/projects/fs-nerf/src/render/renderer.py(84)render_rays()
-> ray_idxs, t_starts, t_ends = self.estimator.sampling(
(Pdb) n
> /home/lemus/projects/fs-nerf/src/render/renderer.py(94)render_rays()
-> def _rgb_sigma_fn(t_starts, t_ends, ray_idxs):
(Pdb) ray_idxs.shape
torch.Size([0])
(Pdb) t_starts.shape
torch.Size([0])
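
If empty samples are indeed the cause, one possible guard is to skip the optimizer step whenever the loss is not attached to the autograd graph. This is only a sketch of a workaround, not necessarily how the issue was resolved; safe_step, param, and the SGD optimizer below are hypothetical names chosen for the example.

```python
import torch


def safe_step(loss, optimizer):
    """Backpropagate and update only if the loss is attached to the graph."""
    if not loss.requires_grad:
        # Nothing to differentiate, e.g. no samples were drawn this iteration.
        return False
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return True


# Toy setup: a single parameter vector optimized with plain SGD.
param = torch.nn.Parameter(torch.ones(2))
optimizer = torch.optim.SGD([param], lr=0.1)

skipped = safe_step(torch.zeros(3).mean(), optimizer)  # loss has no grad_fn
stepped = safe_step((param ** 2).sum(), optimizer)     # regular update
```

The first call returns False and leaves param untouched; the second performs a normal update. Skipping silently can hide a sampling problem, so logging the skipped iterations is probably worthwhile.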

a-lemus96 commented 8 months ago

Removed the try/except statement from the render_rays function.

a-lemus96 commented 8 months ago

Running python run-nerf.py --debug under the pdb module, I found that the tensors output by the model do track gradients:

> /home/lemus/projects/fs-nerf/src/run-nerf.py(238)train()
-> raw = model(rays_o)
(Pdb) n
> /home/lemus/projects/fs-nerf/src/run-nerf.py(239)train()
-> render_output = renderer.render_rays(rays_o, rays_d, model)
(Pdb) raw.requires_grad
True

a-lemus96 commented 8 months ago

During the first iteration, ray_idxs, t_starts, and t_ends all have length zero, so the nerfacc package bypasses the _rgb_sigma_fn call and instead creates two empty tensors that do not track gradients. This can be verified in the nerfacc source code:

https://github.com/nerfstudio-project/nerfacc/blob/32273f838184f4c6345b8371aaa17f2a13d62adf/nerfacc/volrend.py#L94
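
The effect of that branch can be sketched with a simplified stand-in. This is not nerfacc's actual code: query_or_empty, toy_rgb_sigma_fn, toy_render, and the one-layer model are all hypothetical, and the weights ignore transmittance. It only illustrates that when no samples exist the network is never called, so the rendered output has no grad_fn.

```python
import torch


def query_or_empty(t_starts, t_ends, ray_idxs, rgb_sigma_fn):
    """Simplified stand-in for the empty-sample branch (not nerfacc's code)."""
    if t_starts.shape[0] != 0:
        return rgb_sigma_fn(t_starts, t_ends, ray_idxs)
    # No samples: the network is never called, so these carry no grad_fn.
    rgbs = torch.empty((0, 3), device=t_starts.device)
    sigmas = torch.empty((0,), device=t_starts.device)
    return rgbs, sigmas


layer = torch.nn.Linear(1, 4)  # hypothetical toy model


def toy_rgb_sigma_fn(t_starts, t_ends, ray_idxs):
    mid = ((t_starts + t_ends) / 2).unsqueeze(-1)  # (n_samples, 1)
    out = layer(mid)                               # (n_samples, 4)
    return torch.sigmoid(out[:, :3]), torch.relu(out[:, 3])


def toy_render(t_starts, t_ends, ray_idxs, n_rays):
    rgbs, sigmas = query_or_empty(t_starts, t_ends, ray_idxs, toy_rgb_sigma_fn)
    # Crude per-sample opacity; real volume rendering also uses transmittance.
    weights = 1.0 - torch.exp(-sigmas * (t_ends - t_starts))
    pixels = torch.zeros(n_rays, 3)
    # Sum the weighted sample colors into their rays.
    return pixels.index_add(0, ray_idxs, weights.unsqueeze(-1) * rgbs)


# Case 1: no samples at all, as seen in the first iteration.
empty = torch.empty(0)
no_idx = torch.empty(0, dtype=torch.long)
rgb_empty = toy_render(empty, empty, no_idx, n_rays=4)

# Case 2: three samples spread over two rays.
t0 = torch.tensor([0.0, 0.5, 0.0])
t1 = torch.tensor([0.5, 1.0, 0.5])
idx = torch.tensor([0, 0, 1])
rgb_full = toy_render(t0, t1, idx, n_rays=4)
```

In this sketch rgb_empty.requires_grad is False while rgb_full.requires_grad is True, which matches the pdb observations in this thread: the model output tracks gradients, but the rendered rgb does not when sampling returns nothing.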