dunbar12138 / DSNeRF

Code release for DS-NeRF (Depth-supervised Neural Radiance Fields)
https://www.cs.cmu.edu/~dsnerf/
MIT License
746 stars, 126 forks

" allow_unreachable=True, accumulate_grad=True) # Calls into the C++ engine to run the backward pass RuntimeError: Function 'PowBackward0' returned nan values in its 0th output" #85

Open rockywind opened 1 year ago

rockywind commented 1 year ago

Hi, I hit the following error after training for a while (around iteration 9929 of 50000):

 20%|▏| 9929/50000 [1:55:09<10:34:44,  1.05i
/opt/conda/lib/python3.7/site-packages/torch/autograd/__init__.py:175: UserWarning: Error detected in PowBackward0. Traceback of forward call that caused the error:
  File "run_nerf_own_train.py", line 1138, in <module>
    train()
  File "run_nerf_own_train.py", line 1006, in train
    img_loss0 = img2mse(extras['rgb0'], target_s)
  File "/rockywin.wang/NeRF/DSNeRF/run_nerf_helpers.py", line 15, in <lambda>
    img2mse = lambda x, y : torch.mean((x.to(device) - y.to(device)) ** 2)
  File "/opt/conda/lib/python3.7/site-packages/torch/_tensor.py", line 32, in wrapped
    return f(*args, **kwargs)
 (Triggered internally at  ../torch/csrc/autograd/python_anomaly_mode.cpp:102.)
  allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass
 20%|▏| 9929/50000 [1:55:09<7:44:45,  1.44it
Traceback (most recent call last):
  File "run_nerf_own_train.py", line 1138, in <module>
    train()
  File "run_nerf_own_train.py", line 1010, in train
    loss.backward()
  File "/opt/conda/lib/python3.7/site-packages/torch/_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/opt/conda/lib/python3.7/site-packages/torch/autograd/__init__.py", line 175, in backward
    allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass
RuntimeError: Function 'PowBackward0' returned nan values in its 0th output.

My config is below.

expname = parking2_noc
basedir = ./logs/parking2_noc
datadir = /rockywin.wang/NeRF/DSNeRF/data/parking2
dataset_type = llff #colmap_llff
factor = 4 
llffhold = 8
N_rand = 4096
N_samples = 64
N_importance = 128
use_viewdirs = True
raw_noise_std = 1e0
no_ndc = False
colmap_depth = True
depth_loss = True
depth_lambda = 0.1
i_testset = 5000
i_video = 10000
N_iters = 50000

dunbar12138 commented 1 year ago

Your model's output seems to contain NaN. Check the related issue here.
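
One way to confirm this diagnosis is to check the tensors going into `img2mse` before calling `loss.backward()`. The snippet below is only a minimal sketch, not code from DSNeRF: `nonfinite_report` is a hypothetical helper, and the small `rgb0` tensor is a stand-in for `extras['rgb0']` with one NaN planted in it. It assumes the NaN is already present in the model output, in which case the backward of `** 2` produces the same `PowBackward0` message as in the traceback above.

```python
import torch

# Same loss as in run_nerf_helpers.py (per the traceback), minus the .to(device) calls.
img2mse = lambda x, y: torch.mean((x - y) ** 2)


def nonfinite_report(named_tensors):
    """Hypothetical helper: names of tensors that contain NaN or Inf."""
    return [name for name, t in named_tensors.items()
            if not torch.isfinite(t).all()]


# Stand-in for extras['rgb0']: the second ray has a NaN colour channel (assumption).
rgb0 = torch.tensor([[0.2, 0.4, 0.6],
                     [float("nan"), 0.1, 0.3]], requires_grad=True)
target_s = torch.zeros(2, 3)

# Check the inputs to the loss before running backward.
bad = nonfinite_report({"rgb0": rgb0, "target_s": target_s})
print("non-finite tensors:", bad)

# With anomaly detection on, backward raises as soon as a gradient contains NaN.
error_message = ""
with torch.autograd.detect_anomaly():
    try:
        img2mse(rgb0, target_s).backward()
    except RuntimeError as err:
        error_message = str(err)
print(error_message)
```

In the real training loop, the same check could be run on `extras['rgb0']` and the fine-network output right before the loss is computed. If it fires, the NaN comes from the forward pass (the rendering step) rather than from the loss itself.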