mitsuba-renderer / mitsuba2

Mitsuba 2: A Retargetable Forward and Inverse Renderer

[🐛 bug report] Differentiating with respect to point light intensity (PointLight.intensity.value) #320

Open · Spandan-Madan opened this issue 3 years ago

Spandan-Madan commented 3 years ago

Summary

I'm trying to use inverse rendering to optimize the intensity of a point light (PointLight.intensity.value). It runs for a few iterations and then breaks, seemingly at random, with this error:

render_torch(): critical exception during backward pass: set_gradient(): no gradients are associated with this variable (a prior call to requires_gradient() is required.) 

The surprising thing is that it breaks after a variable number of iterations: sometimes after the first, at other times only after 5 or even 10 iterations.

How can this be addressed?
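
For context, the optimization loop follows the pattern from the Mitsuba 2 PyTorch interop documentation. Here is a minimal sketch of that setup; the scene file name, sample count, and learning rate are placeholders, not the exact values used here:

```python
import torch
import mitsuba

# A differentiable variant must be active before importing the autodiff helpers
mitsuba.set_variant('gpu_autodiff_rgb')

from mitsuba.core.xml import load_file
from mitsuba.python.util import traverse
from mitsuba.python.autodiff import render_torch

scene = load_file('scene.xml')  # placeholder: scene containing the point light

# Render a reference image before perturbing the parameter
image_ref = render_torch(scene, spp=8)

# Expose only the light intensity as a differentiable parameter
params = traverse(scene)
params.keep(['PointLight.intensity.value'])
params_torch = params.torch()  # dict of PyTorch tensors with requires_grad=True

opt = torch.optim.Adam(params_torch.values(), lr=0.2)
objective = torch.nn.MSELoss()

for it in range(100):
    opt.zero_grad()

    # Differentiable rendering; gradients flow back into params_torch
    image = render_torch(scene, params=params, unbiased=True,
                         spp=8, **params_torch)
    ob_val = objective(image, image_ref)

    # Back-propagate errors to input parameters
    ob_val.backward()

    # Optimizer: take a gradient step
    opt.step()
```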

merlinND commented 3 years ago

Is it possible that the value of your light source goes to 0 or becomes negative? This error comes up when the parameter you're trying to optimize was not involved at all in the creation of the image (or, more precisely, in the computation of the loss function). That can happen e.g. if all rays evaluate to NaN, or if the intensity gets clamped to a fixed positive value, which disconnects it from the computation graph.
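
One way to rule out this case is to project the intensity back into a strictly positive range after each optimizer step. A sketch of such a guard, assuming the PyTorch parameter dict is named params_torch as in the interop example above (this is a hypothetical addition, not part of the Mitsuba API):

```python
# Hypothetical guard, placed right after opt.step(): keep the intensity
# strictly positive so it always contributes to the rendered image
with torch.no_grad():
    params_torch['PointLight.intensity.value'].clamp_(min=1e-3)
```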

Spandan-Madan commented 3 years ago

Hi @merlinND, thanks for your response. I just checked again: I get this error without hitting 0 or negative values.

Also, there's a pattern: on a fresh run I get it in the first iteration; then, when I re-run, it goes on for a few iterations before crashing again.

Stack trace for the error in the 0th iteration:

{'PointLight.intensity.value': tensor([[100.,  10.,  10.]], device='cuda:0', requires_grad=True)}
tensor(0.8446, device='cuda:0', grad_fn=<SelectBackward>)
render_torch(): critical exception during backward pass: set_gradient(): no gradients are associated with this variable (a prior call to requires_gradient() is required.) 

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-25-500b32e99c8a> in <module>
     20 
     21     # Back-propagate errors to input parameters
---> 22     ob_val.backward()
     23 
     24     # Optimizer: take a gradient step

~/.local/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    183                 products. Defaults to ``False``.
    184         """
--> 185         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    186 
    187     def register_hook(self, hook):

~/.local/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
    125     Variable._execution_engine.run_backward(
    126         tensors, grad_tensors, retain_graph, create_graph,
--> 127         allow_unreachable=True)  # allow_unreachable flag
    128 
    129 

RuntimeError: set_gradient(): no gradients are associated with this variable (a prior call to requires_gradient() is required.) 

Stack trace for the second time the error happens:

iter 2:
tensor(0.8978, device='cuda:0', grad_fn=<SelectBackward>)
{'PointLight.intensity.value': tensor([[100.1086,   9.8907,  10.1093]], device='cuda:0', requires_grad=True)}

iter 3:
tensor(0.8136, device='cuda:0', grad_fn=<SelectBackward>)
{'PointLight.intensity.value': tensor([[100.1205,   9.8740,  10.1363]], device='cuda:0', requires_grad=True)}

iter 4:
tensor(0.8164, device='cuda:0', grad_fn=<SelectBackward>)
render_torch(): critical exception during backward pass: set_gradient(): no gradients are associated with this variable (a prior call to requires_gradient() is required.) 
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-13-500b32e99c8a> in <module>
     20 
     21     # Back-propagate errors to input parameters
---> 22     ob_val.backward()
     23 
     24     # Optimizer: take a gradient step

~/.local/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    183                 products. Defaults to ``False``.
    184         """
--> 185         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    186 
    187     def register_hook(self, hook):

~/.local/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
    125     Variable._execution_engine.run_backward(
    126         tensors, grad_tensors, retain_graph, create_graph,
--> 127         allow_unreachable=True)  # allow_unreachable flag
    128 
    129 

RuntimeError: set_gradient(): no gradients are associated with this variable (a prior call to requires_gradient() is required.)

merlinND commented 3 years ago

Quite mysterious... Does it occur if you don't use PyTorch (e.g. by reproducing your experiment in a demo script like invert_cbox.py)? And does it occur if you don't update the value of the parameter after each iteration (i.e., no optimizer step)?
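
For reference, the non-PyTorch variant of the loop uses Mitsuba's own Adam optimizer and enoki directly, in the style of invert_cbox.py. A sketch, reusing the placeholder scene and image_ref from above; the last line is the one to disable for the no-update test:

```python
import enoki as ek
from mitsuba.python.util import traverse
from mitsuba.python.autodiff import render, Adam

params = traverse(scene)
params.keep(['PointLight.intensity.value'])

# Mitsuba's own Adam optimizer operates directly on the parameter map
opt = Adam(params, lr=0.2)

for it in range(100):
    # Differentiable rendering without any PyTorch involvement
    image = render(scene, optimizer=opt, unbiased=True, spp=1)

    # Mean squared error against the reference image
    ob_val = ek.hsum(ek.sqr(image - image_ref)) / len(image)

    # Reverse-mode propagation of the loss through the rendering process
    ek.backward(ob_val)

    # Gradient step; comment this line out to test without parameter updates
    opt.step()
```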

Spandan-Madan commented 3 years ago

Hi @merlinND, I tested a few things. The cbox.xml scene did not give this error, but it uses an area emitter instead of a point emitter. When I added an area emitter to my scene and differentiated with respect to its intensity, that worked as well.

So I guess this problem happens with point lights but not with area lights. Any leads on how to proceed with debugging?

merlinND commented 3 years ago

I would suggest also trying the other way around: if you change the cbox scene to use a point light, does invert_cbox.py still work? And then, what happens if you introduce PyTorch into that script? Basically, add changes to a known working script one at a time until it breaks, to identify the source of the bug.