shader-slang / slang-python

Superseded by github.com/shader-slang/slang-torch
MIT License

Accumulating gradient_output in the BRDF example #13

Open bprb opened 10 months ago

bprb commented 10 months ago

Hello, I was wondering whether gradient_output in the BRDF example needs a zero_() in the inner learning loop (i.e., before calling m.brdf_loss.bwd), similar to calling optimizer.zero_grad() in PyTorch?

Otherwise, wouldn't the code accumulate the gradient with each sample, while also immediately applying it?
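To illustrate what I mean, here is a minimal standalone PyTorch sketch (toy tensors only, not the actual Slang kernels) of how a manually managed gradient buffer keeps growing unless it is zeroed each iteration:

import torch

# Toy parameter and a persistent gradient buffer, standing in for half_res_brdf / gradient_output.
param = torch.ones(4)
grad_buf = torch.zeros_like(param)

for step in range(3):
    # Pretend each backward pass adds a gradient of 1.0 per element into the buffer.
    grad_buf += torch.ones_like(param)
    # Apply the (now accumulated) gradient immediately.
    param -= 0.1 * grad_buf
    print(step, grad_buf[0].item())  # prints 1.0, 2.0, 3.0 instead of a constant 1.0

# Adding grad_buf.zero_() at the top of the loop (like optimizer.zero_grad())
# would make each step apply only the current gradient.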

Apologies if this is intentional, or a zero-fill is already implied somewhere, or I misunderstood :)

Thanks! bert

PS Sorry, I don't have Jupyter set up to test a merge request.


for i in range(10000):
    L = random_hemi_vector()
    V = (0.0, 0.0, 1.0)
    input_params = (*L, *V)
    loss_output = torch.zeros((original_shape[0], original_shape[1], 1)).cuda()
    output_grad = torch.ones_like(loss_output).cuda()
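    # Forward pass: render lighting from the full-resolution BRDF to use as the reference.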
    m.brdf(input=full_res_brdf,
           output=lighting_from_full_res_brdf,
           input_params=input_params).launchRaw(blockSize=block_size, gridSize=grid_size)

    gradient_output.zero_()    # <-- proposed addition: reset the gradient before bwd()
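    # Backward pass: accumulates the loss gradient w.r.t. half_res_brdf into gradient_output.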
    m.brdf_loss.bwd(input=(half_res_brdf, gradient_output),
                    output=(loss_output, output_grad),
                    reference=lighting_from_full_res_brdf,
                    input_params=input_params).launchRaw(blockSize=block_size, gridSize=grid_size)
    # Replace NaNs and clip the gradient to a safe range.
    gradient_output = torch.nan_to_num(gradient_output, 0.0)
    gradient_output = torch.clamp(gradient_output, -1.0, 1.0)
    half_res_brdf = torch.clip(half_res_brdf - 0.001 * gradient_output, 0.0001, 1.0)

FlorentGuinier commented 6 months ago

Agreed, IMHO the gradient should be zeroed in the loop :)