mitsuba-renderer / mitsuba3

Mitsuba 3: A Retargetable Forward and Inverse Renderer
https://www.mitsuba-renderer.org/

renderer to produce binary image #1299

Closed ZheningHuang closed 2 months ago

ZheningHuang commented 2 months ago

Summary

Rendering a binary image or converting a rendered image to a binary format while preserving differentiability.

System Configuration

System information:

Description

I have been following this notebook on object pose estimation. My goal is to use a real image as the reference but with a binary mask for guidance. In PyTorch3D, this can be done using the SoftSilhouetteShader to render a binary image. However, I couldn’t find a similar function in Mitsuba/Dr.Jit, so I tried converting the rendered image (TensorXf) into a binary image manually. Unfortunately, my approach seems to break differentiability. Below is the code I used:

def to_binary_mask(input):
    # input is TensorXf  
    input_1 = input[:, :, 0]
    input_2 = input[:, :, 1]
    input_3 = input[:, :, 2]

    a = dr.select(input_1 < 0.7, 0, 1)
    b = dr.select(input_2 < 0.7, 0, 1)
    c = dr.select(input_3 < 0.7, 0, 1)

    combined = a + b + c
    return dr.select(combined < 0.3, 1, 0)

This function is used in the optimization loop:

import time
import cv2
import drjit as dr
import mitsuba as mi
from drjit.cuda.ad import Float, UInt32, TensorXf

img_ref = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)
img_ref = cv2.resize(img_ref, (width, height)) / 255
img_ref = TensorXf(img_ref)

loss_hist = []
for it in range(iteration_count):
    # Apply the mesh transformation
    apply_transformation(params, opt)

    # Perform a differentiable rendering
    img = mi.render(scene, params, seed=it, spp=spp)
    img_binary = to_binary_mask(img)

    # Evaluate the objective function
    loss = dr.sum(dr.sqr(img_binary - img_ref)) / len(img_binary)

    # Backpropagate through the rendering process
    dr.backward(loss)

    # Optimizer: take a gradient descent step
    opt.step()
    loss_hist.append(loss)
    print(f"Iteration {it:02d}: error={loss[0]:6f}, angle={opt['angle'][0]:.4f}, trans=[{opt['trans'].x[0]:.4f}, {opt['trans'].y[0]:.4f}]", end='\r')

Error

I encounter the following error when running the code:

TypeError: backward_from(): the argument does not depend on the input variable(s) being differentiated. Raising an exception since this is usually indicative of a bug (for example, you may have forgotten to call dr.enable_grad(..)). If this is expected behavior, skip the call to backward_from(..) if dr.grad_enabled(..) returns False.

I don't think this should be a difficult problem, so any advice would be appreciated.

merlinND commented 2 months ago

Hello @ZheningHuang,

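As used here, dr.select() is effectively a step function: the comparison produces a boolean mask and both branches are constants (0 and 1), so the output is piecewise constant, discontinuous at the threshold, and carries no gradient back to the rendered image. The error "the argument does not depend on the input variable(s) being differentiated" is therefore the expected result.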
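To make the point concrete, here is a small plain-Python sketch (no Dr.Jit needed) that treats the thresholding as a scalar function and estimates its slope with finite differences. The names `hard_threshold` and `finite_difference` are illustrative only and not part of any library.

```python
def hard_threshold(x, t=0.7):
    """Scalar analogue of dr.select(x < t, 0, 1): a step at t."""
    return 0.0 if x < t else 1.0


def finite_difference(f, x, eps=1e-4):
    """Central finite-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


# Sample points on both sides of the threshold, but not on it.
slopes = [finite_difference(hard_threshold, x) for x in (0.1, 0.5, 0.9)]
print(slopes)
```

Away from the threshold, small changes to the input leave the output unchanged, so the estimated slope is zero everywhere an optimizer would actually sample it. A gradient-based optimizer has nothing to follow.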

I think you will first need to think about a differentiable formulation of your problem (maybe taking inspiration from what is done in PyTorch3D), and then implement it in Mitsuba.
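One possible differentiable formulation, sketched below in plain Python, replaces each hard comparison with a sigmoid and combines the channels with a product, so the mask approaches 1 only when all three channels are below the threshold (as in the original to_binary_mask). This is an assumption-laden illustration rather than what PyTorch3D's SoftSilhouetteShader does internally; `soft_mask`, `sharpness`, and `grad_wrt_channel` are hypothetical names. The same arithmetic could presumably be applied to the rendered TensorXf using Dr.Jit operations such as `dr.exp`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_mask(rgb, threshold=0.7, sharpness=20.0):
    """Smooth stand-in for 'every channel is below the threshold'.

    Each channel contributes sigmoid(sharpness * (threshold - c)), which is
    close to 1 for dark values and close to 0 for bright ones. Larger
    `sharpness` values approach the hard step but shrink the useful gradient.
    """
    m = 1.0
    for c in rgb:
        m *= sigmoid(sharpness * (threshold - c))
    return m


def grad_wrt_channel(rgb, i, eps=1e-5):
    """Finite-difference derivative of soft_mask w.r.t. channel i."""
    hi = list(rgb)
    lo = list(rgb)
    hi[i] += eps
    lo[i] -= eps
    return (soft_mask(hi) - soft_mask(lo)) / (2.0 * eps)


dark = soft_mask((0.1, 0.1, 0.1))        # well inside the object
bright = soft_mask((1.0, 1.0, 1.0))      # background
edge_grad = grad_wrt_channel((0.65, 0.65, 0.65), 0)
print(dark, bright, edge_grad)
```

Unlike the step function, this mask has a non-zero derivative near the threshold, so the loss can depend on the rendered pixel values. The choice of `sharpness` is a trade-off between how binary the mask looks and how far from the silhouette edge the gradient remains usable.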