Rendering a differentiable binary image (mask)

Summary

I want to render a binary image, or convert a rendered image into a binary mask, while preserving differentiability.

System Configuration

System information:

OS: Ubuntu 20.04.6 LTS
CPU: 13th Gen Intel® Core™ i7-13700K
GPU: NVIDIA GeForce RTX 3090 Ti
Python: 3.9.19 | packaged by conda-forge | (main, Mar 20 2024, 12:50:21) [GCC 12.3.0]
NVIDIA Driver: 555.42.02
CUDA: 11.6.124
LLVM: 12.0.0
Dr.Jit: 0.4.6
Mitsuba: 3.5.2
- Is custom build? No
- Compiled with: GNU 10.2.1
- Variants:
  - scalar_rgb
  - scalar_spectral
  - cuda_ad_rgb
  - llvm_ad_rgb

Description

I have been following this notebook on object pose estimation. My goal is to use a real image as the reference but with a binary mask for guidance. In PyTorch3D, this can be done using the SoftSilhouetteShader to render a binary image. However, I couldn’t find a similar function in Mitsuba/Dr.Jit, so I tried converting the rendered image (TensorXf) into a binary image manually. Unfortunately, my approach seems to break differentiability. Below is the code I used:

def to_binary_mask(input):
    # input is TensorXf  
    input_1 = input[:, :, 0]
    input_2 = input[:, :, 1]
    input_3 = input[:, :, 2]

    a = dr.select(input_1 < 0.7, 0, 1)
    b = dr.select(input_2 < 0.7, 0, 1)
    c = dr.select(input_3 < 0.7, 0, 1)

    combined = a + b + c
    return dr.select(combined < 0.3, 1, 0)
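
For comparison, here is a minimal sketch of a "soft" version of the same mask, written in plain Python for a single RGB pixel so that it runs without Mitsuba. It replaces each hard comparison with a sigmoid and combines the channels with a product, so the result varies smoothly with the pixel values. The names `soft_binary_mask`, `threshold`, and `temperature` are hypothetical and not part of Mitsuba or Dr.Jit; the same arithmetic could presumably be expressed on a TensorXf with `dr.exp`, but that is an assumption I have not verified in the optimization loop.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_binary_mask(rgb, threshold=0.7, temperature=0.05):
    """Soft counterpart of to_binary_mask for one pixel (hypothetical helper).

    The hard version returns 1 only when every channel is below the
    threshold. Here each comparison `channel < threshold` becomes
    sigmoid((threshold - channel) / temperature), and the per-channel
    results are multiplied as a smooth stand-in for a logical AND.
    """
    mask = 1.0
    for channel in rgb:
        mask *= sigmoid((threshold - channel) / temperature)
    return mask


if __name__ == "__main__":
    # A dark pixel, a pixel exactly at the threshold, and a bright pixel.
    for pixel in [(0.1, 0.1, 0.1), (0.7, 0.7, 0.7), (1.0, 1.0, 1.0)]:
        print(pixel, soft_binary_mask(pixel))
```

A smaller `temperature` makes the mask closer to a hard 0/1 image, at the cost of gradients that vanish away from the threshold.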

This function is used in the optimization loop:

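import time

import cv2
import drjit as dr
import mitsuba as mi
from drjit.cuda.ad import Float, UInt32, TensorXf

# scene, params, opt, apply_transformation, width, height, spp and
# iteration_count are defined as in the pose estimation notebook.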

img_ref = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)
img_ref = cv2.resize(img_ref, (width, height)) / 255
img_ref = TensorXf(img_ref)

loss_hist = []
for it in range(iteration_count):
    # Apply the mesh transformation
    apply_transformation(params, opt)

    # Perform a differentiable rendering
    img = mi.render(scene, params, seed=it, spp=spp)
    img_binary = to_binary_mask(img)

    # Evaluate the objective function
    loss = dr.sum(dr.sqr(img_binary - img_ref)) / len(img_binary)

    # Backpropagate through the rendering process
    dr.backward(loss)

    # Optimizer: take a gradient descent step
    opt.step()
    loss_hist.append(loss)
    print(f"Iteration {it:02d}: error={loss[0]:6f}, angle={opt['angle'][0]:.4f}, trans=[{opt['trans'].x[0]:.4f}, {opt['trans'].y[0]:.4f}]", end='\r')

Error

I encounter the following error when running the code:

TypeError: backward_from(): the argument does not depend on the input variable(s) being differentiated. Raising an exception since this is usually indicative of a bug (for example, you may have forgotten to call dr.enable_grad(..)). If this is expected behavior, skip the call to backward_from(..) if dr.grad_enabled(..) returns False.
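
My current guess at the cause, sketched in plain Python rather than Dr.Jit: a hard threshold only ever outputs the constants 0 or 1, so its derivative with respect to the pixel value is zero everywhere except exactly at the threshold. If `dr.select` with constant branches behaves the same way, the loss would carry no gradient back to the scene parameters, which would match the message above. The helpers `hard_mask` and `central_difference` are hypothetical and only illustrate the idea.

```python
def hard_mask(value: float, threshold: float = 0.7) -> float:
    """Hard threshold on a scalar, mirroring dr.select(value < 0.7, 1, 0)."""
    return 1.0 if value < threshold else 0.0


def central_difference(f, x: float, eps: float = 1e-3) -> float:
    """Finite-difference estimate of df/dx at x."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


if __name__ == "__main__":
    # Away from the threshold, the estimated slope of the hard mask is zero,
    # so no gradient information about the input survives.
    for x in (0.2, 0.5, 0.9):
        print(x, hard_mask(x), central_difference(hard_mask, x))
```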

I suspect this is not a difficult problem to solve, so any advice would be appreciated.
