[🐛 bug report] Enoki crash - use of undefined value

System configuration

Same as #3

Description

Here is a simple code that uses Enoki and Mitsuba 2 to draw 1000 random pixels into a 32x32 film using Loop.

The pixel positions are random, pixel colors are based on a spectrum distribution.

I am interested in knowing the derivative of the total film color w.r.t. the spectrum distribution.

Here is an example of how the output film looks like in RGB:

Crash 1

Sample code for a Jupyter notebook:

import os
import mitsuba
mitsuba.set_variant('llvm_ad_spectral')
from mitsuba.core import Thread, ScalarTransform4f, Float, Mask, UInt32, UInt64, Vector2f, Spectrum
from mitsuba.core.xml import load_file, load_dict, load_string
from mitsuba.render import ImageBlock
import numpy as np
from mitsuba.python.util import traverse
from typing import List, Set, Tuple, Dict
import enoki as ek

############################################################
# Helper functions:

def get_class(name):
    name = name.split('.')
    value = __import__(".".join(name[:-1]))
    for item in name[1:]:
        value = getattr(value, item)
    return value

def get_module(class_):
    return get_class(class_.__module__)

############################################################
# Initializing:

ENABLE_GRADIENTS = False
seed = 42
spp = 1

sampler = load_dict({"type": "independent", "sample_count": spp, "seed": seed})
sampler.seed(seed, wavefront_size=1)
# print(help(sampler))

distribution = load_dict({
    "type": "regular",
    "lambda_min": 400.0,
    "lambda_max": 700.0,
    "values": "0.2,0.5,0.9,0.5,0.2",
})

film = load_dict({
    "type": "hdrfilm",
    "banner": False,
    "rfilter": {"type": "box"},
    "width": 32,
    "height": 32,
})
film.prepare(["XYZAW"[i] for i in range(5)])

############################################################
# Drawing:

def draw_pixel(uv: mitsuba.core.Point2f, wavelengths: Spectrum, weights: Spectrum, block: ImageBlock, active = True):
    xyz = mitsuba.core.spectrum_to_xyz(weights, wavelengths, active)
    aovs = list(xyz) + [0.0, 1.0]
    block.put(uv, aovs, active)

############################################################
# Loop:

if ENABLE_GRADIENTS:
    params = traverse(distribution)
    params.keep(["values"])
    ek.enable_grad(params["values"])

block = ImageBlock(size=film.crop_size(), channel_count=5,
                   filter=film.reconstruction_filter(), warn_negative=False,
                   warn_invalid=False, border=False)
block.clear()

active = Mask(True)
remaining_bounces = UInt32(1000)

loop = get_module(Float).Loop("TestLoop12345")
loop.put(active, remaining_bounces)
sampler.loop_register(loop)
loop.init()
while loop(active):
    wavelengths, weights = distribution.sample_spectrum(mitsuba.render.SurfaceInteraction3f(), sampler.next_1d())
    draw_pixel(32 * sampler.next_2d(), wavelengths, weights, block)
    remaining_bounces -= 1
    active &= remaining_bounces > 0

film.put(block)

if ENABLE_GRADIENTS:
    data = block.data()
    ch = block.channel_count()
    i = ek.arange(UInt32, ek.hprod(block.size()))
    values_idx = i + 2 * (i // (ch - 2)) # select XYZ from XYZAW
    values = ek.gather(Float, data, values_idx)

    values_total = ek.hsum_async(values)

    ek.backward(values_total)
    print("Gradient:", ek.grad(params["values"]))

############################################################
# Drawing the film:

import matplotlib.pyplot as plt
from mitsuba.core import Bitmap, Struct
block_rgb = np.array(film.bitmap().convert(Bitmap.PixelFormat.RGB, Struct.Type.Float32, srgb_gamma=False))
plt.imshow(block_rgb)
plt.title("Film")
plt.show()

With ENABLE_GRADIENTS = False, it works fine. Once you set ENABLE_GRADIENTS = True, it crashes:

Crash 2

Exactly the same code (without derivatives for simplicity), but now I put the rendering loop into a separate Python function:

import os
import mitsuba
mitsuba.set_variant('llvm_ad_spectral')
from mitsuba.core import Thread, ScalarTransform4f, Float, Mask, UInt32, UInt64, Vector2f, Spectrum
from mitsuba.core.xml import load_file, load_dict, load_string
from mitsuba.render import ImageBlock
import numpy as np
from mitsuba.python.util import traverse
from typing import List, Set, Tuple, Dict
import enoki as ek

############################################################
# Helper functions:

def get_class(name):
    name = name.split('.')
    value = __import__(".".join(name[:-1]))
    for item in name[1:]:
        value = getattr(value, item)
    return value

def get_module(class_):
    return get_class(class_.__module__)

############################################################
# Initializing:

seed = 42
spp = 1

sampler = load_dict({"type": "independent", "sample_count": spp, "seed": seed})
sampler.seed(seed, wavefront_size=1)
# print(help(sampler))

distribution = load_dict({
    "type": "regular",
    "lambda_min": 400.0,
    "lambda_max": 700.0,
    "values": "0.2,0.5,0.9,0.5,0.2",
})

film = load_dict({
    "type": "hdrfilm",
    "banner": False,
    "rfilter": {"type": "box"},
    "width": 32,
    "height": 32,
})
film.prepare(["XYZAW"[i] for i in range(5)])

############################################################
# Drawing:

def draw_pixel(uv: mitsuba.core.Point2f, wavelengths: Spectrum, weights: Spectrum, block: ImageBlock, active = True):
    xyz = mitsuba.core.spectrum_to_xyz(weights, wavelengths, active)
    aovs = list(xyz) + [0.0, 1.0]
    block.put(uv, aovs, active)

def render_pixels(sampler, distribution, block: ImageBlock):
    active = Mask(True)
    remaining_bounces = UInt32(1000)

    loop = get_module(Float).Loop("TestLoop12345")
    loop.put(active, remaining_bounces)
    sampler.loop_register(loop)
    loop.init()
    while loop(active):
        wavelengths, weights = distribution.sample_spectrum(mitsuba.render.SurfaceInteraction3f(), sampler.next_1d())
        draw_pixel(32 * sampler.next_2d(), wavelengths, weights, block)
        remaining_bounces -= 1
        active &= remaining_bounces > 0

############################################################
# Loop:

block = ImageBlock(size=film.crop_size(), channel_count=5,
                   filter=film.reconstruction_filter(), warn_negative=False,
                   warn_invalid=False, border=False)
block.clear()

render_pixels(sampler, distribution, block)

film.put(block)

############################################################
# Drawing the film:

import matplotlib.pyplot as plt
from mitsuba.core import Bitmap, Struct
block_rgb = np.array(film.bitmap().convert(Bitmap.PixelFormat.RGB, Struct.Type.Float32, srgb_gamma=False))
plt.imshow(block_rgb)
plt.title("Film")
plt.show()

Crashes:

Ah, I actually managed to fix Crash 2, I guess the problem is with the active argument of the function draw_pixel. This is the original part of the code that crashes:

def draw_pixel(uv: mitsuba.core.Point2f, wavelengths: Spectrum, weights: Spectrum, block: ImageBlock, active = True):
    xyz = mitsuba.core.spectrum_to_xyz(weights, wavelengths, active)
    aovs = list(xyz) + [0.0, 1.0]
    block.put(uv, aovs, active)

def render_pixels(sampler, distribution, block: ImageBlock):
    active = Mask(True)
    remaining_bounces = UInt32(1000)

    loop = get_module(Float).Loop("TestLoop12345")
    loop.put(active, remaining_bounces)
    sampler.loop_register(loop)
    loop.init()
    while loop(active):
        wavelengths, weights = distribution.sample_spectrum(mitsuba.render.SurfaceInteraction3f(), sampler.next_1d())
        draw_pixel(32 * sampler.next_2d(), wavelengths, weights, block)
        remaining_bounces -= 1
        active &= remaining_bounces > 0

Here is a version that does not crash, which passes the active variable:

def draw_pixel(uv: mitsuba.core.Point2f, wavelengths: Spectrum, weights: Spectrum, block: ImageBlock, active):
    xyz = mitsuba.core.spectrum_to_xyz(weights, wavelengths, active)
    aovs = list(xyz) + [0.0, 1.0]
    block.put(uv, aovs, active)

def render_pixels(sampler, distribution, block: ImageBlock):
    active = Mask(True)
    remaining_bounces = UInt32(1000)

    loop = get_module(Float).Loop("TestLoop12345")
    loop.put(active, remaining_bounces)
    sampler.loop_register(loop)
    loop.init()
    while loop(active):
        wavelengths, weights = distribution.sample_spectrum(mitsuba.render.SurfaceInteraction3f(), sampler.next_1d())
        draw_pixel(32 * sampler.next_2d(), wavelengths, weights, block, active)
        remaining_bounces -= 1
        active &= remaining_bounces > 0

Hi @tomasiser ,

Thanks for reporting those two crashes.

The first one is definitely related to the fact that you are not allowed to run reverse AD (a.k.a. ek.backward) on a symbolic loop. Currently, existing solutions to this limitation are the following:

wavefront mode: turn you ek.Loop into a simple python loop and make sure to evaluate the loop body variables at the end of each iteration, e.g. here
```
sampler.schedule_state()
ek.eval(active, remaining_bounces)
```
Every iteration of that loop should produce the same kernel, hence being able to rely on the kernel caching mechanism. You can easily check this but increasing the jit log level:
```
ek.set_log_level(3)
```
Wrap the ek.Loop into a custom AD operation and manually implement the adjoint function. There is a example of this in the enoki documentation (next), I would recommend you take a look. You should be able to build the documentation locally.

Regarding the second crash, what's happening is that remaining_bounces is garbage collected at the end of render_pixels, which is going to trigger a decrement in the reference counting of enoki-jit, deleting this variable completely in the jit. But it is still needed in the loop body.

When passing active (which depends on remaining_bounces) to draw_pixel, it "binds" that variable to the image block, and therefore the reference counting of enoki-jit will figure out that this variable shouldn't be deleted (will be when we delete the block).

Haven't come up with a proper solution to this issue yet.

Thank you, @Speierers! I think I did not understand the wavefront solution well enough. When I try what you suggested, Enoki probably just executes and unrolls the whole loop with the ek.any(active) and it is super slow to compile the kernel, much slower than in the old non-next Mitsuba 2.

Here is a code that does 50 "bounces" (so, writes to a random pixel 50 times), and I do 5 iterations of this while increasing the distribution in the red part of the spectrum, so the film turns more and more red (which simulates a numerical optimization).

The code works in both the old Mitsuba 2 and Mitsuba 2 next if you change the boolean at the top.

	Old Mitsuba 2	Mitsuba 2 next
50 bounces	1.1 s / iteration	2.3 s / iteration
100 bounces	1.2 s / iteration	8.6 s / iteration

As you can see:

the old Mitsuba 2 is faster in the wavefront mode compared to Mitsuba 2 next,
it is like 2 orders of magnitude slower than the symbolic loop.

MITSUBA2_NEXT = True

import os
import mitsuba
mitsuba.set_variant('cuda_ad_spectral' if MITSUBA2_NEXT else 'gpu_autodiff_spectral')
from mitsuba.core import Thread, ScalarTransform4f, Float, Mask, UInt32, UInt64, Vector2f, Spectrum
from mitsuba.core.xml import load_file, load_dict, load_string
from mitsuba.render import ImageBlock
import numpy as np
from mitsuba.python.util import traverse
from typing import List, Set, Tuple, Dict
import enoki as ek

############################################################
# Helper functions:

def get_class(name):
    name = name.split('.')
    value = __import__(".".join(name[:-1]))
    for item in name[1:]:
        value = getattr(value, item)
    return value

def get_module(class_):
    return get_class(class_.__module__)

############################################################
# Initializing:

seed = 42
spp = 1

sampler = load_dict({"type": "independent", "sample_count": spp, "seed": seed})
sampler.seed(seed, wavefront_size=1)
# print(help(sampler))

distribution = load_dict({
    "type": "regular",
    "lambda_min": 400.0,
    "lambda_max": 700.0,
    "values": "0.1,0.1,0.1,0.1,0.1",
})

film = load_dict({
    "type": "hdrfilm",
    "banner": False,
    "rfilter": {"type": "box"},
    "width": 8,
    "height": 8,
})

############################################################
# Drawing:

def draw_pixel(uv: mitsuba.core.Point2f, wavelengths: Spectrum, weights: Spectrum, block: ImageBlock, active):
    xyz = mitsuba.core.spectrum_to_xyz(weights, wavelengths, active)
    aovs = list(xyz) + [0.0, 1.0]
    block.put(uv, aovs, active)

def render_pixels(sampler, distribution, block: ImageBlock):
    active = Mask(True)
    remaining_bounces = UInt32(50)

    while ek.any(active):
        wavelengths, weights = distribution.sample_spectrum(mitsuba.render.SurfaceInteraction3f(), sampler.next_1d())
        draw_pixel(8 * sampler.next_2d(), wavelengths, weights, block, active)
        remaining_bounces -= 1
        active &= remaining_bounces > 0

        if MITSUBA2_NEXT:
            sampler.schedule_state()
            ek.eval(active, remaining_bounces, distribution, wavelengths, weights)

############################################################
# Loop:

from timeit import default_timer as timer

params = traverse(distribution)
params.keep(["values"])
if MITSUBA2_NEXT:
    ek.enable_grad(params["values"])
else:
    ek.set_requires_gradient(params["values"])

for i in range(5):

    print("Iteration", i)
    start = timer()

    film.prepare(["XYZAW"[i] for i in range(5)])

    block = ImageBlock(size=film.crop_size(), channel_count=5,
                       filter=film.reconstruction_filter(), warn_negative=False,
                       warn_invalid=False, border=False)
    block.clear()

    render_pixels(sampler, distribution, block)

    film.put(block)

    data = block.data()

    values_total = ek.hsum_async(data) if MITSUBA2_NEXT else ek.hsum(data)

    ek.backward(values_total)
    grad = ek.grad(params["values"]) if MITSUBA2_NEXT else ek.gradient(params["values"])

    print("Gradient:", grad)
    print("Total time taken:", timer() - start, "seconds")

    # Update the distribution for the next iteration:
    new_values = ek.detach(params["values"])
    new_values[3:] *= 4.0
    params["values"] = new_values
    if MITSUBA2_NEXT:
        ek.enable_grad(params["values"])
    else:
        ek.set_requires_gradient(params["values"])
    params.update()

    import matplotlib.pyplot as plt
    from mitsuba.core import Bitmap, Struct
    block_rgb = np.array(film.bitmap().convert(Bitmap.PixelFormat.RGB, Struct.Type.Float32, srgb_gamma=False))
    plt.imshow(block_rgb)
    plt.title("Film")
    plt.show()

With enoki-jit, you can set some global flags to turn ON/OFF some of the features. For instance, to enable wavefront rendering, you can turn off the symbolic loop and symbolic virtual function calls like this:

jit_set_flag(JitFlag::VCallRecord, false);
jit_set_flag(JitFlag::LoopRecord, false);

or in python

ek.set_flag(ek.JitFlag.VCallRecord, 0)
ek.set_flag(ek.JitFlag.LoopRecord, 0)

with this, there is no need to change any of the code in your integrator. The ek.Loop will be unrolled automatically when turning off those features.

It could be that in your case, you are currently using a mix of wavefront/symbolic, running slower than it should.

I would recommend you take a look at path.cpp and mitsuba.cpp (e.g. the -w and -ww options).

From our experience, the wavefront version of next is still much faster than the old mitsuba 2 codebase.

Thank you for explaining the Enoki and Mitsuba flags!

From our experience, the wavefront version of next is still much faster than the old mitsuba 2 codebase.

I believe you, but do you mean just the forward pass, or also the backward pass in wavefront mode?

I actually suspect that the problem might be related to how image writing operations are handled in light tracing, that is why I wrote this sample code that writes to the image buffer in every "bounce". I think that somehow backpropagating through the scatter operations to the image memory that are triggered in block.put() are inefficient now compared to the old codebase, and my benchmarks above should show that.

Do you think it is possible?

I think it's fair to close this issue now because I originally thought it was just one problem, but it turned out it was actually separate issues, some of which have been already fixed. The state of the Mitsuba 2 and Enoki codebases have since changed substantially, so if I ever run into similar problems again, I will create a new issue! Thanks @Speierers for having a look into this!

mitsuba-renderer / mitsuba3