mitsuba-renderer / mitsuba3

Mitsuba 3: A Retargetable Forward and Inverse Renderer
https://www.mitsuba-renderer.org/

Invalid PTX when rendering two different scenes #1349

Open dvicini opened 1 month ago

dvicini commented 1 month ago

We ran into an unexpected issue when rendering different scenes in a context where there is no 100% guarantee that they are not alive at the same time.

Specifically, the following code produces an error:

import drjit as dr
import mitsuba as mi
import numpy as np

mi.set_variant('cuda_ad_rgb')

scene = mi.load_dict({
    'type': 'scene',
    'sensor': {'type': 'perspective'},
    'integrator': {'type': 'direct'},
    'cube': {
        'type': 'cube',
        'emitter': {
            'type': 'area',
            'radiance': {
                'type': 'bitmap',
                'bitmap': mi.Bitmap(np.ones((4, 4, 3)).astype(np.float32)),
            },
        },
    },
})

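# Traverse and render the first scene (with the textured area emitter)
params = mi.traverse(scene)
image = mi.render(scene, spp=1)
dr.eval(image)

# Load and render a second, unrelated scene; `params` is still in scope here
scene = mi.load_dict(mi.cornell_box())
result = mi.render(scene, spp=1)
dr.eval(result)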

Two things happen with this code. First, I get the following warnings:

jit_eval(): more than one OptiX pipeline was used within a single kernel, which is not supported. Please split your kernel into smaller parts (e.g. using `dr::eval()`). Disabling this ray tracing operation to avoid potential undefined behavior.
jit_eval(): more than one OptiX shader binding table was used within a single kernel, which is not supported. Please split your kernel into smaller parts (e.g. using `dr::eval()`). Disabling this ray tracing operation to avoid potential undefined behavior.
jit_eval(): more than one OptiX pipeline was used within a single kernel, which is not supported. Please split your kernel into smaller parts (e.g. using `dr::eval()`). Disabling this ray tracing operation to avoid potential undefined behavior.
jit_eval(): more than one OptiX shader binding table was used within a single kernel, which is not supported. Please split your kernel into smaller parts (e.g. using `dr::eval()`). Disabling this ray tracing operation to avoid potential undefined behavior.

After that, I get invalid PTX. My guess is that this "disabling" of the ray tracing operation does not quite work as expected:

COMPILE ERROR: Invalid PTX input: ptx2llvm-module-001: error: Failed to parse input PTX string
ptx2llvm-module-001, line 1231; error   : Unknown symbol 'l_masked_35'
ptx2llvm-module-001, line 1757; error   : Unknown symbol 'l_masked_155'
Cannot parse input PTX string

The problem seems to be related to the inner Scene object that is created to evaluate the mesh parameterization of the textured area emitter. I think the problem could be that the virtual call of the textured emitter still uses the SBT & pipeline of the first scene, but is being compiled with the second scene for the second rendering. But I am not 100% sure of this yet.

In summary: 1) the textured emitter of the first scene unintentionally affects ray tracing of the second scene and produces those warnings and 2) the disabling of the ray tracing seems to have a bug itself.

I've also attached the complete PTX output that gets logged.

optix.log

dvicini commented 1 month ago

As a first step, we can at least fix the PTX that is generated to not be invalid: https://github.com/mitsuba-renderer/drjit-core/pull/105

The question remains whether there is a better solution, e.g., somehow updating the SBT that the inner scene refers to?

njroussel commented 2 weeks ago

Hi @dvicini

This has always been awkward and broken. I'm not totally sure what's the best way to fix it, but what seems evident is that it would require some user directives to disambiguate which subset of targets the JIT should be considering. I'll put this on our backlog.

FWIW, in your example, the first scene should be deleted if you delete params.
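
The lifetime point can be illustrated without a GPU. The following is only a sketch using plain Python stand-ins: `FakeScene` and `FakeParams` are hypothetical classes, not Mitsuba types, and the assumption (implied by the comment above, not verified here) is that the object returned by `mi.traverse` keeps a strong reference to the scene it was built from. Rebinding `scene` alone then does not free the first scene; dropping `params` does.

```python
import gc
import weakref


class FakeScene:
    """Hypothetical stand-in for a loaded scene (not a Mitsuba class)."""

    def __init__(self, name):
        self.name = name


class FakeParams:
    """Hypothetical stand-in for a traversal result.

    Assumption: it holds a strong reference to the scene it came from.
    """

    def __init__(self, scene):
        self._scene = scene


scene = FakeScene("first")
first_alive = weakref.ref(scene)  # lets us observe the first scene's lifetime
params = FakeParams(scene)

# Rebinding `scene` drops one reference, but `params` still holds another.
scene = FakeScene("second")
gc.collect()
alive_after_rebind = first_alive() is not None

# Dropping `params` releases the last reference to the first scene.
del params
gc.collect()
alive_after_del = first_alive() is not None

print(alive_after_rebind, alive_after_del)
```

In the reproducer above, this would correspond to adding `del params` before the second `mi.load_dict` call. The thread does not say whether other objects also need to be released.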