Closed wet-dog closed 7 months ago
With #define PYBIND11_DETAILED_ERROR_MESSAGES
the errors are:
llvm_ad_spectral
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[25], line 22
      1 scene = mi.load_dict({
      2     'type': 'scene',
      3     'integrator': {
   (...)
     19     }
     20 })
---> 22 image = mi.render(scene)
     24 mi.Bitmap(image).convert(srgb_gamma=True)
File ~/repos/mitsuba3_copy/build/python/mitsuba/python/util.py:522, in render(scene, params, sensor, integrator, seed, seed_grad, spp, spp_grad)
    518     elif seed_grad == seed:
    519         raise Exception('The primal and differential seed should be different '
    520                         'to ensure unbiased gradient computation!')
--> 522     return dr.custom(_RenderOp, scene, sensor, params, integrator,
    523                      (seed, seed_grad), (spp, spp_grad))
File ~/repos/mitsuba3_copy/build/python/drjit/router.py:5776, in custom(cls, *args, **kwargs)
   5774         output = inst.eval(*_dr.detach(kwargs['args']))
   5775     else:
-> 5776         output = inst.eval(**{ k: _dr.detach(v) for k, v in kwargs.items() })
   5777     if _dr.grad_enabled(output):
   5778         raise RuntimeError("drjit.custom(): the return value of CustomOp.eval() "
...
    382     develop=True,
    383     evaluate=False
    384 )
RuntimeError: Unable to cast Python instance of type <class 'tuple'> to C++ type 'std::__1::pair<mitsuba::BSDFSample3<drjit::DiffArray<drjit::LLVMArray<float>>, mitsuba::Spectrum<drjit::DiffArray<drjit::LLVMArray<float>>, 4ul>>, mitsuba::Spectrum<drjit::DiffArray<drjit::LLVMArray<float>>, 4ul>>'
cuda_ad_spectral
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[30], line 22
      1 scene = mi.load_dict({
      2     'type': 'scene',
      3     'integrator': {
   (...)
     19     }
     20 })
---> 22 image = mi.render(scene)
     24 mi.Bitmap(image).convert(srgb_gamma=True)
File ~/repos/mitsuba3_copy/build/python/mitsuba/python/util.py:522, in render(scene, params, sensor, integrator, seed, seed_grad, spp, spp_grad)
    518     elif seed_grad == seed:
    519         raise Exception('The primal and differential seed should be different '
    520                         'to ensure unbiased gradient computation!')
--> 522     return dr.custom(_RenderOp, scene, sensor, params, integrator,
    523                      (seed, seed_grad), (spp, spp_grad))
File ~/repos/mitsuba3_copy/build/python/drjit/router.py:5776, in custom(cls, *args, **kwargs)
   5774         output = inst.eval(*_dr.detach(kwargs['args']))
   5775     else:
-> 5776         output = inst.eval(**{ k: _dr.detach(v) for k, v in kwargs.items() })
   5777     if _dr.grad_enabled(output):
   5778         raise RuntimeError("drjit.custom(): the return value of CustomOp.eval() "
...
    382     develop=True,
    383     evaluate=False
    384 )
RuntimeError: Unable to cast Python instance of type <class 'tuple'> to C++ type 'std::__1::pair<mitsuba::BSDFSample3<drjit::DiffArray<drjit::CUDAArray<float>>, mitsuba::Spectrum<drjit::DiffArray<drjit::CUDAArray<float>>, 4ul>>, mitsuba::Spectrum<drjit::DiffArray<drjit::CUDAArray<float>>, 4ul>>'
Hi @wet-dog
This is expected. In spectral variants, the return value of the BSDF.sample() method differs slightly from what it is in RGB variants. Namely, the method should return a (mi.BSDFSample3f, mi.Spectrum) tuple. In RGB variants, mi.Spectrum is equivalent to mi.Color3f. In spectral variants, however, mi.Spectrum is typically a 4-sized array (one sampled value per wavelength). This must be fixed in the plugin's method directly; Mitsuba does not provide any kind of automatic spectral upsampling scheme.
Thanks for taking the time to explain @njroussel
I'm still confused about why scalar_rgb doesn't work with this example. Is there a reason it doesn't?
Oh sorry, I didn't see that you tried scalar_rgb.
Generally, running a Python plugin in a scalar variant is not something that you would want to do, as the rendering would become unbearably slow (see tutorial explanation). I was, however, surprised that it didn't work. As far as I can tell, it would require some work to make this functional, but I imagine that it was never a priority, as the performance issue mentioned above makes this feature somewhat useless.
Hi @njroussel, I often switch to scalar rgb for debugging. For that, it is rather helpful. Is there a way to print stuff in LLVM mode or get rid of this runtime error in scalar rgb mode? Thanks a lot!!
Hi @Vrroom,
You can use dr.printf_async() to print values in JIT variants (LLVM, CUDA), even inside of a recorded loop, without introducing a kernel boundary.
Can you give me an example? Can I, for instance, use it in the sample function of my custom BSDF (see https://github.com/Vrroom/fabricx/blob/a427eaadadab0828a4cd0e74e3f826c19bba4f40/spongecake_bsdf.py#L205)?
Here is an example, you should be able to use it anywhere in principle:
import drjit as dr
a = dr.cuda.Float32([1, 2, 3, 4])
active = a < 2.5
dr.printf_async("a[%d] = %f\n", dr.arange(dr.cuda.UInt32, dr.width(a)), a, active=active)
dr.eval(a, active)
Note that the prints will only take place once the kernel gets executed, which is why I added an explicit dr.eval() call above.
Thanks @merlinND. I tried it out. Here is the segment from my BSDF sample function.
# just an exponential random variable to perturb the importance weight and induce some realism
perturb_weight = dr.select(self.perturb_specular, -dr.log(1.0 - self.pcg.next_float32()), 1.0)
weight = (f_sponge_cake / bs.pdf) * perturb_weight
weight = dr.select(selected_dt, mi.Color3f(1.0, 1.0, 1.0), weight)
active = active & (dr.all(dr.isfinite(weight)))
weight = weight & active
dr.printf_async("a[%d] = %f\n", dr.arange(dr.llvm.UInt32, dr.width(perturb_weight)), perturb_weight, active=active)
return (bs, weight)
I get the following error:
LLVM ERROR: SmallVector unable to grow. Requested capacity (4294967296) is larger than maximum value for size type (4294967295)
Aborted (core dumped)
I thought that this was because I was shooting too many rays. I dialed it back down, rendering a 48 by 27 image with SPP = 1. I still get the above error. Does this mean that the variable perturb_weight is somehow 4294967296 long?
Thanks a lot for the quick reply! Really appreciate it.
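For reference, the two numbers in the LLVM error quoted above are 2^32 and 2^32 - 1, so the requested capacity is exactly one more than an unsigned 32-bit size type can hold. The snippet below only confirms that arithmetic. It makes no claim about the actual length of perturb_weight, which has to be checked separately.

```python
# Values copied from the LLVM error message quoted above.
requested_capacity = 4294967296
max_size = 4294967295

# Largest value representable in an unsigned 32-bit integer.
UINT32_MAX = 2**32 - 1

# How far the request overshoots the limit.
overflow = requested_capacity - max_size

print(requested_capacity == 2**32)
print(max_size == UINT32_MAX)
print(overflow)
```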
You can easily check this with print(dr.width(perturb_weight)).
This might have to do with the vcall, where during recording the code is evaluated with placeholder variables. Does it work as expected outside of a vcall?
When I print the width, it's just 1. The example you provided works independently though (if that's what you mean by outside vcall).
I see, then how about this print (that doesn't rely on the width):
dr.printf_async("a[:] = %f\n", perturb_weight, active=active)
Didn't work. Same error. Do you think this has to do with the fact that I'm calling the printf inside the sample function? I'm sorry, I'm not super familiar with the internals of mitsuba yet, so some of the details of kernels and so on are lost on me :)
Hi @merlinND. Kindly following up on this. Let me know if I can do anything to debug this further? Thanks a lot for the help so far!!
Indeed, the problem seems to be LLVM + printf_async inside of a virtual function call (such as your BSDF.sample()).
@njroussel or @rtabbara, could you please try the following reproducer in the upcoming version?
import drjit as dr
import mitsuba as mi

# The fixture argument selects the variant, as in Mitsuba's own test suite.
def test04_custom_bsdf_printf(variants_vec_backends_once_rgb):
    # Printing outside of a virtual function call works on both backends.
    value = dr.sqr(dr.full(mi.Color3f, 1.1, 19))
    dr.printf_async("Outside of vcall: value.x[:] = %f\n", value.x)
    dr.eval(value)

    # ----------

    class MyBSDF(mi.BSDF):
        def __init__(self, props):
            mi.BSDF.__init__(self, props)
            # Set the BSDF flags
            reflection_flags = mi.BSDFFlags.DeltaReflection | mi.BSDFFlags.FrontSide | mi.BSDFFlags.BackSide
            transmission_flags = mi.BSDFFlags.DeltaTransmission | mi.BSDFFlags.FrontSide | mi.BSDFFlags.BackSide
            self.m_components = [reflection_flags, transmission_flags]
            self.m_flags = reflection_flags | transmission_flags

        def sample(self, ctx, si, sample1, sample2, active):
            bs = dr.zeros(mi.BSDFSample3f, dr.width(sample1))
            value = mi.Spectrum(sample1) + 0.1
            # Printing inside of the virtual function call triggers the crash on LLVM.
            dr.printf_async("Inside of vcall: value.x[:] = %f\n", value.x, active=active)
            return (bs, value)

        def eval(self, ctx, si, wo, active):
            return 0.0

        def pdf(self, ctx, si, wo, active):
            return 0.0

        def eval_pdf(self, ctx, si, wo, active):
            return 0.0, 0.0

        def to_string(self):
            return ('MyBSDF[]')

    mi.register_bsdf("mybsdf", lambda props: MyBSDF(props))

    scene = mi.load_dict({
        'type': 'scene',
        'integrator': {
            'type': 'path'
        },
        'sphere' : {
            'type': 'sphere',
            'bsdf': {
                'type' : 'mybsdf',
            }
        },
        'sensor': {
            'type': 'perspective',
            'to_world': mi.ScalarTransform4f.look_at(origin=[0, -5, 5],
                                                     target=[0, 0, 0],
                                                     up=[0, 0, 1]),
            'film': {
                'type': 'hdrfilm',
                'width': 16,
                'height': 4,
            }
        }
    })

    image = mi.render(scene)
    dr.eval(image)
On my machine with mitsuba@3013adb4, it works on the CUDA backend but crashes with this error on LLVM:
LLVM ERROR: SmallVector unable to grow. Requested capacity (4294967296) is larger than maximum value for size type (4294967295)
Thanks @merlinND for this detailed study. This is exactly the issue I face! As an aside, I would love to use CUDA myself but can't get Optix to install correctly.
I can reproduce this issue with printf_async inside of vcalls with the LLVM backend. I'm rather surprised: I've never seen this before and have definitely used it like this. I wonder if this is a recent regression or even tied to a newer LLVM version.
I've tried it on the upcoming nanobind-based version, and it worked just fine. We won't have time to look into the issue on the current codebase, so you'll have to wait until the next version is out (soon). Until then, you could also use print() in LLVM mode, at the cost of introducing kernel boundaries with a wavefront execution. Wavefront execution can be turned on with dr.set_flag(dr.JitFlag.VCallRecord, False); dr.set_flag(dr.JitFlag.LoopRecord, False).
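As a rough sketch of that workaround, the two flags can be switched off only for the duration of a debugging session and restored afterwards. The context manager name wavefront_mode is hypothetical, the drjit module is passed in as an argument for illustration, and the sketch assumes that dr.flag() is available to read the current value of a flag alongside the dr.set_flag() call mentioned above.

```python
from contextlib import contextmanager


@contextmanager
def wavefront_mode(dr):
    """Temporarily disable vcall and loop recording (hypothetical helper).

    `dr` is the imported `drjit` module. With both flags off, plain
    print() calls inside a plugin see evaluated values, at the cost of
    extra kernel boundaries.
    """
    flags = (dr.JitFlag.VCallRecord, dr.JitFlag.LoopRecord)
    # Assumption: dr.flag() returns the current boolean value of a flag.
    previous = [dr.flag(f) for f in flags]
    for f in flags:
        dr.set_flag(f, False)
    try:
        yield
    finally:
        # Restore whatever was set before entering the block.
        for f, value in zip(flags, previous):
            dr.set_flag(f, value)
```

With the real module, this would be used as `with wavefront_mode(dr): image = mi.render(scene)`, with ordinary print() calls placed inside the BSDF's sample() method.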
Summary
Running the custom Python plugin notebook from https://mitsuba.readthedocs.io/en/latest/src/others/custom_plugin.html with certain variants (scalar_rgb, scalar_spectral, cuda_ad_spectral, and llvm_ad_spectral) results in RuntimeErrors. The issue appears for scalar_rgb and scalar_spectral in both a non-custom build and a custom build. The issue appears for cuda_ad_spectral and llvm_ad_spectral in a custom build. Both cuda_ad_rgb and llvm_ad_rgb work fine. Taking the code in the notebook and putting into a .py file results in the same issues.
Is there something about the custom Python plugin code that I'm missing that makes it incompatible with these variants?
System configuration
For the version of the issue with mitsuba installed with pip
System information:
OS: Ubuntu 22.04.4 LTS
CPU: AMD Ryzen 7 5800X 8-Core Processor
GPU: NVIDIA GeForce GTX 1080
Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
NVidia driver: 525.147.05
LLVM: 18.1.0
Dr.Jit: 0.4.4
Mitsuba: 3.5.0
Is custom build? False
Compiled with: GNU 10.2.1
Variants: scalar_rgb scalar_spectral cuda_ad_rgb llvm_ad_rgb
For the version of the issue with mitsuba compiled
System information:
OS: Ubuntu 22.04.4 LTS
CPU: AMD Ryzen 7 5800X 8-Core Processor
GPU: NVIDIA GeForce GTX 1080
Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
NVidia driver: 525.147.05
LLVM: 18.1.0
Dr.Jit: 0.4.4
Mitsuba: 3.5.0
Is custom build? True
Compiled with: Clang 18.1.0
Variants: scalar_rgb cuda_ad_rgb cuda_ad_spectral llvm_ad_rgb llvm_ad_spectral
Description
scalar_rgb and scalar_spectral error
cuda_ad_spectral and llvm_ad_spectral error
Steps to reproduce