Open qazwsxal opened 1 year ago
Hi Adam,
I ran these commands with both the Mitsuba 3.3.0 pre-built binary and the latest build from source. In both cases, however, I wasn't able to reproduce the issue you were seeing with the CUDA variant. My NVIDIA driver version is also 525.125.06.
One data point that might be useful is whether you still get these artifacts when running the llvm variant.
Another thing I've noticed is that the screenshot you provided seems to have a higher spp count than what I get when I simply run `mitsuba -m variant out.xml`. Just to make sure: were any additional options supplied to generate the image, or does the scene perhaps differ slightly from what was provided?
Hi Rami,
Thanks for getting back to me, and apologies for attaching a screenshot from another run with more samples; the problem appears no matter the number of samples specified. I've run mitsuba again with the three RGB variants, and the issue is only present in the CUDA version. mitsuba_rendering_errors.zip
I've attached a zip with the generated EXR images, including a version built with optimisations disabled (`-O0`). Unfortunately, the issue still occurs with optimisations disabled.
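To pin down where the two renders disagree, a pixel-wise diff of the EXR buffers can help. This is a minimal sketch using tiny synthetic arrays as stand-ins for the actual `scalar_rgb` and `cuda_ad_rgb` outputs (loading the real EXR files would additionally require a reader such as `imageio` or `OpenEXR`, which is an assumption here):

```python
import numpy as np

def diff_mask(ref, test, tol=1e-3):
    """Boolean mask of pixels where two renders disagree beyond tol."""
    return np.abs(ref.astype(np.float64) - test.astype(np.float64)).max(axis=-1) > tol

# Tiny synthetic stand-ins for the two rendered RGB buffers.
ref  = np.full((4, 4, 3), 0.5)
test = ref.copy()
test[2:, :, :] = 0.0          # fake "half-black" region in the bottom rows
mask = diff_mask(ref, test)
print(mask.sum(), "of", mask.size, "pixels differ")  # 8 of 16 pixels differ
```

Counting and locating the differing pixels (e.g. whether they cluster in one half of each sphere) could help narrow down whether this is a shading or a geometry/intersection problem.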
I was also able to run the same scene on an NVIDIA A100 with the same driver version, where the rendering error did not occur:
System information:
OS: Ubuntu 22.04.2 LTS
CPU: Intel(R) Xeon(R) CPU @ 2.20GHz
GPU: NVIDIA A100-SXM4-40GB
Python: 3.9.17 (main, Jul 5 2023, 20:41:20) [GCC 11.2.0]
NVidia driver: 525.125.06
LLVM: -1.-1.-1
Dr.Jit: 0.4.2
Mitsuba: 3.3.0
Is custom build? False
Compiled with: GNU 10.2.1
Variants: scalar_rgb scalar_spectral cuda_ad_rgb llvm_ad_rgb
Hi Adam,
Another recommendation, if possible, is to change the driver version used on the RTX 3050 to see whether that makes a difference.
There is a possibility that there's an underlying misuse of the OptiX API somewhere that only manifests itself on some GPUs; however, without the ability to reproduce the issue, it's difficult to track down.
I am facing a similar issue when running the tutorial volume_optimization.ipynb with the `cuda_ad_rgb` variant. I am, however, able to reproduce the correct results with the `llvm_ad_rgb` variant originally used in the notebook.
When using the NVIDIA GTX TITAN Xp GPU, I observe a half-black voxel artifact as shown below. These are from the "Intermediate results" section in the notebook.
When using the NVIDIA RTX 3050 Ti (Mobile) GPU, the output is as expected, without any artifacts.
Note: Both systems use mitsuba-3.5.0 and drjit-0.4.4 as provided on PyPI.
OS: Ubuntu 20.04.6 LTS x86_64
Kernel: 5.15.0-83-generic
CPU: Intel i7-8086K (12) @ 5.000GHz
GPU: NVIDIA TITAN Xp
GPU: NVIDIA GeForce GTX TITAN X
GPU Driver: NVIDIA 535.104.05
CUDA Version: 11.7

OS: Arch Linux x86_64
Host: Victus by HP Laptop 16-e0xxx
Kernel: Linux 6.6.6-arch1-1
CPU: AMD Ryzen 7 5800H with Radeon Graphics (16) @ 4.4GHz
GPU: NVIDIA GeForce RTX 3050 Ti Mobile
GPU: AMD ATI Radeon Vega Series / Radeon Vega Mobile Series
GPU Driver: NVIDIA 545.29.06
CUDA Version: 12.1
UPDATE: Apparently something is wrong with the TITAN Xp GPU. When I wrote a custom dr.wrap_ad function for volume optimization, I consistently got NaNs as the first few elements of the grad tensor. This problem disappears when using any RTX GPU.
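A quick way to characterise the symptom described above is to locate the NaN entries in the gradient tensor. This sketch uses a synthetic numpy array in place of the real Dr.Jit gradient (in Dr.Jit itself one would inspect the tensor returned by `dr.grad`; the helper below and its inputs are illustrative, not the notebook's code):

```python
import numpy as np

def first_nan_indices(grad, limit=5):
    """Indices of the first few NaN entries in a flattened gradient tensor."""
    flat = np.asarray(grad, dtype=np.float64).ravel()
    return np.flatnonzero(np.isnan(flat))[:limit]

# Synthetic gradient with NaNs at the front, mimicking the TITAN Xp symptom.
g = np.arange(8, dtype=np.float64)
g[:3] = np.nan
print(first_nan_indices(g))  # [0 1 2]
```

If the NaNs always appear at the same leading indices on the TITAN Xp but never on RTX cards, that pattern itself would be useful information for the maintainers.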
Summary
A high number of spheres (~800) results in black half-shading when using the `cuda_ad_rgb` renderer.

System configuration
System information:
OS: Ubuntu 22.04.2 LTS
CPU: 12th Gen Intel(R) Core(TM) i7-12700H
GPU: NVIDIA GeForce RTX 3050 Ti Laptop GPU
Python: 3.9.17 (main, Jul 5 2023, 20:41:20) [GCC 11.2.0]
NVidia driver: 525.125.06
LLVM: -1.-1.-1
Dr.Jit: 0.4.2
Mitsuba: 3.3.0
Is custom build? False
Compiled with: GNU 10.2.1
Variants: scalar_rgb scalar_spectral cuda_ad_rgb llvm_ad_rgb
Description
The `cuda_ad_rgb` renderer produces incorrect results when shading large numbers of diffuse spheres with the "constant" light source.

Steps to reproduce
`mitsuba -m scalar_rgb out.xml` produces an EXR file with correct shading.
`mitsuba -m cuda_ad_rgb out.xml` produces an EXR file with incorrect shading.
Zip containing the xml in question: out.zip
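For anyone without the attached out.zip, a stress-test scene of this shape can be generated programmatically. This is a hypothetical sketch, not the attached scene: it emits Mitsuba 3 scene XML with many diffuse spheres and a constant emitter, using element names from Mitsuba's XML schema (the sphere positions, radius, and count are made-up placeholders):

```python
import random

def make_scene(n_spheres=800, seed=0):
    """Build a Mitsuba 3 scene XML string with n_spheres diffuse spheres
    lit by a constant emitter (a synthetic stand-in for the attached scene)."""
    rng = random.Random(seed)
    shapes = []
    for _ in range(n_spheres):
        x, y, z = (rng.uniform(-10, 10) for _ in range(3))
        shapes.append(
            f'  <shape type="sphere">\n'
            f'    <point name="center" x="{x:.3f}" y="{y:.3f}" z="{z:.3f}"/>\n'
            f'    <float name="radius" value="0.2"/>\n'
            f'    <bsdf type="diffuse"/>\n'
            f'  </shape>'
        )
    return (
        '<scene version="3.0.0">\n'
        '  <emitter type="constant"/>\n'
        + "\n".join(shapes)
        + "\n</scene>"
    )

xml = make_scene()
print(xml.count("<shape"))  # 800
```

Rendering the generated file with `scalar_rgb` and `cuda_ad_rgb` and comparing the outputs would show whether the artifact depends on sphere count rather than on this particular scene file.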