mitsuba-renderer / mitsuba3

Mitsuba 3: A Retargetable Forward and Inverse Renderer
https://www.mitsuba-renderer.org/

CUDA_ERROR_ILLEGAL_ADDRESS error in running Shape Optimization tutorial with `constant` environment emitter. #1139

Closed by stymbhrdwj 5 months ago

stymbhrdwj commented 7 months ago

Description

I am running the Shape Optimization tutorial with the scene lighting changed to a `constant` environment emitter, using the `cuda_ad_rgb` variant. After 50 iterations of the Large Steps optimization, I encounter the following error:

Critical Dr.Jit compiler failure: cuda_check(): API error 0700 (CUDA_ERROR_ILLEGAL_ADDRESS): "an illegal memory access was encountered" in /project/ext/drjit-core/src/malloc.cpp:340.
Aborted (core dumped)

Restricting the optimization to 50 iterations works, but running even a single iteration after remeshing produces the same error. On a side note, GPU utilization is extremely low during the optimization: nvidia-smi reports about 2% volatile GPU utilization, and each iteration takes about 20 seconds, which seems very slow.

Without any modifications to the lighting in the scene, the code runs as expected with high GPU utilization and fast runtime, each iteration taking about 0.7 seconds before remeshing, and 2.5 seconds after.
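
Per-iteration timings like the ones quoted above can be collected with a small wall-clock helper. This is only a minimal sketch: `time_iterations`, `step` and `dummy_step` are hypothetical names, not part of Mitsuba or the tutorial. It assumes `step` runs one full optimization iteration and blocks until the GPU work has finished (for example by reading the loss value back to the CPU), since Dr.Jit may otherwise queue work asynchronously and the measured times would be too short.

```python
import time


def time_iterations(step, n_iterations):
    """Call step(i) n_iterations times and return the per-iteration wall-clock seconds."""
    durations = []
    for i in range(n_iterations):
        start = time.perf_counter()
        step(i)  # assumed to block until the iteration has really finished
        durations.append(time.perf_counter() - start)
    return durations


def dummy_step(i):
    """Stand-in workload; replace with one iteration of the optimization loop."""
    sum(range(10_000))


timings = time_iterations(dummy_step, 5)
print(f"mean: {sum(timings) / len(timings):.6f} s per iteration")
```

Running the helper once with the original emitter and once with the `constant` emitter gives directly comparable numbers.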

System configuration

System information:

OS: Ubuntu 20.04.6 LTS
CPU: Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz
GPU: NVIDIA GeForce RTX 4090
Python: 3.11.7 (main, Dec 15 2023, 18:12:31) [GCC 11.2.0]
NVidia driver: 530.30.02
LLVM: 12.0.0

Dr.Jit: 0.4.4
Mitsuba: 3.5.0
Is custom build? False
Compiled with: GNU 10.2.1
Variants: scalar_rgb, scalar_spectral, cuda_ad_rgb, llvm_ad_rgb

I am using a miniconda3 environment in which I installed Mitsuba and Dr.Jit using pip.

EDIT: I can reproduce this issue on my laptop with the following configuration.

System information:

OS: Arch Linux
CPU: AMD Ryzen 7 5800H with Radeon Graphics
GPU: NVIDIA GeForce RTX 3050 Ti Laptop GPU
Python: 3.11.5 (main, Sep 11 2023, 13:54:46) [GCC 11.2.0]
NVidia driver: 550.67
LLVM: 17.0.6

Dr.Jit: 0.4.4
Mitsuba: 3.5.0
Is custom build? False
Compiled with: GNU 10.2.1
Variants: scalar_rgb, scalar_spectral, cuda_ad_rgb, llvm_ad_rgb

Steps to reproduce

  1. Modify the emitter in scene_dict to
    'emitter': { 'type': 'constant' }
  2. Export the notebook to a Python script and run it from the terminal to see the error
  3. Optionally, add tqdm to monitor the runtime of each iteration before and after making the change in scene_dict.
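
The emitter swap in step 1 can be sketched as follows. This is an illustration under assumptions: `with_constant_emitter` is a hypothetical helper, and the `original` dictionary is an abbreviated placeholder rather than the tutorial's full `scene_dict` (its envmap filename is made up).

```python
def with_constant_emitter(scene_dict):
    """Return a shallow copy of scene_dict whose 'emitter' entry is a constant emitter."""
    new_dict = dict(scene_dict)  # shallow copy: only the top-level key is replaced
    new_dict['emitter'] = {'type': 'constant'}
    return new_dict


# Abbreviated placeholder for the tutorial's scene_dict (assumption, not the real scene).
original = {
    'type': 'scene',
    'emitter': {'type': 'envmap', 'filename': 'envmap.exr'},
}

modified = with_constant_emitter(original)
print(modified['emitter'])
# The modified dictionary would then be passed to mi.load_dict(modified) as in the tutorial.
```

Keeping the original dictionary untouched makes it easy to time both lighting setups in the same script.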

njroussel commented 5 months ago

Hi @stymbhrdwj

Thanks for the detailed issue. I've fixed the significant performance regression you noticed in this commit: https://github.com/mitsuba-renderer/mitsuba3/commit/deebe4c64586c129bb0b0280bbaf376e2315991c

I believe the illegal memory access has already been fixed, but the fix is not yet available in the pre-built binaries, so you need to compile the project yourself.

stymbhrdwj commented 5 months ago

Thanks for the fix. I had managed to side-step this issue by creating a constant.exr environment map that is all ones. It gave the same results as constant illumination but with improved performance.
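
A workaround along these lines might look like the sketch below. It is not the exact code used in this thread: the helper names, the 16x8 resolution and the choice of the `scalar_rgb` variant for writing the file are all assumptions. The Mitsuba and NumPy imports are deferred into the writer function so the rest of the snippet runs without them.

```python
def envmap_emitter(filename):
    """Build the emitter entry that points an 'envmap' emitter at the given file."""
    return {'type': 'envmap', 'filename': filename}


def write_constant_envmap(filename='constant.exr', width=16, height=8, value=1.0):
    """Write an all-`value` RGB image and return the matching emitter entry.

    Resolution and default filename are assumptions; requires Mitsuba and NumPy.
    """
    import numpy as np
    import mitsuba as mi

    if mi.variant() is None:
        mi.set_variant('scalar_rgb')  # any variant works for writing a bitmap

    pixels = np.full((height, width, 3), value, dtype=np.float32)
    mi.Bitmap(pixels).write(filename)  # file format is inferred from the extension
    return envmap_emitter(filename)


# Without calling the writer, this only shows the emitter entry for scene_dict.
print(envmap_emitter('constant.exr'))
```

With Mitsuba installed, `scene_dict['emitter'] = write_constant_envmap()` would create the file and switch the scene to it in one step.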