Closed: nkyriazis closed this issue 4 years ago.
I get the exact same set of errors too! I installed the mitsuba2 version from https://github.com/loubetg/mitsuba2.
My system is Ubuntu 16.04 with a Titan V GPU and OptiX 6.5.
Hi,
Could you try decreasing the number of samples per pixel when rendering with the pathreparam integrator?
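For example, if the script renders through the mitsuba.python.autodiff.render helper, lowering its spp argument would be enough (a hedged example, not taken from optim_pose.py itself; the value is only illustrative):

    from mitsuba.python.autodiff import render

    # Render with fewer samples per pixel to reduce GPU memory use
    # (assumes scene and opt are set up as in the optimization example).
    image = render(scene, optimizer=opt, unbiased=True, spp=4)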
Thanks for the input, @Speierers. I tried reducing spp from 16 to 4 in optim_pose.py, and it seems to run without any issue. However, it still throws the cudaErrorMemoryAllocation error if I increase spp even to 8.
Is this a known issue with the pathreparam integrator? And to avoid working with very low spp (and the resulting noisy renderings and gradients), should we use something like the samples_per_pass construct?
I have a scene where I want to optimize object vertices as in optim_pose.py. No matter how small I make the sample_count value, I keep running out of memory. I either get an error that says:
RuntimeError: cuda_malloc(): out of memory!
or my Python kernel dies. I need a relatively high spp to get a clean rendering of my scene. Is there any workaround that would let me use the pathreparam integrator on this scene? I am including a simple working example of my code (adapted from optim_pose.py) here: example_pathreparam.zip.
Also, is there any way to set which GPU device Mitsuba should run on? I have multiple GPU cards, some with more memory than others, so I want to use the one with the most memory for Mitsuba.
The current version of the pathreparam integrator unfortunately uses a lot of GPU memory. We are working hard on improving this.
Did you consider using the re-attach trick mentioned in the Mitsuba 2 paper? You could render a high-spp image without gradients and attach the gradients computed at a lower spp. This might help a bit in your case.
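Roughly, the idea looks like this (an untested sketch, not code from the thread; it assumes the scene and opt objects and render helper from optim_pose.py and enoki's ek.detach, and the spp values are arbitrary):

    import enoki as ek
    from mitsuba.python.autodiff import render

    # High-spp rendering: clean image, explicitly detached from autodiff.
    image_clean = ek.detach(render(scene, spp=64))

    # Low-spp rendering with gradients enabled (cheap transcript).
    image_grad = render(scene, optimizer=opt, spp=4)

    # "Re-attach": the forward value comes from the clean image,
    # while gradients flow through the low-spp rendering.
    image = image_clean + (image_grad - ek.detach(image_grad))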
The next version of Enoki will be much more flexible with regard to choosing which GPU device Mitsuba should run on. IIRC the current version doesn't provide such a feature.
I am going to close this as it isn't really an issue, but rather a fact: pathreparam uses a lot of GPU memory.
Thanks for your input about the memory issues. I was not able to find the re-attach trick in the Mitsuba 2 paper. Could you please point me to the section/page of the paper where I can find it?
Also, if my understanding is correct, using samples_per_pass in the gpu mode helps reduce the memory requirements of the forward pass. Does this trick also help with the memory issue for gradient computation?
The re-attach trick is explained in the supplemental material.
Indeed, samples_per_pass should help reduce the memory requirements in the non-differentiable modes. With autodiff, a transcript of the computation is recorded to enable backpropagation. This transcript uses a lot of GPU memory, which is part of your problem. When using samples_per_pass, the transcript is "accumulated" across the different passes, so I doubt it will really help here.
However, on the Python side, you can render multiple passes with a lower spp each and accumulate the gradients directly. This way the transcript isn't "shared" between the different passes.
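Something along these lines (an untested sketch; it assumes the scene, opt, image_ref and iteration count from optim_pose.py, the mitsuba.python.autodiff.render helper, and that Enoki accumulates gradients across ek.backward() calls; the pass and spp counts are only illustrative):

    import enoki as ek
    from mitsuba.python.autodiff import render

    n_passes = 4       # e.g. 4 passes of 4 spp instead of a single 16-spp pass
    spp_per_pass = 4

    for it in range(num_iterations):
        for _ in range(n_passes):
            # Each pass records (and then frees) only a small autodiff transcript.
            image = render(scene, optimizer=opt, unbiased=True, spp=spp_per_pass)
            loss = ek.hsum(ek.sqr(image - image_ref)) / (len(image) * n_passes)
            ek.backward(loss)   # gradients from each pass add up in the parameters
        opt.step()              # one parameter update from the accumulated gradients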
@Speierers - thanks a lot for your explanation. All this makes a lot of sense and will definitely be useful for me.
Thank you for sharing the examples.
I followed the instructions to build mitsuba2 with the gpu_autodiff_rgb backend. My system is Ubuntu 20.04 with a Titan V GPU.
I'm attaching the results, which show that only a few of the examples run successfully. The rest fail either due to insufficient memory or to a "variable" error.
Results.