mitsuba-renderer / drjit

Dr.Jit — A Just-In-Time-Compiler for Differentiable Rendering
BSD 3-Clause "New" or "Revised" License

Critical Dr.Jit compiler failure #121

Closed: n-kubiak closed this issue 1 year ago

n-kubiak commented 1 year ago

Hello,

I'm training a PyTorch-Mitsuba model using the @wrap_ad decorator: in each iteration I render a scene and feed the result into my network.

Unfortunately, my code randomly crashes with: Critical Dr.Jit compiler failure: jit_var(r50480700): unknown variable! Aborted (core dumped). Sometimes this happens after a number of iterations; sometimes the model gets through a few full epochs before it crashes. Since the same samples raised no errors in earlier epochs, I don't see why any of them would be problematic. Because the model can train for a while before failing, it's also hard to tell what triggers the crash or to put together a reproduction.

Could you let me know what this error means / what the common causes are?

Many thanks, NK

PS: I'm using the latest versions of Mitsuba and Dr.Jit.

njroussel commented 1 year ago

Hi @n-kubiak

Are you using the pre-built binaries installed through pip/PyPI?

If not, I believe we have identified this bug and already have a fix (see this comment). It would require building drjit locally until we publish a new patch release; we should have one soon.

n-kubiak commented 1 year ago

Yeah, I installed mitsuba/drjit via pip install

njroussel commented 1 year ago

A new release (v0.4.1) is available with the aforementioned patch. A new Mitsuba version tied to this Dr.Jit release will be available soon too (within a few hours).

I'll close this issue for now. Please update this thread if this does not fix your problem, and I'll reopen the issue.
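To confirm that an environment actually picked up the patched release, a quick version check can help. The following is a minimal sketch using only the Python standard library; the helper names (`parse_version`, `is_at_least`, `installed_version`) are made up for illustration and are not part of Dr.Jit or Mitsuba. It assumes the PyPI package names `drjit` and `mitsuba` and uses v0.4.1 as the minimum Dr.Jit version mentioned above.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.4.1' into (0, 4, 1); stops at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def is_at_least(installed, minimum):
    """Return True if the installed version is >= the minimum version."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '0.4' compares like '0.4.0'.
    n = max(len(a), len(b))
    a += (0,) * (n - len(a))
    b += (0,) * (n - len(b))
    return a >= b


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report what is installed in the current environment.
for name in ("drjit", "mitsuba"):
    print(name, "->", installed_version(name))

drjit_version = installed_version("drjit")
if drjit_version is not None and not is_at_least(drjit_version, "0.4.1"):
    print("drjit is older than 0.4.1; consider upgrading via pip")
```

Running this inside the same conda or virtual environment used for training shows whether an older copy of the package is still being picked up.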

h-OUS-e commented 1 year ago

I just wanted to mention that this started happening to me as well when using the ptracer. After around 70 iterations it crashed with the same error. I tried installing the new versions of mitsuba and drjit with pip in a new conda environment on Windows, but I still got the problem. Is there any workaround for this? The caustic optimization tutorial seems to run just fine, and my optimization setup is very similar, so I'm not sure why mine crashes. My setup does, though, become slower and slower with each iteration. Could that be a reason?

njroussel commented 1 year ago

@h-OUS-e could you open a new issue on Mitsuba 3? The optimization loop shouldn't become slower and slower; I can at least help you debug that. The crash is a bit more concerning, but hopefully we'll figure that out at the same time :smile:
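When reporting a loop that slows down over time, per-iteration timings make the problem easier to pin down. Below is a minimal, library-agnostic sketch using only the standard library; `time_iterations`, `slowdown_ratio`, and `fake_step` are hypothetical names for illustration, not Mitsuba or Dr.Jit functions. It assumes that the `step` callable blocks until the iteration's work has really finished (for example, by reading the loss value back to the CPU); otherwise the timings may only cover the part of the work that runs eagerly.

```python
import time


def time_iterations(step, n_iters):
    """Call step(i) for i in range(n_iters) and return each call's duration in seconds."""
    durations = []
    for i in range(n_iters):
        start = time.perf_counter()
        step(i)
        durations.append(time.perf_counter() - start)
    return durations


def slowdown_ratio(durations, window=5):
    """Mean of the last `window` durations divided by the mean of the first `window`."""
    if len(durations) < 2 * window:
        raise ValueError("need at least 2 * window measurements")
    first = sum(durations[:window]) / window
    last = sum(durations[-window:]) / window
    if first == 0:
        return float("inf")
    return last / first


def fake_step(i):
    """Stand-in for one optimization iteration; it does more work as i grows."""
    total = 0
    for k in range(1000 * (i + 1)):
        total += k
    return total


durations = time_iterations(fake_step, 20)
print("slowdown ratio:", slowdown_ratio(durations))
```

A ratio well above 1 suggests that later iterations do more work than early ones, which is useful to include in a new issue together with the iteration count at which the crash occurs.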