Closed sagesimhon closed 1 year ago
Hi @sagesimhon
This is an assertion in our thread/worker pool job submission system. It's typically indicative of a race condition in higher-level code. Are you using a vanilla build of Mitsuba/Dr.Jit? These types of issues are hard to track down and fix without a consistent reproducer.
I also ran into this a few months ago, but I don't recall what caused it or how I got rid of the problem (hopefully?), and I don't have a reproducer either.
Yes, it's quite random; it seems to happen more frequently on machines with more CPUs. I am using a vanilla build, installed via conda. Any suggestions on where to start? Is there a way to get more debugging info on the root cause, or at least print the stack trace after the assertion fails? I do eventually see a Python core dump message, but I have no idea where to find the dump or how to use it.
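On the stack trace question: one generic option, not specific to Mitsuba or Dr.Jit, is Python's standard-library `faulthandler` module, which prints the Python traceback of every thread when the process receives a fatal signal such as SIGABRT. The sketch below assumes the failed assertion ends in an abort, which is not confirmed anywhere in this thread; the function name `enable_crash_tracebacks` and the log file name are hypothetical. It only shows the Python side of the stack, not the C++ frames inside nanothread.

```python
import faulthandler
import sys


def enable_crash_tracebacks(log_path=None):
    """Dump the Python traceback of all threads on a fatal signal.

    `log_path` is a hypothetical parameter for this sketch: when given,
    tracebacks are written to that file, otherwise to stderr.
    """
    # faulthandler writes through a raw file descriptor, so the stream
    # must stay open for as long as the handler is enabled.
    stream = open(log_path, "w") if log_path else sys.stderr
    faulthandler.enable(file=stream, all_threads=True)
    return stream


# Call once at the very top of the rendering script, before any rendering starts.
crash_log = enable_crash_tracebacks("mitsuba_crash_traceback.log")
```

The same handler can be enabled without code changes by running `python -X faulthandler script.py`. For the native stack, the core file would have to be opened in a debugger such as gdb; on Linux, whether and where a core file is written depends on `ulimit -c` and `/proc/sys/kernel/core_pattern`.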
Same here. I'm using the pip package of Mitsuba. During an animation it usually takes 500-1000 frames to crash. My machine is a Threadripper with 64 cores. This happens only with Mitsuba 3.3.0 but not with 3.2.1.
Tried:
mitsuba 3.3.0 with drjit 0.4.2
and mitsuba 3.2.1 with drjit 0.4.1.
Both fail, with libLLVM-15.so and with libLLVM-10.so.
Let me close this to keep things tidy.
Anyone who finds this issue, we're tracking it over here: https://github.com/mitsuba-renderer/mitsuba3/issues/849
Hi,
I am running various rendering jobs via Mitsuba 3, and the process fails with an assertion error and a core dump at random points in my processing stack on a Linux machine with a large number of CPUs (this never happens when running identical code on a Mac).
"Assertion failed in /project/ext/drjit-core/ext/nanothread/src/queue.cpp:354: remain == 1"
I have no clue what the cause of this issue is. Any ideas where to start?