mitsuba-renderer / drjit

Dr.Jit — A Just-In-Time-Compiler for Differentiable Rendering
BSD 3-Clause "New" or "Revised" License

Critical Dr.Jit compiler failure: cuda_check(): API error 0002 (CUDA_ERROR_OUT_OF_MEMORY) #148

Closed: owaranainatsu closed 1 year ago

owaranainatsu commented 1 year ago

Here is the error message I get when I import drjit:

Critical Dr.Jit compiler failure: cuda_check(): API error 0002 (CUDA_ERROR_OUT_OF_MEMORY): "out of memory" in /project/ext/drjit-core/src/cuda_core.cpp:171.
Aborted (core dumped)

However, I think there is still enough memory.

[image]

njroussel commented 1 year ago

Hi @owaranainatsu

This is surprising: the error occurs during the setup of the CUDA backend. Could you try "hiding" one GPU from Dr.Jit by using the CUDA_VISIBLE_DEVICES environment variable?

This section of the code hasn't changed in quite some time. You mentioned in #147 that you could previously use Dr.Jit without this issue. I therefore believe that this issue is tied to something in your environment/setup/configuration rather than Dr.Jit itself.

owaranainatsu commented 1 year ago

Thank you for your reply! I modified my code like this:

import os
# Set this before importing drjit so that only the GPU with index 1 is visible
os.environ['CUDA_VISIBLE_DEVICES'] = '1'
import drjit as dr

Then I got error message like:

jit_kernel_load(): cache file "/home/arrow/.drjit/f3b2e15858262a99b8b532d3ba2724a2.cuda.bin" is from an incompatible version of Dr.Jit. You may want to wipe your ~/.drjit directory.
jit_kernel_load(): cache file "/home/arrow/.drjit/f3b2e15858262a99b8b532d3ba2724a2.cuda.bin" is from an incompatible version of Dr.Jit. You may want to wipe your ~/.drjit directory.
jit_kernel_write(): could not link cache file "/home/arrow/.drjit/f3b2e15858262a99b8b532d3ba2724a2.cuda.bin" into file system: File exists
jit_kernel_write(): could not link cache file "/home/arrow/.drjit/f3b2e15858262a99b8b532d3ba2724a2.cuda.bin" into file system: File exists

njroussel commented 1 year ago

These are not error messages, just additional information. As indicated, you should delete your ~/.drjit directory.
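
For anyone who prefers to do this from Python, here is a minimal sketch of wiping the cache. The helper name `wipe_drjit_cache` is hypothetical and not part of Dr.Jit; the default location is assumed from the `~/.drjit` path shown in the messages above.

```python
import shutil
from pathlib import Path


def wipe_drjit_cache(cache_dir=None):
    """Delete the kernel cache directory and report whether anything was removed.

    Hypothetical helper, not part of the Dr.Jit API.
    """
    # Assumption: the cache lives in ~/.drjit, as the log messages suggest.
    path = Path(cache_dir) if cache_dir is not None else Path.home() / ".drjit"
    if not path.is_dir():
        # Nothing to delete (already wiped, or cache stored elsewhere).
        return False
    # Remove the directory and every cached kernel file inside it.
    shutil.rmtree(path)
    return True
```

Calling `wipe_drjit_cache()` with no argument before `import drjit` should remove the stale kernels, which would then presumably be recompiled and cached again on the next run.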

owaranainatsu commented 1 year ago

Thank you very much!