Open Cerf-Volant425 opened 1 year ago
First, try the following commands:
export XLA_PYTHON_CLIENT_PREALLOCATE=false
export XLA_FLAGS="--xla_gpu_strict_conv_algorithm_picker=false --xla_gpu_force_compilation_parallelism=1"
If they do not work, try reducing batch_size or upgrading cuDNN to version 8.6.0 or higher. Hope this helps.
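If the training script is launched from somewhere the shell exports are not inherited (a notebook or an IDE, for example), the same settings can be applied from Python. This is only a minimal sketch: `configure_xla_env` is a hypothetical helper name, and it assumes the variables are set before `jax` is first imported, since they are read when the backend initializes.

```python
import os


def configure_xla_env(env=None):
    """Apply the settings from the export commands above to an environment mapping.

    `configure_xla_env` is a hypothetical helper, not part of JAX.
    """
    env = os.environ if env is None else env
    # Do not preallocate a large share of GPU memory at startup.
    env["XLA_PYTHON_CLIENT_PREALLOCATE"] = "false"
    # Same flags as in the XLA_FLAGS export above.
    env["XLA_FLAGS"] = " ".join([
        "--xla_gpu_strict_conv_algorithm_picker=false",
        "--xla_gpu_force_compilation_parallelism=1",
    ])
    return env


configure_xla_env()
# Import jax only after this point, e.g.:
# import jax
```

Setting the variables after `jax` has already initialized its GPU backend will most likely have no effect, so the call belongs at the very top of the entry script.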
In addition to the comments above: your error log shows a mismatch between the cuDNN version that the pre-compiled jaxlib was built against and the cuDNN version you have installed (8.1.0 versus 8.6.0). See here for details on how to align the cuDNN versions.
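One quick way to see which cuDNN release a jaxlib wheel targets is to read it off the version string. The sketch below is hedged: `cudnn_from_jaxlib_version` is a hypothetical helper, and it assumes the wheel carries a `cudnnXY` suffix with a single-digit major version, as in the `0.4.1+cuda11.cudnn86` string reported in this thread.

```python
import re


def cudnn_from_jaxlib_version(version):
    """Return the (major, minor) cuDNN version encoded in a jaxlib version string.

    Hypothetical helper. Assumes a suffix like 'cudnn86' (major 8, minor 6).
    Returns None when no such suffix is present.
    """
    match = re.search(r"cudnn(\d)(\d+)", version)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


built_against = cudnn_from_jaxlib_version("0.4.1+cuda11.cudnn86")
print(built_against)
```

The result can then be compared by hand with the cuDNN version actually installed on the machine.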
Nevertheless, after all that I still end up with: INTERNAL: Failed to load in-memory CUBIN: CUDA_ERROR_OUT_OF_MEMORY: out of memory.
After configuring the environment, I always get the errors below when training and testing on the FLOWER scene of the LLFF dataset.
My setup is:
jaxlib: 0.4.1+cuda11.cudnn86
Can you give me some suggestions for avoiding these errors? Thanks in advance.