mfinzi / equivariant-MLP

A library for programmatically generating equivariant layers through constraint solving
MIT License
251 stars, 21 forks

Paper results error #7

Closed StellaAthena closed 3 years ago

StellaAthena commented 3 years ago

None of the experiments from the paper will run for me; they all fail with:

2021-05-16 03:52:49.141806: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2021-05-16 03:52:49.141858: F external/org_tensorflow/tensorflow/compiler/xla/service/gpu/gemm_algorithm_picker.cc:113] Check failed: stream->parent()->GetBlasGemmAlgorithms(&algorithms) 

I can write my own code using the package without a problem, but your experiments won't work.

mfinzi commented 3 years ago

Hi @StellaAthena,

I expect you are encountering this error because two JAX instances are running at the same time (e.g. a notebook with a kernel running in the background while you run the experiments from the command line). By default, JAX preallocates 90% of the GPU memory, so a second process fails when it tries to allocate that memory again. To verify that this is the issue, check the GPU memory usage (e.g. with nvidia-smi) before running the script. If it is already ~90%, the fix is either to run only one instance per GPU at a time, or to change the memory allocation environment variables, e.g. export XLA_PYTHON_CLIENT_PREALLOCATE=false
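For anyone who would rather do the check and the fix from Python, here is a rough sketch. It assumes nvidia-smi is on the PATH, and the helper names (parse_nvidia_smi, gpu_memory_fraction_used) are made up for illustration; they are not part of this library or of JAX. The environment variable has to be set before JAX initializes its GPU backend, so the simplest approach is to set it before importing jax.

```python
import os
import shutil
import subprocess


def parse_nvidia_smi(output):
    """Parse 'used, total' CSV lines (in MiB) into used/total fractions, one per GPU."""
    fractions = []
    for line in output.strip().splitlines():
        try:
            used, total = (float(field) for field in line.split(","))
        except ValueError:
            # Skip lines that are not two numeric fields (e.g. "[N/A]").
            continue
        if total > 0:
            fractions.append(used / total)
    return fractions


def gpu_memory_fraction_used():
    """Return per-GPU memory usage fractions, or [] if nvidia-smi cannot be run."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            [
                "nvidia-smi",
                "--query-gpu=memory.used,memory.total",
                "--format=csv,noheader,nounits",
            ],
            capture_output=True,
            text=True,
            check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return []
    return parse_nvidia_smi(result.stdout)


# Warn if some other process already holds most of a GPU's memory.
for index, fraction in enumerate(gpu_memory_fraction_used()):
    if fraction > 0.8:  # arbitrary threshold, chosen only for this sketch
        print(f"GPU {index} is already {fraction:.0%} full; another JAX process may be running")

# Disable preallocation. This must happen before JAX initializes the GPU backend,
# so do it before `import jax` in the script that runs the experiments.
os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "false"
```

Setting the variable inside the script has the same effect as the export above, but only for that one process. Capping the preallocated fraction with XLA_PYTHON_CLIENT_MEM_FRACTION is another option if you want two processes to share a GPU.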

StellaAthena commented 3 years ago

Thanks! Worked like a charm
