RaulPPelaez closed this 1 year ago
@mikemhenry @raimis, please try the GPU runner here.
Tests running here https://github.com/openmm/NNPOps/actions/runs/5660607367
RuntimeError: Deterministic behavior was enabled with either `torch.use_deterministic_algorithms(True)` or `at::Context::setDeterministicAlgorithms(true)`, but this operation is not deterministic because it uses CuBLAS and you have CUDA >= 10.2. To enable deterministic behavior in this case, you must set an environment variable before running your PyTorch application: CUBLAS_WORKSPACE_CONFIG=:4096:8 or CUBLAS_WORKSPACE_CONFIG=:16:8. For more information, go to https://docs.nvidia.com/cuda/cublas/index.html#cublasApi_reproducibility
Looks like I need to set `CUBLAS_WORKSPACE_CONFIG` on the runner.
New test here https://github.com/openmm/NNPOps/actions/runs/5661251365
Following https://docs.nvidia.com/cuda/cublas/index.html#results-reproducibility, I chose `export CUBLAS_WORKSPACE_CONFIG=:4096:8`, since I don't think we are tight on GPU memory.
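For anyone who hits the same error outside the CI runner, here is a minimal sketch of setting the variable from Python instead of the shell. This is not the change made to the runner. The helper name `configure_cublas_workspace` is hypothetical, and the two accepted values are simply the ones listed in the error message above. The variable needs to be set before the PyTorch application starts doing CUDA work, so the helper should be called at the very top of the script.

```python
import os

# The two settings named in the PyTorch error message.
ALLOWED_CONFIGS = (":4096:8", ":16:8")


def configure_cublas_workspace(value=":4096:8", env=None):
    """Set CUBLAS_WORKSPACE_CONFIG unless it is already set.

    Hypothetical helper for illustration only. `env` defaults to the
    real process environment; pass a dict to try it without side effects.
    Returns the value that ends up in the environment.
    """
    if value not in ALLOWED_CONFIGS:
        raise ValueError(f"expected one of {ALLOWED_CONFIGS}, got {value!r}")
    if env is None:
        env = os.environ
    # setdefault keeps a value that was already exported in the shell.
    return env.setdefault("CUBLAS_WORKSPACE_CONFIG", value)


# Call this before any CUDA work, then enable determinism as usual:
#   configure_cublas_workspace()
#   import torch
#   torch.use_deterministic_algorithms(True)
```

Using `setdefault` means an `export` on the runner still takes precedence over the value chosen in the script.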
Tests pass!
I set up auto-merge, required reviews, and status checks. Let me know if that causes any issues @RaulPPelaez @raimis
This fixes some tests not passing on some architectures due to numerical inaccuracies (at least on my machine).