kalininalab / alphafold_non_docker

AlphaFold2 non-docker setup

Error: Falling back to the CUDA driver for PTX compilation; ptxas does not support CC 8.9 #75

Open shadowdeng1994 opened 3 months ago


I got the following error when running run_alphafold.sh. How can I fix it?

2024-04-11 15:12:17.547721: W external/org_tensorflow/tensorflow/compiler/xla/stream_executor/gpu/asm_compiler.cc:231] Falling back to the CUDA driver for PTX compilation; ptxas does not support CC 8.9
2024-04-11 15:12:17.547768: W external/org_tensorflow/tensorflow/compiler/xla/stream_executor/gpu/asm_compiler.cc:234] Used ptxas at ptxas
2024-04-11 15:12:17.552686: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:628] failed to get PTX kernel "shift_right_logical" from module: CUDA_ERROR_NOT_FOUND: named symbol not found
2024-04-11 15:12:17.552762: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:2153] Execution of replica 0 failed: INTERNAL: Could not find the corresponding function
Traceback (most recent call last):
  File "/run/determined/workdir/AlphaFold/alphafold-2.3.1/run_alphafold.py", line 432, in <module>
    app.run(main)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/run/determined/workdir/AlphaFold/alphafold-2.3.1/run_alphafold.py", line 408, in main
    predict_structure(
  File "/run/determined/workdir/AlphaFold/alphafold-2.3.1/run_alphafold.py", line 199, in predict_structure
    prediction_result = model_runner.predict(processed_feature_dict,
  File "/run/determined/workdir/AlphaFold/alphafold-2.3.1/alphafold/model/model.py", line 167, in predict
    result = self.apply(self.params, jax.random.PRNGKey(random_seed), feat)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/_src/random.py", line 132, in PRNGKey
    key = prng.seed_with_impl(impl, seed)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/_src/prng.py", line 267, in seed_with_impl
    return random_seed(seed, impl=impl)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/_src/prng.py", line 580, in random_seed
    return random_seed_p.bind(seeds_arr, impl=impl)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 329, in bind
    return self.bind_with_trace(find_top_trace(args), args, params)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 332, in bind_with_trace
    out = trace.process_primitive(self, map(trace.full_raise, args), params)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 712, in process_primitive
    return primitive.impl(*tracers, **params)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/_src/prng.py", line 592, in random_seed_impl
    base_arr = random_seed_impl_base(seeds, impl=impl)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/_src/prng.py", line 597, in random_seed_impl_base
    return seed(seeds)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/_src/prng.py", line 832, in threefry_seed
    lax.shift_right_logical(seed, lax_internal._const(seed, 32)))
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/_src/lax/lax.py", line 515, in shift_right_logical
    return shift_right_logical_p.bind(x, y)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 329, in bind
    return self.bind_with_trace(find_top_trace(args), args, params)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 332, in bind_with_trace
    out = trace.process_primitive(self, map(trace.full_raise, args), params)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 712, in process_primitive
    return primitive.impl(*tracers, **params)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/_src/dispatch.py", line 115, in apply_primitive
    return compiled_fun(*args)
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/_src/dispatch.py", line 200, in <lambda>
    return lambda *args, **kw: compiled(*args, **kw)[0]
  File "/opt/conda/envs/alphafold/lib/python3.8/site-packages/jax/_src/dispatch.py", line 895, in _execute_compiled
    out_flat = compiled.execute(in_flat)
jaxlib.xla_extension.XlaRuntimeError: INTERNAL: Could not find the corresponding function

nvcc -V

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_May__3_19:15:13_PDT_2021
Cuda compilation tools, release 11.3, V11.3.109
Build cuda_11.3.r11.3/compiler.29920130_0

nvidia-smi

Thu Apr 11 15:27:29 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.89.02    Driver Version: 525.89.02    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:24:00.0 Off |                  Off |
| 30%   29C    P8     6W / 450W |      0MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

conda list | egrep "jax|cuda"

cudatoolkit    11.3.1                    h2bc3f7f_2
jax            0.3.25                    pypi_0        pypi
jaxlib         0.3.25+cuda11.cudnn805    pypi_0        pypi

dpkg -l | grep cuda

ii  cuda-command-line-tools-11-3     11.3.1-1             amd64  CUDA command-line tools
ii  cuda-compat-11-3                 465.19.01-1          amd64  CUDA Compatibility Platform
ii  cuda-compiler-11-3               11.3.1-1             amd64  CUDA compiler
ii  cuda-cudart-11-3                 11.3.109-1           amd64  CUDA Runtime native Libraries
ii  cuda-cudart-dev-11-3             11.3.109-1           amd64  CUDA Runtime native dev links, headers
ii  cuda-cuobjdump-11-3              11.3.58-1            amd64  CUDA cuobjdump
ii  cuda-cupti-11-3                  11.3.111-1           amd64  CUDA profiling tools runtime libs.
ii  cuda-cupti-dev-11-3              11.3.111-1           amd64  CUDA profiling tools interface.
ii  cuda-cuxxfilt-11-3               11.3.58-1            amd64  CUDA cuxxfilt
ii  cuda-driver-dev-11-3             11.3.109-1           amd64  CUDA Driver native dev stub library
ii  cuda-gdb-11-3                    11.3.109-1           amd64  CUDA-GDB
ii  cuda-libraries-11-3              11.3.1-1             amd64  CUDA Libraries 11.3 meta-package
ii  cuda-libraries-dev-11-3          11.3.1-1             amd64  CUDA Libraries 11.3 development meta-package
ii  cuda-memcheck-11-3               11.3.109-1           amd64  CUDA-MEMCHECK
ii  cuda-minimal-build-11-3          11.3.1-1             amd64  Minimal CUDA 11.3 toolkit build packages.
ii  cuda-nsight-compute-11-3         11.3.0-1             amd64  NVIDIA Nsight Compute
ii  cuda-nvcc-11-3                   11.3.109-1           amd64  CUDA nvcc
ii  cuda-nvdisasm-11-3               11.3.58-1            amd64  CUDA disassembler
ii  cuda-nvml-dev-11-3               11.3.58-1            amd64  NVML native dev links, headers
ii  cuda-nvprof-11-3                 11.3.111-1           amd64  CUDA Profiler tools
ii  cuda-nvprune-11-3                11.3.58-1            amd64  CUDA nvprune
ii  cuda-nvrtc-11-3                  11.3.109-1           amd64  NVRTC native runtime libraries
ii  cuda-nvrtc-dev-11-3              11.3.109-1           amd64  NVRTC native dev links, headers
ii  cuda-nvtx-11-3                   11.3.109-1           amd64  NVIDIA Tools Extension
ii  cuda-sanitizer-11-3              11.3.111-1           amd64  CUDA Sanitizer
ii  cuda-thrust-11-3                 11.3.109-1           amd64  CUDA Thrust
ii  cuda-toolkit-11-3-config-common  11.3.109-1           all    Common config package for CUDA Toolkit 11.3.
ii  cuda-toolkit-11-config-common    11.8.89-1            all    Common config package for CUDA Toolkit 11.
ii  cuda-toolkit-config-common       12.1.105-1           all    Common config package for CUDA Toolkit.
hi  libcudnn8                        8.2.0.53-1+cuda11.3  amd64  cuDNN runtime libraries
ii  libcudnn8-dev                    8.2.0.53-1+cuda11.3  amd64  cuDNN development libraries and headers
hi  libnccl-dev                      2.9.9-1+cuda11.3     amd64  NVIDIA Collective Communication Library (NCCL) Development Files
hi  libnccl2                         2.9.9-1+cuda11.3     amd64  NVIDIA Collective Communication Library (NCCL) Runtime
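
A rough way to cross-check the first warning against the nvcc output above, sketched in Python. This is only an illustration, not a confirmed diagnosis. The helper names (`parse_cc`, `parse_nvcc_release`, `ptxas_too_old`) and the `MIN_CUDA_FOR_CC` table are hypothetical, and the table entries are an assumption recalled from NVIDIA's CUDA release notes (compute capability 8.9 first being targeted by the 11.8 toolkit) that should be verified there. It also assumes the ptxas found at run time is the one shipped with the toolkit that nvcc reports.

```python
import re

# Assumption: first CUDA toolkit release whose ptxas can target each compute
# capability, recalled from NVIDIA's release notes. Verify before relying on it.
MIN_CUDA_FOR_CC = {
    (8, 0): (11, 0),
    (8, 6): (11, 1),
    (8, 9): (11, 8),
    (9, 0): (11, 8),
}


def parse_cc(warning):
    """Extract (major, minor) from the XLA 'ptxas does not support CC X.Y' warning."""
    match = re.search(r"ptxas does not support CC (\d+)\.(\d+)", warning)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc -V` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def ptxas_too_old(warning, nvcc_output):
    """Return True if the toolkit predates the GPU's CC, None if it cannot be decided."""
    cc = parse_cc(warning)
    release = parse_nvcc_release(nvcc_output)
    if cc is None or release is None or cc not in MIN_CUDA_FOR_CC:
        return None
    return release < MIN_CUDA_FOR_CC[cc]


# Strings copied from the log and the nvcc output posted above.
warning = ("Falling back to the CUDA driver for PTX compilation; "
           "ptxas does not support CC 8.9")
nvcc_output = "Cuda compilation tools, release 11.3, V11.3.109"

print(parse_cc(warning), parse_nvcc_release(nvcc_output),
      ptxas_too_old(warning, nvcc_output))
```

With the values posted above this prints `(8, 9) (11, 3) True`, which would suggest that the 11.3 ptxas predates the GPU's compute capability. Whether that alone explains the later CUDA_ERROR_NOT_FOUND is not something this check can establish.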