google-deepmind / alphafold3

AlphaFold 3 inference pipeline.

Alphafold3 docker container can't use GPU A100 #82

Closed · xiongzhiqiang closed this issue 2 days ago

xiongzhiqiang commented 2 days ago

I have installed AlphaFold 3 successfully on my server with A100 GPUs. My GPU information is as follows.

Wed Nov 20 03:28:50 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-PCIE-40GB          Off |   00000000:04:00.0 Off |                    0 |
| N/A   26C    P0             36W /  250W |   15734MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-PCIE-40GB          Off |   00000000:1B:00.0 Off |                    0 |
| N/A   26C    P0             31W /  250W |       3MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

When I ran one example, I found the GPU utilization was at 0% and the task took much more time than I expected. I think AlphaFold 3 can't run on my GPU.

The command is here.

docker run -it --volume /home/administrator/input/:/root/af_input --volume /home/administrator/output/:/root/af_output --volume /home/administrator/Alphafold3/models/:/root/models --volume /home/administrator/Alphafold3/database/:/root/public_databases --gpus all alphafold3 python run_alphafold.py --json_path=/root/af_input/input_4.json --model_dir=/root/models --output_dir=/root/af_output
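For reference, a quick sanity check that the container can see the GPUs at all is to run nvidia-smi inside it (this is just a sketch assuming the same alphafold3 image tag as in the command above, and that the NVIDIA Container Toolkit is installed on the host):

docker run --rm --gpus all alphafold3 nvidia-smi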

The log of the AlphaFold run is as follows.

I1120 02:49:48.834968 131374588219392 folding_input.py:1044] Detected /root/af_input/input_4.json is an AlphaFold 3 JSON since the top-level is not a list.
Running AlphaFold 3. Please note that standard AlphaFold 3 model parameters are only available under terms of use provided at https://github.com/google-deepmind/alphafold3/blob/main/WEIGHTS_TERMS_OF_USE.md. If you do not agree to these terms and are using AlphaFold 3 derived model parameters, cancel execution of AlphaFold 3 inference with CTRL-C, and do not use the model parameters.
I1120 02:49:49.370483 131374588219392 xla_bridge.py:895] Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
I1120 02:49:49.372001 131374588219392 xla_bridge.py:895] Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
Found local devices: [CudaDevice(id=0), CudaDevice(id=1)]
Building model from scratch...
Processing 1 fold inputs.
Processing fold input input_4...
I1120 02:49:55.234390 131332278654528 subprocess_utils.py:68] Launching subprocess "/hmmer/bin/jackhmmer -o /dev/null -A /tmp/tmpqpui6f7/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --cpu 8 -N 1 -E 0.0001 --incE 0.0001 /tmp/tmpqpui6f7_/query.fasta /root/public_databases/uniref90_2022_05.fa"
I1120 02:49:55.235006 131332253476416 jackhmmer.py:78] Query sequence: VDNKFNKEADRAWEEIRNLPNLNGWQMTAFIASLVDDPSQSANLLAEAKKLNDAQAPK
I1120 02:49:55.235408 131332270261824 subprocess_utils.py:68] Launching subprocess "/hmmer/bin/jackhmmer -o /dev/null -A /tmp/tmpd2ezovzy/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --cpu 8 -N 1 -E 0.0001 --incE 0.0001 /tmp/tmpd2ezovzy/query.fasta /root/public_databases/mgy_clusters_2022_05.fa"
I1120 02:49:55.235584 131332261869120 subprocess_utils.py:68] Launching subprocess "/hmmer/bin/jackhmmer -o /dev/null -A /tmp/tmphplg9_9j/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --cpu 8 -N 1 -E 0.0001 --incE 0.0001 /tmp/tmphplg9_9j/query.fasta /root/public_databases/bfd-first_non_consensus_sequences.fasta"
I1120 02:49:55.259398 131332253476416 subprocess_utils.py:68] Launching subprocess "/hmmer/bin/jackhmmer -o /dev/null -A /tmp/tmpx8wqnhtt/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --cpu 8 -N 1 -E 0.0001 --incE 0.0001 /tmp/tmpx8wqnhtt/query.fasta /root/public_databases/uniprot_all_2021_04.fa"
I1120 02:51:49.176093 131332261869120 subprocess_utils.py:97] Finished Jackhmmer in 113.940 seconds
I1120 02:57:02.232356 131332278654528 subprocess_utils.py:97] Finished Jackhmmer in 426.998 seconds
I1120 03:00:56.632969 131332253476416 subprocess_utils.py:97] Finished Jackhmmer in 661.373 seconds
I1120 03:06:10.623394 131332270261824 subprocess_utils.py:97] Finished Jackhmmer in 975.388 seconds
I1120 03:06:10.626955 131374588219392 pipeline.py:73] Getting protein MSAs took 975.40 seconds for sequence VDNKFNKEADRAWEEIRNLPNLNGWQMTAFIASLVDDPSQSANLLAEAKKLNDAQAPK
I1120 03:06:10.627211 131374588219392 pipeline.py:79] Deduplicating MSAs and getting protein templates for sequence VDNKFNKEADRAWEEIRNLPNLNGWQMTAFIASLVDDPSQSANLLAEAKKLNDAQAPK
I1120 03:06:10.638924 131332270261824 subprocess_utils.py:68] Launching subprocess "/hmmer/bin/hmmbuild --informat stockholm --hand --amino /tmp/tmpdl8gd8d7/output.hmm /tmp/tmpdl8gd8d7/query.msa"
I1120 03:06:10.690782 131332270261824 subprocess_utils.py:97] Finished Hmmbuild in 0.052 seconds
I1120 03:06:10.691647 131332270261824 subprocess_utils.py:68] Launching subprocess "/hmmer/bin/hmmsearch --noali --cpu 8 --F1 0.1 --F2 0.1 --F3 0.1 -E 100 --incE 100 --domE 100 --incdomE 100 -A /tmp/tmpkh29a4wc/output.sto /tmp/tmpkh29a4wc/query.hmm /root/public_databases/pdb_seqres_2022_09_28.fasta"
I1120 03:06:14.366872 131332270261824 subprocess_utils.py:97] Finished Hmmsearch in 3.675 seconds
I1120 03:06:53.882448 131374588219392 pipeline.py:108] Deduplicating MSAs and getting protein templates took 43.25 seconds for sequence VDNKFNKEADRAWEEIRNLPNLNGWQMTAFIASLVDDPSQSANLLAEAKKLNDAQAPK
I1120 03:07:01.471262 131374588219392 pipeline.py:165] processing input_4, random_seed=1
I1120 03:07:01.489283 131374588219392 pipeline.py:258] Calculating bucket size for input with 192 tokens.
I1120 03:07:01.489473 131374588219392 pipeline.py:264] Got bucket size 256 for input with 192 tokens, resulting in 64 padded tokens.
Featurising input_4 with rng_seed 1 took 3.01 seconds.
Featurising data for seeds (1,) took 10.50 seconds.
Running model inference for seed 1...
Running model inference for seed 1 took 53.03 seconds.
Extracting output structures (one per sample) for seed 1...
Extracting output structures (one per sample) for seed 1 took 0.24 seconds.
Running model inference and extracting output structures for seed 1 took 53.27 seconds.
Running model inference and extracting output structures for seeds (1,) took 53.27 seconds.
Writing outputs for input_4 for seed(s) (1,)...
Done processing fold input input_4.
Done processing 1 fold inputs.

Can anyone help me find out why? Thank you very much.

jacobjinkelly commented 2 days ago

Looking at the logs, it appears the GPU was found, and model inference ran on the GPU.

The line Found local devices: [CudaDevice(id=0), CudaDevice(id=1)] indicates the GPU was found.
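You can reproduce that check directly, for example by asking JAX for its backend and devices inside the container (a minimal sketch, assuming the same alphafold3 image as in your command; no volume mounts are needed for this):

docker run --rm --gpus all alphafold3 python -c "import jax; print(jax.default_backend()); print(jax.devices())"

If that prints gpu followed by a list of CudaDevice entries, JAX inside the container is using the A100s.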

The lines Calculating bucket size for input with 192 tokens. and Got bucket size 256 for input with 192 tokens, resulting in 64 padded tokens. indicate the bucket size for your input was 256.

The line Running model inference for seed 1 took 53.03 seconds. indicates model inference took 53.03 seconds. I think that is a reasonable amount of time for bucket size 256 on an A100.
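Note also that almost all of the wall-clock time in your log is the CPU-only MSA stage (the Jackhmmer searches took roughly 16 minutes), while GPU inference ran for only about 53 seconds, so a single nvidia-smi snapshot will usually catch the GPUs at 0% utilization. If you want to confirm GPU activity, poll utilization on the host while the inference step is actually running, for example:

nvidia-smi --query-gpu=timestamp,index,utilization.gpu,memory.used --format=csv -l 1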

I'll close the issue for now, but please feel free to re-open if that doesn't answer your question.