Open HGX-001 opened 3 years ago
Did the NVIDIA CUDA driver install successfully?
I have the same issue, although I can run it on the CPU.
I installed nvidia-455 and CUDA 11.1.
Yes, I am sure the NVIDIA CUDA driver is installed correctly: afterwards I installed the Docker version of AlphaFold 2 provided by DeepMind, and it runs successfully without reporting this warning. Before that, I also used Anaconda to run other deep learning tasks with no problems at all, and all GPUs were used. To confirm the problem, I compared both setups on the same target, T1050. The Docker version of AlphaFold takes only about 3 hours, while the Anaconda version you provided takes well over 3 hours.
---------- Original message ----------
From: "Gustavol Enrique Olivos Ramirez"
To: "kuixu/alphafold"
Subject: Re: [kuixu/alphafold] Unable to initialize backend 'gpu' (#8)
Date: 2021-08-26 12:34:52
Same problem. Is it possible to configure the environment to fix the platform "cuda" issue?
I1108 16:51:57.959832 139635007661120 xla_bridge.py:231] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
I1108 16:51:57.972085 139635007661120 xla_bridge.py:231] Unable to initialize backend 'gpu': NOT_FOUND: Could not find registered platform with name: "cuda". Available platform names are: Interpreter Host
I1108 16:51:57.972550 139635007661120 xla_bridge.py:231] Unable to initialize backend 'tpu': INVALID_ARGUMENT: TpuPlatform is not available.
W1108 16:51:57.972653 139635007661120 xla_bridge.py:236] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.29.05    Driver Version: 495.29.05    CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:18:00.0 Off |                  N/A |
| 23%   27C    P8    10W / 250W |     15MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:3B:00.0 Off |                  N/A |
| 23%   28C    P8     8W / 250W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  Off  | 00000000:86:00.0 Off |                  N/A |
| 23%   30C    P8     9W / 250W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce ...  Off  | 00000000:AF:00.0 Off |                  N/A |
| 23%   30C    P8     8W / 250W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      3626      G   /usr/libexec/Xorg                   9MiB |
|    0   N/A  N/A      4701      G   /usr/bin/gnome-shell                3MiB |
+-----------------------------------------------------------------------------+
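For anyone comparing logs across machines, the xla_bridge messages above can be summarised mechanically. The following is only a rough sketch using the Python standard library: the function name summarize_xla_log and the SAMPLE_LOG constant are made up for illustration, and the regular expression assumes the log wording shown in this thread, which may differ in other JAX versions.

```python
import re

# Assumed log wording, copied from the messages quoted in this thread.
BACKEND_FAILURE = re.compile(r"Unable to initialize backend '([^']+)': (.*)")
CPU_FALLBACK = "No GPU/TPU found, falling back to CPU"

SAMPLE_LOG = """\
I1108 16:51:57.959832 139635007661120 xla_bridge.py:231] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
I1108 16:51:57.972085 139635007661120 xla_bridge.py:231] Unable to initialize backend 'gpu': NOT_FOUND: Could not find registered platform with name: "cuda". Available platform names are: Interpreter Host
I1108 16:51:57.972550 139635007661120 xla_bridge.py:231] Unable to initialize backend 'tpu': INVALID_ARGUMENT: TpuPlatform is not available.
W1108 16:51:57.972653 139635007661120 xla_bridge.py:236] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
"""


def summarize_xla_log(log_text: str) -> dict:
    """Collect the backends that failed to initialize and detect CPU fallback."""
    failures = {}
    for line in log_text.splitlines():
        match = BACKEND_FAILURE.search(line)
        if match:
            # Map backend name (e.g. 'gpu') to the reason reported in the log.
            failures[match.group(1)] = match.group(2).strip()
    return {
        "failed_backends": failures,
        "cpu_fallback": CPU_FALLBACK in log_text,
    }


if __name__ == "__main__":
    summary = summarize_xla_log(SAMPLE_LOG)
    for backend, reason in summary["failed_backends"].items():
        print(f"{backend}: {reason}")
    print("fell back to CPU:", summary["cpu_fallback"])
```

The 'tpu_driver' and 'tpu' failures are expected on a machine without TPUs. The 'gpu' entry is the one that matters here, since its reason says no platform named "cuda" is registered even though nvidia-smi lists four GPUs.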
Hi guys, any insights regarding the CUDA issue? Ping @JuergenUniVie
I have been trying to install AlphaFold on various platforms (including AWS) for 2 months, and I always run into this bug, which is frequent but has not been addressed. Very sad. JPierre
I'm facing the same issue. Before running the run_docker.py script, I ran the docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi command as mentioned in the README file, and it correctly displayed my GPU (with no processes running). But I still got the same error messages others are reporting here.
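A working nvidia-smi shows that the driver sees the GPU, but it says nothing about whether the jaxlib inside the Python environment was built with CUDA support. One possible cause of the "cuda" platform message, which nobody in this thread has confirmed, is a CPU-only jaxlib wheel. The sketch below checks for that under a stated assumption: CUDA wheels of that period carried a local version tag such as 0.1.69+cuda111, while CPU-only wheels had none. The helper names looks_like_cuda_build and describe_jax_install are hypothetical, and newer JAX releases package CUDA support differently, so treat the result as a hint only.

```python
from importlib import metadata


def looks_like_cuda_build(version: str) -> bool:
    """Return True if a jaxlib version string carries a '+cuda...' local tag.

    Assumption: CUDA wheels look like '0.1.69+cuda111'; CPU-only wheels
    look like '0.1.69'. This is a naming heuristic, not an official API.
    """
    _, _, local = version.partition("+")
    return local.startswith("cuda")


def describe_jax_install() -> str:
    """Summarise the installed jaxlib build and the backend JAX selected."""
    try:
        version = metadata.version("jaxlib")
    except metadata.PackageNotFoundError:
        return "jaxlib is not installed in this environment"

    if looks_like_cuda_build(version):
        kind = "CUDA-tagged build"
    else:
        kind = "no CUDA tag (possibly a CPU-only build)"

    try:
        import jax

        # Reports 'cpu', 'gpu' or 'tpu' depending on what JAX initialized.
        backend = jax.default_backend()
    except Exception as exc:  # jax missing, too old, or failing to start
        backend = f"unavailable ({exc.__class__.__name__})"

    return f"jaxlib {version}: {kind}; default backend: {backend}"


if __name__ == "__main__":
    print(describe_jax_install())
```

Run it with the same interpreter that runs AlphaFold (inside the container or the conda environment), since a different Python may have a different jaxlib installed.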
Hello, I followed your README to set up the Anaconda environment. It runs on the CPU but not on the GPU, even though my machine has an NVIDIA RTX TITAN GPU with 24 GB of memory. Whether I use

python3 run_alphafold.py --fasta_paths=T1050.fasta --max_template_date=2020-05-14
# or simply
exp/run_local.sh T1050.fasta

it warns with:

I0820 16:00:20.270564 140257858221888 xla_bridge.py:212] Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker: local://
I0820 16:00:20.281177 140257858221888 xla_bridge.py:212] Unable to initialize backend 'gpu': Not found: Could not find registered platform with name: "cuda". Available platform names are: Interpreter Host
I0820 16:00:20.281787 140257858221888 xla_bridge.py:212] Unable to initialize backend 'tpu': Invalid argument: TpuPlatform is not available.
W0820 16:00:20.282057 140257858221888 xla_bridge.py:215] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)

It seems to run only on the CPU: when I check with nvidia-smi, only 315 MiB of GPU memory is in use.
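The memory check described above can be scripted so it is easy to repeat while a prediction is running. This is a minimal sketch, assuming nvidia-smi is on the PATH and that it reports plain integers for the queried fields; the function names are invented for illustration, and the sample values in the usage note are taken from the nvidia-smi table earlier in this thread.

```python
import shutil
import subprocess

# nvidia-smi's CSV query mode; the choice of fields is ours for this sketch.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text: str) -> list:
    """Parse 'index, used, total' rows (values in MiB) into dicts.

    Assumes every field is an integer; a row containing '[N/A]' would
    raise ValueError.
    """
    rows = []
    for line in text.strip().splitlines():
        index, used, total = (int(field) for field in line.split(","))
        rows.append({"gpu": index, "used_mib": used, "total_mib": total})
    return rows


def query_gpu_memory() -> list:
    """Run nvidia-smi and return per-GPU memory use, or [] if it is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)


if __name__ == "__main__":
    for row in query_gpu_memory():
        print(f"GPU {row['gpu']}: {row['used_mib']} / {row['total_mib']} MiB")
```

For example, parse_memory_csv("0, 15, 11178\n1, 2, 11178\n") yields one dictionary per GPU. If used memory stays at a few hundred MiB for the whole run, that is consistent with the CPU fallback warning in the log, though it does not prove it on its own.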