kuixu / alphafold

Install alphafold on the local machine, get out of docker.
Apache License 2.0

Unable to initialize backend 'gpu' #8

Open HGX-001 opened 3 years ago

HGX-001 commented 3 years ago

Hello, I followed your README to set up the Anaconda environment. It runs on the CPU but not on the GPU, even though my machine has an NVIDIA TITAN RTX GPU with 24 GB of memory. Whether I run

python3 run_alphafold.py --fasta_paths=T1050.fasta --max_template_date=2020-05-14

or simply

exp/run_local.sh T1050.fasta

I get these warnings:

I0820 16:00:20.270564 140257858221888 xla_bridge.py:212] Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker: local://
I0820 16:00:20.281177 140257858221888 xla_bridge.py:212] Unable to initialize backend 'gpu': Not found: Could not find registered platform with name: "cuda". Available platform names are: Interpreter Host
I0820 16:00:20.281787 140257858221888 xla_bridge.py:212] Unable to initialize backend 'tpu': Invalid argument: TpuPlatform is not available.
W0820 16:00:20.282057 140257858221888 xla_bridge.py:215] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)

It looks like it is running on the CPU only: when I check with nvidia-smi, only 315 MiB of GPU memory is in use.
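One way to confirm the fallback independently of AlphaFold is to ask JAX directly which devices it can see from the same conda environment. The following is a minimal sketch, assuming only that `jax` is importable there; the helper name `report_jax_backend` is hypothetical and not part of this repository.

```python
def report_jax_backend():
    """Return the platforms and devices JAX can see (hypothetical helper)."""
    try:
        # Imported lazily so the helper still returns something if JAX is missing.
        import jax
    except ImportError:
        return {"jax_installed": False, "platforms": [], "devices": []}

    devices = jax.devices()  # devices of the backend JAX selected by default
    return {
        "jax_installed": True,
        # Only "cpu" here means JAX fell back to the CPU backend.
        "platforms": sorted({d.platform for d in devices}),
        "devices": [str(d) for d in devices],
    }


if __name__ == "__main__":
    print(report_jax_backend())
```

If the reported platforms contain only `cpu` while nvidia-smi lists the card, the problem is likely on the JAX/jaxlib side rather than in the driver.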

kuixu commented 3 years ago

Did the NVIDIA CUDA driver install successfully?

tavolivos commented 3 years ago

I have the same issue, although I can run it on the CPU.

I installed nvidia-455 and cuda 11.1

HGX-001 commented 3 years ago

Yes, I am sure the NVIDIA CUDA driver is installed correctly: I later installed the Docker version of AlphaFold 2 provided by DeepMind, and it runs successfully without reporting this warning. Before that, I also used Anaconda to run other deep learning tasks with no problems at all, and the GPUs were all used. To confirm the problem, I compared the two on the same target, T1050. The Docker version of AlphaFold takes only about 3 hours, while the Anaconda version you provide takes well over 3 hours.



JuergenUniVie commented 3 years ago

Same problem... Is it possible to set an environment variable to fix the platform "cuda" issue?

I1108 16:51:57.959832 139635007661120 xla_bridge.py:231] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
I1108 16:51:57.972085 139635007661120 xla_bridge.py:231] Unable to initialize backend 'gpu': NOT_FOUND: Could not find registered platform with name: "cuda". Available platform names are: Interpreter Host
I1108 16:51:57.972550 139635007661120 xla_bridge.py:231] Unable to initialize backend 'tpu': INVALID_ARGUMENT: TpuPlatform is not available.
W1108 16:51:57.972653 139635007661120 xla_bridge.py:236] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
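When comparing logs from several machines, it can help to reduce them to "which backend failed, and why". The sketch below uses only the standard library and assumes one log record per line in the `xla_bridge.py` format quoted in this thread; the function name `summarize_xla_log` is hypothetical.

```python
import re

# Matches records such as:
#   ... xla_bridge.py:231] Unable to initialize backend 'gpu': NOT_FOUND: ...
BACKEND_FAILURE = re.compile(r"Unable to initialize backend '([^']+)': (.*)")
CPU_FALLBACK = "No GPU/TPU found, falling back to CPU"


def summarize_xla_log(lines):
    """Collect per-backend failure reasons and whether JAX fell back to CPU."""
    failures = {}
    fell_back = False
    for line in lines:
        match = BACKEND_FAILURE.search(line)
        if match:
            # Key: backend name, value: the reason text that follows it.
            failures[match.group(1)] = match.group(2).strip()
        if CPU_FALLBACK in line:
            fell_back = True
    return {"failures": failures, "cpu_fallback": fell_back}


if __name__ == "__main__":
    import sys

    # Usage (assumed): python3 this_script.py < saved_alphafold.log
    print(summarize_xla_log(sys.stdin))
```

A `gpu` entry whose reason mentions a missing "cuda" platform, together with `cpu_fallback` being true, is the pattern everyone in this thread is reporting.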

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.29.05    Driver Version: 495.29.05    CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:18:00.0 Off |                  N/A |
| 23%   27C    P8    10W / 250W |     15MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:3B:00.0 Off |                  N/A |
| 23%   28C    P8     8W / 250W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  Off  | 00000000:86:00.0 Off |                  N/A |
| 23%   30C    P8     9W / 250W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce ...  Off  | 00000000:AF:00.0 Off |                  N/A |
| 23%   30C    P8     8W / 250W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      3626      G   /usr/libexec/Xorg                   9MiB |
|    0   N/A  N/A      4701      G   /usr/bin/gnome-shell                3MiB |
+-----------------------------------------------------------------------------+
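Since nvidia-smi lists the cards, the driver itself is visible, so one thing worth ruling out is that the environment has a CPU-only jaxlib wheel, which is a common cause of the missing "cuda" platform. The sketch below is an assumption-laden check, not a definitive test: it assumes CUDA builds of jaxlib from that period carried a local version tag such as `+cuda111` (newer JAX releases package CUDA support differently, so a missing tag there proves nothing). The helper name `jaxlib_build_info` and the example version string are illustrative.

```python
from importlib import metadata


def jaxlib_build_info(version=None):
    """Report the jaxlib version and any CUDA local-version tag it carries.

    Pass `version` explicitly to inspect a string; otherwise the installed
    jaxlib distribution (if any) is looked up.
    """
    if version is None:
        try:
            version = metadata.version("jaxlib")
        except metadata.PackageNotFoundError:
            return {"installed": False, "version": None, "cuda_tag": None}

    # PEP 440 local version label: everything after "+", e.g. "cuda111".
    _, _, local = version.partition("+")
    cuda_tag = local if local.startswith("cuda") else None
    return {"installed": True, "version": version, "cuda_tag": cuda_tag}


if __name__ == "__main__":
    print(jaxlib_build_info())                  # whatever is installed here
    print(jaxlib_build_info("0.1.69+cuda111"))  # illustrative version string
```

If `cuda_tag` comes back as `None` on an older jaxlib, reinstalling a CUDA build that matches the local CUDA toolkit version would be the next thing to try.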

mattiasmar commented 2 years ago

Hi guys, any insights regarding the CUDA issue? Ping @JuergenUniVie

deejy commented 2 years ago

I have been trying to install AlphaFold on various platforms (including AWS) for 2 months, and I always run into this bug, which is frequent but has not been addressed. Very sad. JPierre

coliva92 commented 2 years ago

I'm facing the same issue. Before running the run_docker.py script I ran the docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi command as mentioned in the README file and it correctly displayed my GPU (with no processes running). But I still got the same error messages others are reporting here.