mosty-gim closed this issue 3 years ago
What version of docker and nvidia-docker are you using? If you are using nvidia-docker2, please try starting the container with the following command.
docker run -it --gpus all nvcr.io/nvidia/tensorflow:20.12-tf1-py3 nvidia-smi
hello @nluehr
These are the docker and nvidia-docker versions I use:
NVIDIA Docker: 2.6.0
Client:
Version: 20.10.2
API version: 1.41
Go version: go1.13.8
Git commit: 20.10.2-0ubuntu1~18.04.2
Built: Tue Mar 30 21:24:16 2021
OS/Arch: linux/amd64
Context: default
Experimental: true
Server:
Engine:
Version: 20.10.2
API version: 1.41 (minimum version 1.12)
Go version: go1.13.8
Git commit: 20.10.2-0ubuntu1~18.04.2
Built: Mon Mar 29 19:27:41 2021
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.3.3-0ubuntu1~18.04.4
GitCommit:
runc:
Version: spec: 1.0.2-dev
GitCommit:
docker-init:
Version: 0.19.0
GitCommit:
With these versions, I also get the error "No supported GPU(s) detected to run this container".
I replaced the AMI with "ami-06e551da0d461d8e2" for my A100 EC2 instance. This AMI includes all the packages for CUDA development, such as the NVIDIA driver, cuDNN, and TensorFlow. With it, the nvidia/tensorflow container finally detects the GPU. I think I made a mistake when installing the packages, or missed a package that I had to install.
System information
You can collect some of this information using our environment capture script. You can also obtain the TensorFlow version with:
1. TF 1.0: python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
2. TF 2.0: python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"
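If you are not sure which TensorFlow major version a container ships, the two one-liners above can be folded into one helper that tries the TF 2.x attribute layout first and falls back to the TF 1.x one. This is only a sketch: the function name `tf_version_info` is made up for illustration, and it assumes the module exposes one of the two layouts shown above.

```python
def tf_version_info(tf):
    """Return (git_version, version) for a TensorFlow-like module.

    Tries the TF 2.x layout (tf.version.*) first, then the TF 1.x
    layout (tf.GIT_VERSION / tf.VERSION). `tf_version_info` is a
    hypothetical helper, not part of TensorFlow.
    """
    version_mod = getattr(tf, "version", None)
    if version_mod is not None and hasattr(version_mod, "VERSION"):
        return version_mod.GIT_VERSION, version_mod.VERSION
    return tf.GIT_VERSION, tf.VERSION


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("tensorflow is not installed in this environment")
    else:
        print(tf_version_info(tf))
```

Passing the module in as an argument keeps the helper usable without TensorFlow installed, for example when checking it against a stand-in object.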
Describe the current behavior
I did the following:
(1) Created an AWS EC2 instance (AMI: ami-0ef85cf6e604e5650, instance type: p4d.24xlarge).
(2) Installed the NVIDIA driver (NVIDIA-SMI 450.119.03, Driver Version: 450.119.03, CUDA Version: 11.0).
(3) Installed docker.
(4) Installed nvidia-docker.
(5) Ran the following command (I didn't use MIG):
sudo docker run -it --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=1 nvcr.io/nvidia/tensorflow:20.12-tf1-py3
I got these logs. I am a beginner with TensorFlow, so I think I made a mistake somewhere. I don't know why TensorFlow cannot detect the GPU,
even though the nvidia-smi command works well.
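One way to narrow down a case like this (nvidia-smi works, TensorFlow sees nothing) is to compare what the driver reports with what TensorFlow reports from inside the same container. The script below is a rough sketch, not an official diagnostic: the helper names `driver_gpus`, `tensorflow_gpus`, and `diagnose` are made up for illustration, and it assumes `nvidia-smi -L` is on the PATH and that the installed TensorFlow provides `tf.config.experimental.list_physical_devices`.

```python
import shutil
import subprocess


def driver_gpus():
    """Return the 'GPU N: ...' lines printed by `nvidia-smi -L` ([] on failure)."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "-L"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return []
    return [line for line in result.stdout.splitlines() if line.startswith("GPU ")]


def tensorflow_gpus():
    """Return the GPU device names TensorFlow sees, or None if it cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [d.name for d in tf.config.experimental.list_physical_devices("GPU")]


def diagnose(driver, framework):
    """Summarize the two views (hypothetical helper, for illustration only)."""
    if framework is None:
        return "tensorflow not importable"
    if driver and not framework:
        return "driver sees GPUs but TensorFlow does not"
    if not driver and not framework:
        return "no GPUs visible"
    return "TensorFlow sees {} GPU(s)".format(len(framework))


if __name__ == "__main__":
    print(diagnose(driver_gpus(), tensorflow_gpus()))
```

If this prints "driver sees GPUs but TensorFlow does not", the container runtime is passing the devices through, and the problem is more likely a mismatch between the driver and the CUDA libraries in the image than a docker or nvidia-docker setup error.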
Describe the expected behavior
TensorFlow detects the GPU properly.
Code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate the problem.
Other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.