ffteen opened this issue 5 years ago
Hello @ffteen thank you for the report
any progress on this issue?
I think one hacky way, though not very reliable, is to use the low-level API and overwrite the host configuration. Since I only tried to follow the Docker CLI code in Go, I'm not sure how reliable or portable this solution is. It works on my machine, and I thought it might help someone until official support is implemented.
The following code is a modification of the original `DockerClient.containers.create()` function that adds a `DeviceRequest` to the host configuration and otherwise works exactly like the original function:
```python
import docker
from docker.models.images import Image
from docker.models.containers import _create_container_args

def create_with_device_request(client, image, command, device_request=None, **kwargs):
    if isinstance(image, Image):
        image = image.id
    kwargs['image'] = image
    kwargs['command'] = command
    kwargs['version'] = client.containers.client.api._version
    create_kwargs = _create_container_args(kwargs)
    # modification to the original create function
    if device_request is not None:
        create_kwargs['host_config']['DeviceRequests'] = [device_request]
    # end modification
    resp = client.api.create_container(**create_kwargs)
    return client.containers.get(resp['Id'])
```
```python
# Example usage
device_request = {
    'Driver': 'nvidia',
    'Capabilities': [['gpu'], ['nvidia'], ['compute'], ['compat32'], ['graphics'], ['utility'], ['video'], ['display']],  # not sure which capabilities are really needed
    'Count': -1,  # enable all gpus
}

container = create_with_device_request(docker.from_env(), 'nvidia/cuda:9.0-base', 'nvidia-smi', device_request, ...)
```
I think the CLI client sets the `NVIDIA_VISIBLE_DEVICES` environment variable, so it's probably a good idea to do the same by passing `environment={'NVIDIA_VISIBLE_DEVICES': 'all'}` as a parameter of the `create_with_device_request()` call.
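To make the pairing concrete, here is a minimal sketch of the arguments one might pass to `create_with_device_request()` (defined above). It is an assumption on my part that the trimmed capability list `[['gpu']]` is sufficient; the full list from the example above is the safer choice.

```python
# Hedged sketch: request all GPUs and set the matching environment variable.
# Assumes [['gpu']] alone is a sufficient capability set.
device_request = {
    'Driver': 'nvidia',
    'Capabilities': [['gpu']],
    'Count': -1,  # -1 means "all GPUs"
}
environment = {'NVIDIA_VISIBLE_DEVICES': 'all'}

# On a host with a Docker daemon and NVIDIA driver available, the call would be:
# container = create_with_device_request(
#     docker.from_env(), 'nvidia/cuda:9.0-base', 'nvidia-smi',
#     device_request, environment=environment)
```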
This enables all available GPUs. You could modify this with different device requests:
```python
# enable two gpus
device_request = {
    'Driver': 'nvidia',
    'Capabilities': ...,
    'Count': 2,  # enable two gpus
}

# enable gpus with id or uuid
device_request = {
    'Driver': 'nvidia',
    'Capabilities': ...,
    'DeviceIDs': ['0', 'GPU-abcedfgh-1234-a1b2-3c4d-a7f3ovs13da1'],  # enable gpus with id 0 and uuid
}
```
The environment parameter should then look like `{'NVIDIA_VISIBLE_DEVICES': '0,1'}` or `{'NVIDIA_VISIBLE_DEVICES': '0,GPU-xxx'}`, respectively. I'm not sure which capabilities are really needed either!
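Since the environment variable has to mirror the device list, one small sketch (my own convention, not from the thread) is to derive `NVIDIA_VISIBLE_DEVICES` from the `DeviceIDs` list so the two cannot drift apart:

```python
# Hedged sketch: derive NVIDIA_VISIBLE_DEVICES from the DeviceIDs list.
# 'GPU-xxx' is a placeholder UUID, and the trimmed capability list
# [['gpu']] is an assumption.
device_ids = ['0', 'GPU-xxx']

device_request = {
    'Driver': 'nvidia',
    'Capabilities': [['gpu']],
    'DeviceIDs': device_ids,
}
environment = {'NVIDIA_VISIBLE_DEVICES': ','.join(device_ids)}
```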
Does create_service support the device request param? I use the nvidia runtime instead.
As far as I can tell, `services.create()` does not support device requests. Setting `runtime='nvidia'` is definitely the better approach, if possible.
The problem I had was that I use the nvidia-container-toolkit, which does not require installing the nvidia runtime, so setting the nvidia runtime leads to `Error: unknown runtime specified nvidia`, while using `--gpus=all` works as expected.
Is there a better way to use NVIDIA GPUs with the nvidia-container-toolkit?
I have a change (that appears to work) that allows the "gpus" option in my fork. I'd like to create a PR for it, but when running the tests, this error (which is unrelated to the change) occurs:
```
tests/integration/api_service_test.py:379:53: F821 undefined name 'BUSYBOX'
Makefile:92: recipe for target 'flake8' failed
```
Is there a package that needs to be installed to fix this?
@hnine999 No, that's an error on our end - we'll fix it shortly. Feel free to submit your PR in the meantime!
The PR from @hnine999 is #2419
Hi - Any update with this feature?
Any update on this? It is badly needed. docker-py is functionally broken for running GPU enabled containers.
+1
This is actually a major feature for the whole data science community that runs TensorFlow in Docker on NVIDIA GPUs in the cloud. Why has it been ignored for such a long time? 😞
Any update on this?
Still waiting for this to be supported... The only workaround for now is "docker run" with bash :(
At the moment, nvidia-container-toolkit still includes nvidia-container-runtime. So, you can still add nvidia-container-runtime as a runtime in `/etc/docker/daemon.json`:
```json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```
Then restart the docker service (`sudo systemctl restart docker`) and use `runtime="nvidia"` in docker-py as before.
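For illustration, a minimal sketch of what the docker-py side could look like once the runtime is registered (the image, command, and environment values are taken from earlier in this thread; the rest is an assumption):

```python
# Hedged sketch: with the "nvidia" runtime registered in daemon.json as above,
# docker-py can select it per container via the runtime parameter.
run_kwargs = {
    'image': 'nvidia/cuda:9.0-base',
    'command': 'nvidia-smi',
    'runtime': 'nvidia',                               # runtime name from daemon.json
    'environment': {'NVIDIA_VISIBLE_DEVICES': 'all'},  # expose every GPU
    'remove': True,
}

# On a host with the runtime and NVIDIA driver installed:
# import docker
# print(docker.from_env().containers.run(**run_kwargs).decode())
```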
Thanks a bunch - that works, BUT the daemon.json is missing a double quote in runtimes: `{ "runtimes": { "nvidia": { "path": "nvidia-container-runtime", "runtimeArgs": [] } } }`
Is there a solid fix for this issue?
Thanks - updated my comment with that suggestion
Hi @jmsmkn I installed nvidia-container-toolkit in Arch, but it does not come with nvidia-container-runtime. Any update with this? Thanks.
```console
$ cd /usr/bin
$ ls | grep nvidia
nvidia-bug-report.sh
nvidia-container-cli
nvidia-container-runtime-hook
nvidia-container-toolkit
nvidia-cuda-mps-control
nvidia-cuda-mps-server
nvidia-debugdump
nvidia-modprobe
nvidia-persistenced
nvidia-settings
nvidia-sleep.sh
nvidia-smi
nvidia-xconfig
```
@vwxyzjn I think this will help.
Simple "gpus=" keyword parameter, please !
This feature is badly needed by lots of people working with data on GPUs for AI and HPC. Please add it as soon as you can; we'll be very grateful.
Is this issue on some agenda? (This is your second most upvoted open issue at the moment.)
Hi all, I made a Python client for Docker that sits on top of the Docker client binary (the one written in Go). It took me several months of work. It notably has support for GPUs in `docker.run(...)` and `docker.container.create(...)`, with all the options that the CLI has.
It's currently only available for my sponsors, but it'll be open source with an MIT licence on May 1st, 2021 🙂
Hi all, in the end, making Python-on-whales pay-to-use wasn't a success. So I've open-sourced it.
It's free and on PyPI now. Have fun 😃
```console
$ pip install python-on-whales
$ python
>>> from python_on_whales import docker
>>> print(docker.run("nvidia/cuda:11.0-base", ["nvidia-smi"], gpus="all"))
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
looks good!
https://github.com/gabrieldemarmiesse/python-on-whales
In the end, I have just written a very simple wrapper around `subprocess.run`, with a built argument list that can include the required GPU parameter; it captures stdout, stderr, the return code, and the execution duration.
Incidentally, I have found that the AWS ML AMI works well with Docker/NVIDIA, with no further tricky configuration required. All I would say is: fire up an instance using the AMI, do the required apt update/upgrades, then freeze *that* as your AMI to use; it avoids a 5-minute delay! For my purposes, a root volume of 200 GB works fine, as opposed to the vast default root volumes you get with the g3/g4 instances (maybe required if you are going to hibernate). But I am going a bit off-topic!
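For reference, a minimal sketch of such a wrapper. All names here are my own invention, not the author's actual code, and it assumes the `docker` CLI is on PATH:

```python
import subprocess
import time

def build_args(image, command, gpus=None):
    """Build the `docker run` argument list; gpus maps to --gpus (e.g. 'all')."""
    args = ['docker', 'run', '--rm']
    if gpus is not None:
        args += ['--gpus', gpus]
    return args + [image] + list(command)

def run_container(image, command, gpus=None):
    """Run the container, capturing stdout, stderr, exit code, and duration."""
    start = time.monotonic()
    proc = subprocess.run(build_args(image, command, gpus),
                          capture_output=True, text=True)
    return {
        'stdout': proc.stdout,
        'stderr': proc.stderr,
        'returncode': proc.returncode,
        'duration': time.monotonic() - start,
    }
```

Usage would then look like `run_container('nvidia/cuda:9.0-base', ['nvidia-smi'], gpus='all')`.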
Hello team, is this a feature that you are thinking of adding? It would be of great value
@JoanFM I guess this functionality has already been implemented:
```python
client.containers.run(
    'nvidia/cuda:9.0-base',
    'nvidia-smi',
    device_requests=[
        docker.types.DeviceRequest(count=-1, capabilities=[['gpu']])
    ]
)
```
Not very elegant, but it works
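A hedged variant of that snippet, in case you want specific GPUs rather than all of them: `docker.types.DeviceRequest` also accepts `device_ids`, and the Engine rejects requests that set both `Count` and `DeviceIDs`, so `count` is omitted here. The id `'0'` picking the first GPU is the usual convention; UUIDs should work too.

```python
# Hedged sketch: request specific GPUs via device_ids instead of count=-1.
# count and device_ids are mutually exclusive in the Engine API.
request_kwargs = {
    'driver': 'nvidia',
    'device_ids': ['0'],         # first GPU; a GPU UUID also works here
    'capabilities': [['gpu']],
}

# With a Docker daemon and the NVIDIA toolkit installed:
# client.containers.run('nvidia/cuda:9.0-base', 'nvidia-smi',
#     device_requests=[docker.types.DeviceRequest(**request_kwargs)])
```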
@matyushinleonid Thanks heaps! it worked
This works! This is the only solution that actually works, thanks so much! :)
no luck with device_requests.
```python
import docker
import os

os.environ['DOCKER_HOST'] = "unix:///run/user/1000/podman/podman.sock"
# os.environ['DOCKER_HOST'] = "unix:///run/podman/podman.sock"

client = docker.from_env()
logs = client.containers.run('nvidia/cuda:12.2.0-devel-ubuntu20.04',
                             "nvidia-smi",
                             device_requests=[docker.types.DeviceRequest(count=-1, capabilities=[['gpu']])])
```
```
Traceback (most recent call last):
  File "/home/user/podman_gpu.py", line 7, in <module>
    logs = client.containers.run('nvidia/cuda:12.2.0-devel-ubuntu20.04',
  File "/usr/local/lib/python3.9/site-packages/docker/models/containers.py", line 887, in run
    raise ContainerError(
docker.errors.ContainerError: Command 'nvidia-smi' in image 'nvidia/cuda:12.2.0-devel-ubuntu20.04' returned non-zero exit status 127: b'/opt/nvidia/nvidia_entrypoint.sh: line 67: exec: nvidia-smi: not found\n'
```
podman version: 4.4.1, host OS: RHEL 9.2, docker-py version: 6.1.3
The podman CLI is able to access the GPU:
```console
[user@rh91-bay7 ~]$ podman run --rm --device nvidia.com/gpu=all nvidia/cuda:12.2.0-devel-ubuntu20.04 nvidia-smi -L

==========
== CUDA ==
==========

CUDA Version 12.2.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: Tesla T4 (UUID: GPU-f7c1d1ba-7a85-537a-65ae-462ce7d7eca8)
[user@rh91-bay7 ~]$
```
Nice, it really works!
Docker version: 19.03. I want to set `--gpus all` when creating a container, but found that docker-py does not support this param.