NVIDIA / nvidia-docker

Build and run Docker containers leveraging NVIDIA GPUs
Apache License 2.0

Create nvidia driver volume separately #112

Closed ArtanisCV closed 7 years ago

ArtanisCV commented 8 years ago

Currently, when nvidia-docker run is executed but the nvidia driver volume does not exist, nvidia-docker will pass the argument --volume-driver=nvidia-docker to the internal docker run (see func volumesArgs in https://github.com/NVIDIA/nvidia-docker/blob/master/tools/src/nvidia-docker/local.go). As a result, we cannot specify any other volume at the same time.

For example, if we execute nvidia-docker run -v my_volume:/home my_image and the driver volume nvidia_driver_xxx has not been created, nvidia-docker expands the command to:

docker run --volume-driver=nvidia-docker -v nvidia_driver_xxx:/usr/local/nvidia:ro -v my_volume:/home my_image

Because of the added --volume-driver, the user-specified volume my_volume:/home fails with a "bad volume format" error.

Would it therefore be better to create the nvidia driver volume separately, i.e., expand nvidia-docker run to docker volume create + docker run if the driver volume doesn't exist?
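To make the proposal concrete, here is a minimal Python sketch of the two-step expansion. It is an illustration only, not how nvidia-docker (which is written in Go) is implemented: the helper name expand_run and its parameters are hypothetical, and the docker volume create -d nvidia-docker --name form is the one quoted in this thread, so it may need adjusting for other Docker versions.

```python
from typing import List

VOLUME_DRIVER = "nvidia-docker"    # volume plugin name used in this thread
MOUNT_POINT = "/usr/local/nvidia"  # mount target used in this thread


def expand_run(user_args: List[str], driver_volume: str,
               volume_exists: bool) -> List[List[str]]:
    """Hypothetical helper: return the docker commands to execute, in order."""
    commands: List[List[str]] = []
    if not volume_exists:
        # Create the driver volume as its own step, so that docker run
        # does not need a global --volume-driver flag.
        commands.append(["docker", "volume", "create",
                         "-d", VOLUME_DRIVER, "--name", driver_volume])
    # The driver volume is mounted read-only; the user's arguments follow untouched.
    commands.append(["docker", "run",
                     "-v", f"{driver_volume}:{MOUNT_POINT}:ro"] + list(user_args))
    return commands


if __name__ == "__main__":
    args = ["-v", "my_volume:/home", "my_image"]
    for cmd in expand_run(args, "nvidia_driver_xxx", volume_exists=False):
        print(" ".join(cmd))
```

Because the volume is created beforehand, the docker run line carries no --volume-driver, so a user-specified volume such as my_volume:/home is resolved normally instead of being routed to the nvidia-docker plugin.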

3XX0 commented 8 years ago

Yes, this is an issue with named volumes. There have been several discussions on the Docker issue tracker (e.g., https://github.com/docker/docker/issues/16069), and they recommend creating the volume manually first.

I'm not sure we want to hide the volume creation behind the run command. On the other hand, having users run docker volume create -d nvidia-docker --name nvidia_driver_XXX.XX is not ideal either.

loretoparisi commented 7 years ago

@3XX0 I do not get the point here. I have the volume nvidia_driver_384.66 already created by a previous docker-compose command, so now if I try

docker run -it --volume-driver=efs -v fs-xxxxxxx:/root --device=/dev/nvidiactl --device=/dev/nvidia-uvm --device=/dev/nvidia0 -v nvidia_driver_384.66:/usr/local/nvidia:ro --name darknet_gpu loretoparisi/darknet_gpu:1.0.1 bash
docker: Error response from daemon: create nvidia_driver_384.66: driver 'nvidia-docker' already has volume 'nvidia_driver_384.66': volume name must be unique.

I get the error. Of course I cannot remove that volume since it's in use:

nvidia-docker volume rm nvidia_driver_384.66
Error response from daemon: Unable to remove volume, volume still in use: remove nvidia_driver_384.66: volume is in use

So how do I mount the nvidia driver volume in the new container?

3XX0 commented 7 years ago

@loretoparisi I think it's because your volume has been registered by the nvidia-docker driver. IIRC, if you create it manually with the command above, docker takes a slightly different path, which should work.

3XX0 commented 7 years ago

Closing, since there are no more volumes in 2.0/master.