Deepomatic / dmake

DMake is a tool to manage micro-service based applications
MIT License

Compatibility issue with latest nvidia-docker #442

Open vdel opened 4 years ago

vdel commented 4 years ago

Using dmake with the latest nvidia-docker produces this error, for example when entering a dmake shell:

## Deploying ##
- Running shell @ thoth/dev
docker: Error response from daemon: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v1.linux/moby/28333ed8868ae80d6d436ef55f0e0d9387426ccf99dcfc08c701d9826ccc0286/log.json: no such file or directory): exec: "nvidia-container-runtime": executable file not found in $PATH: : unknown.

This is likely because the latest nvidia-docker no longer expects --runtime=nvidia; --gpus all should be used instead.

Changing this line to replace --runtime=nvidia by --gpus all should fix the issue: https://github.com/Deepomatic/dmake/blob/master/dmake/utils/dmake_run_docker#L52

To reproduce this while developing a fix, run dmake shell dev in Thoth with the latest nvidia-docker installed.

thomas-riccardi commented 4 years ago

For now, let's keep using nvidia-docker2 rather than only its nvidia-container-toolkit dependency, which is the setup now documented in https://github.com/NVIDIA/nvidia-docker#ubuntu-16041804-debian-jessiestretchbuster

thomas-riccardi commented 4 years ago

> Changing this line to replace --runtime=nvidia by --gpus all should fix the issue: https://github.com/Deepomatic/dmake/blob/master/dmake/utils/dmake_run_docker#L52

This would break the DMAKE_GPU feature, but it can still be supported with the new --gpus syntax.
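
As a rough sketch of how DMAKE_GPU might map onto the new syntax: the helper below is hypothetical (gpu_docker_args is not an existing dmake function), and it assumes DMAKE_GPU, when set, holds a device list such as 0 or 0,1. The actual semantics of DMAKE_GPU in dmake_run_docker may differ.

```shell
# Hypothetical helper, not part of dmake: prints the GPU arguments for `docker run`.
# Assumption: DMAKE_GPU, when set, is a device list such as "0" or "0,1".
gpu_docker_args() {
  if [ -n "${DMAKE_GPU:-}" ]; then
    # The literal inner double quotes keep a comma-separated device list
    # as a single value when docker parses the --gpus option.
    printf '%s %s\n' '--gpus' "\"device=${DMAKE_GPU}\""
  else
    # No specific device requested: expose all GPUs.
    printf '%s %s\n' '--gpus' 'all'
  fi
}
```

It could then be used unquoted so the output splits into two arguments, for instance docker run $(gpu_docker_args) --rm ubuntu nvidia-smi, which mirrors the documented form docker run --gpus '"device=0,1"'.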

Also, for reference, the new docker run --gpus syntax is supported only since docker 19.03.0.
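
If both syntaxes had to be supported during a transition, one possible approach is to pick the flag from the Docker version. This is only a sketch: supports_gpus_flag is a hypothetical name, and the comparison relies on sort -V from GNU coreutils being available.

```shell
# Hypothetical helper, not part of dmake: succeeds when the given Docker
# version is at least 19.03.0, the first release with `docker run --gpus`.
# Assumption: GNU `sort -V` (version sort) is available.
supports_gpus_flag() {
  # After a version sort, the smaller of the two versions comes first;
  # if that is 19.03.0, the given version is equal or newer.
  [ "$(printf '%s\n' '19.03.0' "$1" | sort -V | head -n 1)" = '19.03.0' ]
}
```

The version string could come from docker version --format '{{.Server.Version}}', with the script falling back to --runtime=nvidia when the check fails.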