Open alexfrolov opened 4 months ago
Hi!
Some more information on that issue:
Base OS: Ubuntu 22.04.4 LTS
Kernel version: 5.15.0-118-generic
Drivers: 555.42.06, 560.28.03
Docker image: nvcr.io/nvidia/cuda:12.5.1-devel-ubuntu22.04
Server: Docker Engine - Community
 Engine:
  Version:          27.1.1
  API version:      1.46 (minimum version 1.24)
  Go version:       go1.21.12
  Git commit:       cc13f95
  Built:            Tue Jul 23 19:57:01 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.19
  GitCommit:        2bf793ef6dc9a18e00cb12efb64355c2c9d5eb41
 runc:
  Version:          1.7.19
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
We're looking to add support for this in a future release.
Hi!
I have been testing `cuda-checkpoint` for checkpoint/restore (CR) inside Docker containers and found that when only a subset of the NVIDIA devices is exposed to the container, `cuda-checkpoint` fails to restore the application, while it works perfectly well when `--gpus all` is specified. It seems that the driver does not support this scenario. Is it possible to add this functionality?

Best, Alex
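For anyone trying to reproduce this, the two container setups being compared can be sketched roughly as below. This is only an illustration, not the exact commands from the report: the image name is the one listed in the environment details, the choice of GPU index `0` as the "subset" is an assumption, and the helper names `docker_run_argv` and `toggle_argv` are hypothetical. The `--toggle --pid` invocation follows the usage shown in the `cuda-checkpoint` README. The script only builds and prints the command lines; it does not run Docker.

```python
import shlex

# Image taken from the environment details in this thread.
IMAGE = "nvcr.io/nvidia/cuda:12.5.1-devel-ubuntu22.04"


def docker_run_argv(gpu_ids=None, image=IMAGE):
    """Build a `docker run` argv. gpu_ids=None means `--gpus all`."""
    if gpu_ids is None:
        gpus = "all"
    else:
        spec = "device=" + ",".join(str(i) for i in gpu_ids)
        # Docker parses the --gpus value as CSV, so a multi-device list
        # has to be wrapped in literal double quotes.
        gpus = f'"{spec}"' if len(gpu_ids) > 1 else spec
    return ["docker", "run", "--rm", "-it", "--gpus", gpus, image, "bash"]


def toggle_argv(pid):
    """Build the cuda-checkpoint call that toggles CUDA state for a PID."""
    return ["cuda-checkpoint", "--toggle", "--pid", str(pid)]


if __name__ == "__main__":
    # Case reported as working: every GPU is visible in the container.
    print(shlex.join(docker_run_argv()))
    # Case reported as failing on restore: only a subset is visible
    # (GPU 0 here is an assumed example).
    print(shlex.join(docker_run_argv([0])))
    # Inside the container, with 1234 standing in for the application's PID.
    print(shlex.join(toggle_argv(1234)))
```

Running the same application under both `docker run` variants and toggling it with `cuda-checkpoint` should show whether the restore failure depends only on the `--gpus` setting.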