Steam-Headless / docker-steam-headless

A Headless Steam Docker image supporting NVIDIA GPU and accessible via Web UI
GNU General Public License v2.0

[Bug]: Error response from daemon: unknown or invalid runtime name: nvidia #117

Open maplepy opened 6 months ago

maplepy commented 6 months ago

Describe the Bug

Error response from daemon: unknown or invalid runtime name: nvidia

Steps to Reproduce

  1. Run the container with the nvidia runtime
  2. Get the error above

Expected Behavior

The container starts successfully

Screenshots

No response

Relevant Settings

Same config as #116, but with DOCKER_RUNTIME set to nvidia

Version

not displayed but probably Build: [2023-12-09 02:38:11] [master] [6cc9f56] [debian]

Platform

Arch Linux 6.6.6-arch1-1
NVIDIA-SMI 545.29.06, Driver Version: 545.29.06, CUDA Version: 12.3
Docker version 24.0.7, build afdd53b4e3
Docker Compose version 2.23.3

Relevant log output

sudo docker-compose up --force-recreate
[+] Running 0/0
 ⠋ Container steam-headless-steam-headless-1  Recreate                       0.0s 
Error response from daemon: unknown or invalid runtime name: nvidia
maplepy commented 6 months ago

sudo docker info | grep Runtime

 Runtimes: io.containerd.runc.v2 nvidia runc
 Default Runtime: nvidia

Previously it was:

 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc

Note that changing this resulted in

sudo docker-compose up --force-recreate
[+] Running 1/1
 ✔ Container steam-headless-steam-headless-1  Recreated                      0.1s 
Attaching to steam-headless-1
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/moby/773e337cdf64f3b64802433e5a4657f0ec8104ee9937a47873bf04eb6d7efa8b/log.json: no such file or directory): fork/exec /usr/bin/nvidia-container-runtime: no such file or directory: unknown
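
The last error suggests that Docker now lists an nvidia runtime, but the binary it points to (/usr/bin/nvidia-container-runtime) does not exist on the host. A minimal sketch for checking both conditions, assuming a POSIX shell; has_runtime is a hypothetical helper name, not part of Docker:

```shell
#!/bin/sh
# has_runtime NAME: reads `docker info` text on stdin and succeeds only if
# NAME appears as a whole word on the "Runtimes:" line.
# (The "Default Runtime:" line is ignored because it lacks the plural "s".)
has_runtime() {
    grep 'Runtimes:' | tr ' ' '\n' | grep -qxF -- "$1"
}

# Usage on a real host (requires Docker, so left as comments here):
#   sudo docker info | has_runtime nvidia && echo "nvidia runtime registered"
#   command -v nvidia-container-runtime || echo "runtime binary not found"
```

If the runtime is listed but the binary is not found, the daemon configuration and the installed packages are probably out of sync.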
p5-f20w18k commented 6 months ago

Not helpful I know - but I also had the same issue a few hours ago when trying to get this running, same OS, same config

dexd85 commented 6 months ago

Hi,

I had a similar issue from the beginning, and for me an important hint was missing: you first have to install and configure the NVIDIA Container Toolkit on your host. It does not seem to be installed on your machine. Please follow the instructions at this link: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
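
The steps in the linked guide can be sketched roughly as follows. This assumes Arch Linux (the reporter's platform) and that the toolkit package is named nvidia-container-toolkit there; run and setup_nvidia_runtime are hypothetical helper names. Commands are only printed unless DRY_RUN=0 is set, so treat this as an outline rather than a definitive procedure:

```shell
#!/bin/sh
# run CMD...: print the command in dry-run mode (the default),
# execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_nvidia_runtime() {
    # Assumption: Arch Linux package name; other distros use their own
    # package manager and repository setup (see the install guide).
    run sudo pacman -S --needed nvidia-container-toolkit
    # Registers the "nvidia" runtime in Docker's daemon configuration.
    run sudo nvidia-ctk runtime configure --runtime=docker
    # Docker only picks up newly configured runtimes after a restart.
    run sudo systemctl restart docker
}

# Preview:  setup_nvidia_runtime
# Apply:    DRY_RUN=0 setup_nvidia_runtime
```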

p5-f20w18k commented 6 months ago

I do have it installed. I tested it with docker run --gpus all nvidia/cuda:12.1.1-runtime-ubuntu22.04 nvidia-smi:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3070        Off | 00000000:0E:00.0 Off |                  N/A |
|  0%   24C    P8              12W / 240W |     13MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

Edit: my error ends with 'runtime did not terminate successfully'

I think this is a docker/nvidia related issue instead of an issue with this container.
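
One possible explanation for the --gpus all test passing while the compose setup fails: as far as I understand, --gpus goes through Docker's device-request path, whereas runtime: nvidia needs a runtime named nvidia registered with the daemon. A small sketch to compare the two paths with the same image; smoke_test_cmd is a hypothetical helper, and setting NVIDIA_VISIBLE_DEVICES=all is an assumption about how the runtime path is usually exercised:

```shell
#!/bin/sh
# Image taken from the test in the comment above.
IMAGE="nvidia/cuda:12.1.1-runtime-ubuntu22.04"

# smoke_test_cmd MODE: print the docker command for one of two GPU paths.
#   gpus    - the --gpus flag, as already tested above
#   runtime - the named "nvidia" runtime, which is what the compose file uses
smoke_test_cmd() {
    case "$1" in
        gpus)
            echo "docker run --rm --gpus all $IMAGE nvidia-smi" ;;
        runtime)
            echo "docker run --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all $IMAGE nvidia-smi" ;;
        *)
            echo "usage: smoke_test_cmd gpus|runtime" >&2
            return 2 ;;
    esac
}

# On a Docker host, run both and compare the results:
#   sh -c "$(smoke_test_cmd gpus)"
#   sh -c "$(smoke_test_cmd runtime)"
```

If only the second command fails, the problem is likely in the runtime registration rather than in the driver or in this container.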

p5-f20w18k commented 6 months ago

See [here](https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/issues/17) for runtime errors; the Arch Wiki was also helpful.

My compose file is below. It may not be fully correct yet, and I don't know if it's working, as I've only just run docker-compose up. I'm just documenting my workflow here.

`---
services:
  steam-headless:
    image: josh5/steam-headless:latest
    restart: unless-stopped
    runtime: ${DOCKER_RUNTIME}
    shm_size: ${SHM_SIZE}
    ipc: host # Could also be set to 'shareable'
    ulimits:
      nofile:
        soft: 1024
        hard: 524288
    cap_add: