Steam-Headless / docker-steam-headless

A Headless Steam Docker image supporting NVIDIA GPU and accessible via Web UI
GNU General Public License v2.0

[Bug]: udev (exit status 1; not expected) #111

Open bpvarsity opened 7 months ago

bpvarsity commented 7 months ago

Describe the Bug

I tried the latest Debian image and one from two months ago. I am using the template Docker Compose file and .env; the only change was setting the runtime to nvidia.

I can view the web UI login page, but I get a "failed to connect to server" error. The container seems to be stuck in a loop with:

2023-11-29 04:25:52,784 INFO success: x11vnc entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-11-29 04:25:52,784 INFO success: desktop entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-11-29 04:25:52,784 INFO reaped unknown pid 962 (exit status 0)
2023-11-29 04:25:54,547 WARN exited: xorg (exit status 11; not expected)
2023-11-29 04:25:54,548 INFO spawned: 'xorg' with pid 989
2023-11-29 04:25:55,550 INFO success: xorg entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

Steps to Reproduce

No response

Expected Behavior

No response

Screenshots

No response

Relevant Settings

No response

Version

Build: [2023-09-29 21:48:12] [0.2.0] [f4cad0aab007ca30d399dd44b79a173872dd3f07] [debian]

Platform

Debian GNU/Linux - 12 (bookworm) 6.1.0-13-amd64 unknown unknown GNU/Linux | NVIDIA-SMI 525.125.06 Driver Version: 525.125.06 CUDA Version: 12.0 | Docker version 20.10.24+dfsg1, build 297e128

Relevant log output

Installing NVIDIA driver v525.125.06 to match what is running on the host
Leaving NVIDIA driver stock without patching
DONE

[ /etc/cont-init.d/70-configure_desktop.sh: executing... ]
**** Configure Desktop ****
Enable Desktop service.
Ensure home directory template is owned by the default user.
Installing default home directory template
DONE

[ /etc/cont-init.d/70-configure_xorg.sh: executing... ]
**** Generate NVIDIA xorg.conf ****
Configure Xwrapper.config
Configure container as primary the X server
Enabling evdev input class on pointers, keyboards, touchpads, touch screens, etc.
'/usr/share/X11/xorg.conf.d/10-evdev.conf' -> '/etc/X11/xorg.conf.d/10-evdev.conf'
Configuring X11 with GPU ID: 'GPU-16b8a046-286c-2333-ba53-a579c0bede7b'
Configuring X11 with PCI bus ID: 'PCI:45:0:0'
Writing X11 config with Modeline "1600x900R"  201.00  1600 1648 1680 1760  900 903 908 953 +hsync -vsync

WARNING: Unable to locate/open X configuration file.

Option "ProbeAllGpus" "False" added to Screen "Screen0".
Option "BaseMosaic" "False" added to Screen "Screen0".
Option "AllowEmptyInitialConfiguration" "True" added to Screen "Screen0".
New X configuration file written to '/etc/X11/xorg.conf'

DONE

[ /etc/cont-init.d/80-configure_flatpak.sh: executing... ]
**** Configure Flatpak ****
Flatpak configured for running inside a Docker container
DONE

[ /etc/cont-init.d/90-configure_neko.sh: executing... ]
**** Configure Neko ****
Disable Neko server
DONE

[ /etc/cont-init.d/90-configure_steam.sh: executing... ]
**** Configure Steam ****
Enable Steam auto-start script
DONE

[ /etc/cont-init.d/90-configure_sunshine.sh: executing... ]
**** Configure Sunshine ****
Disable Sunshine server
DONE

[ /etc/cont-init.d/90-configure_vnc.sh: executing... ]
**** Configure VNC ****
Configure VNC service port '32036'
Configure pulseaudio encoded stream port '32037'
Enable VNC server
DONE

[ /etc/cont-init.d/95-setup_wol.sh: executing... ]
**** Configure WoL Manager ****
Disable WoL Manager service.

**** Starting supervisord ****
Logging all root services to '/var/log/supervisor/'
Logging all user services to '/home/default/.cache/log/'

2023-11-29 04:21:54,687 INFO Included extra file "/etc/supervisor.d/dbus.ini" during parsing
2023-11-29 04:21:54,687 INFO Included extra file "/etc/supervisor.d/desktop.ini" during parsing
2023-11-29 04:21:54,687 INFO Included extra file "/etc/supervisor.d/neko.ini" during parsing
2023-11-29 04:21:54,687 INFO Included extra file "/etc/supervisor.d/pulseaudio.ini" during parsing
2023-11-29 04:21:54,687 INFO Included extra file "/etc/supervisor.d/steam.ini" during parsing
2023-11-29 04:21:54,687 INFO Included extra file "/etc/supervisor.d/sunshine.ini" during parsing
2023-11-29 04:21:54,687 INFO Included extra file "/etc/supervisor.d/udev.ini" during parsing
2023-11-29 04:21:54,687 INFO Included extra file "/etc/supervisor.d/vnc-audio.ini" during parsing
2023-11-29 04:21:54,687 INFO Included extra file "/etc/supervisor.d/vnc.ini" during parsing
2023-11-29 04:21:54,687 INFO Included extra file "/etc/supervisor.d/wol-power-manager.ini" during parsing
2023-11-29 04:21:54,687 INFO Included extra file "/etc/supervisor.d/xorg.ini" during parsing
2023-11-29 04:21:54,687 INFO Included extra file "/etc/supervisor.d/xvfb.ini" during parsing
2023-11-29 04:21:54,687 INFO Set uid to user 0 succeeded
2023-11-29 04:21:54,689 INFO RPC interface 'supervisor' initialized
2023-11-29 04:21:54,689 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2023-11-29 04:21:54,689 INFO supervisord started with pid 1
2023-11-29 04:21:55,692 INFO spawned: 'dbus' with pid 335
2023-11-29 04:21:55,692 INFO spawned: 'udev' with pid 336
2023-11-29 04:21:55,693 INFO spawned: 'xorg' with pid 337
2023-11-29 04:21:55,694 INFO spawned: 'audiostream' with pid 338
2023-11-29 04:21:55,694 INFO spawned: 'frontend' with pid 339
2023-11-29 04:21:55,695 INFO spawned: 'pulseaudio' with pid 341
2023-11-29 04:21:55,696 INFO spawned: 'x11vnc' with pid 343
2023-11-29 04:21:55,696 INFO spawned: 'desktop' with pid 345
PULSEAUDIO: Starting pulseaudio service
2023-11-29 04:21:55,704 INFO reaped unknown pid 363 (exit status 0)
2023-11-29 04:21:55,741 WARN exited: udev (exit status 1; not expected)
2023-11-29 04:21:56,715 INFO success: dbus entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-11-29 04:21:56,715 INFO success: xorg entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-11-29 04:21:56,715 INFO success: audiostream entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-11-29 04:21:56,715 INFO success: frontend entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-11-29 04:21:56,715 INFO success: pulseaudio entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-11-29 04:21:56,715 INFO success: x11vnc entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-11-29 04:21:56,715 INFO success: desktop entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-11-29 04:21:56,776 INFO spawned: 'udev' with pid 394
2023-11-29 04:21:56,822 WARN exited: udev (exit status 1; not expected)
2023-11-29 04:25:51,777 WARN exited: desktop (exit status 11; not expected)
2023-11-29 04:25:51,777 INFO spawned: 'x11vnc' with pid 953
2023-11-29 04:25:51,778 INFO spawned: 'desktop' with pid 954
2023-11-29 04:25:52,784 INFO success: x11vnc entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-11-29 04:25:52,784 INFO success: desktop entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-11-29 04:25:52,784 INFO reaped unknown pid 962 (exit status 0)
2023-11-29 04:25:54,547 WARN exited: xorg (exit status 11; not expected)
2023-11-29 04:25:54,548 INFO spawned: 'xorg' with pid 989
2023-11-29 04:25:55,550 INFO success: xorg entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
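
To see at a glance which services supervisord keeps restarting, a log like the one above can be summarised with a short script. This is only a minimal sketch and not part of the project: the regular expression and the function name `unexpected_exits` are hypothetical, and it assumes log lines in exactly the format shown above.

```python
import re
from collections import Counter

# Matches supervisord lines such as:
#   2023-11-29 04:21:55,741 WARN exited: udev (exit status 1; not expected)
# The pattern is an assumption based on the log excerpt in this issue.
EXITED = re.compile(r"WARN exited: (\S+) \(exit status (\d+); not expected\)")


def unexpected_exits(log_text):
    """Count unexpected exits per (service, exit status) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EXITED.search(line)
        if match:
            service, status = match.group(1), int(match.group(2))
            counts[(service, status)] += 1
    return counts


# Lines copied from the log output in this issue.
SAMPLE = """\
2023-11-29 04:21:55,741 WARN exited: udev (exit status 1; not expected)
2023-11-29 04:21:56,776 INFO spawned: 'udev' with pid 394
2023-11-29 04:21:56,822 WARN exited: udev (exit status 1; not expected)
2023-11-29 04:25:51,777 WARN exited: desktop (exit status 11; not expected)
2023-11-29 04:25:54,547 WARN exited: xorg (exit status 11; not expected)
"""

if __name__ == "__main__":
    for (service, status), n in sorted(unexpected_exits(SAMPLE).items()):
        print(f"{service}: exit status {status} x{n}")
```

In this excerpt udev exits twice with status 1 right after startup, while desktop and xorg each exit once with status 11 a few minutes later.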

alansari commented 7 months ago

How are you running the container? My first guess is that it should be run with host networking. But if you are using the template without any changes, that should already be set, so my next guess is that you are using a VPN, Tailscale, Traefik, or some other reverse proxy setup. If you need help with this, jump on Discord!

apclapp commented 5 months ago

@bpvarsity -- In case you were still having this issue, I found your post after seeing an identical pattern of error messages.

My environment is slightly different, running under Docker on Unraid.

I found that in my situation, I was using an older set of launch parameters. Reviewing and updating these solved my issue. (For Unraid users, these are found by clicking 'edit' on a Docker container, toggling the 'Advanced' slide-toggle in the top-right of the UI, and then editing the textbox labeled "Extra Parameters".)

My old parameters, which resulted in errors, were as follows (reformatted for ease of reading):

--hostname='SteamHeadless' 
--restart=unless-stopped 
--runtime=nvidia 
--add-host='SteamHeadless:127.0.0.1' 
--shm-size=2G 
--ipc="host" 
--ulimit nofile=1024:524288
-v '/tmp/.X11-unix':'/tmp/.X11-unix':'rw' 
-v '/tmp/tmp/pulse':'/tmp/tmp/pulse':'rw' 
-v '/dev/input':'/dev/input':'ro' 

The updated launch parameters, which got things working correctly, were as follows:

--hostname='SteamHeadless' 
--restart=unless-stopped 
--runtime=nvidia 
--add-host='SteamHeadless:127.0.0.1' 
--shm-size='2G' 
--ulimit='nofile=1024:524288' 
--device='/dev/fuse' 
--device='/dev/uinput' 
--device-cgroup-rule='c 13:* rmw' 
--cap-add='NET_ADMIN' 
--cap-add='SYS_ADMIN' 
--cap-add='SYS_NICE' 
--security-opt='seccomp=unconfined'
-v '/tmp/.X11-unix/':'/tmp/.X11-unix/':'rw' 
-v '/tmp/tmp/pulse/':'/tmp/tmp/pulse/':'rw' 
-v '/dev/input/':'/dev/input/':'ro' 
-v '/run/udev/data/':'/run/udev/data/':'ro' 

Hopefully this is helpful for your situation as well. Also, note that I'm using the --runtime=nvidia parameter because I am in an NVIDIA environment. Of course, you can drop or alter this as it applies to you.
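
The two parameter lists above differ in more than quoting, so it can help to compare them mechanically. The following is a rough sketch rather than anything Unraid or Docker provides: the `normalise` helper is hypothetical, and it assumes every flag carries a value (written either as `--flag=value` or `--flag value`) and that trailing slashes on `-v` paths are not significant.

```python
import shlex

# Old and new "Extra Parameters", copied from the lists above.
OLD = """
--hostname='SteamHeadless' --restart=unless-stopped --runtime=nvidia
--add-host='SteamHeadless:127.0.0.1' --shm-size=2G --ipc="host"
--ulimit nofile=1024:524288
-v '/tmp/.X11-unix':'/tmp/.X11-unix':'rw'
-v '/tmp/tmp/pulse':'/tmp/tmp/pulse':'rw'
-v '/dev/input':'/dev/input':'ro'
"""

NEW = """
--hostname='SteamHeadless' --restart=unless-stopped --runtime=nvidia
--add-host='SteamHeadless:127.0.0.1' --shm-size='2G'
--ulimit='nofile=1024:524288'
--device='/dev/fuse' --device='/dev/uinput'
--device-cgroup-rule='c 13:* rmw'
--cap-add='NET_ADMIN' --cap-add='SYS_ADMIN' --cap-add='SYS_NICE'
--security-opt='seccomp=unconfined'
-v '/tmp/.X11-unix/':'/tmp/.X11-unix/':'rw'
-v '/tmp/tmp/pulse/':'/tmp/tmp/pulse/':'rw'
-v '/dev/input/':'/dev/input/':'ro'
-v '/run/udev/data/':'/run/udev/data/':'ro'
"""


def normalise(params):
    """Return a set of 'flag value' strings with quoting differences removed."""
    tokens = shlex.split(params)  # strips shell quotes, keeps quoted spaces
    flags = set()
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if "=" in token:
            # "--flag=value" form; split only on the first "=".
            name, value = token.split("=", 1)
        else:
            # "--flag value" form; the value is the next token.
            name, value = token, tokens[i + 1]
            i += 1
        if name == "-v":
            # Ignore trailing slashes on bind-mount paths.
            value = ":".join(part.rstrip("/") for part in value.split(":"))
        flags.add(f"{name} {value}")
        i += 1
    return flags


if __name__ == "__main__":
    old, new = normalise(OLD), normalise(NEW)
    print("removed:", sorted(old - new))
    print("added:", sorted(new - old))
```

With the lists as posted, the only flag removed is `--ipc host`; the additions are the two `--device` entries, the `--device-cgroup-rule`, the three `--cap-add` capabilities, `--security-opt seccomp=unconfined`, and the `/run/udev/data` bind mount.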

trgkyle commented 4 months ago

Thanks, I had the same problem, and it seems to have been fixed by following your config, but now I get another error:

Build: [2024-02-10 02:35:31] [master] [6cc9f56155f3c7f9fc6bc9c22ef2cbf555029c00] [debian]

[ /etc/cont-init.d/10-setup_user.sh: executing... ] Configure default user

[ /etc/cont-init.d/11-setup_sysctl_values.sh: executing... ] Configure some system kernel parameters

[ /etc/cont-init.d/30-configure_dbus.sh: executing... ] Configure container dbus

[ /etc/cont-init.d/30-configure_udev.sh: executing... ] Configure udevd

[ /etc/cont-init.d/40-setup_locale.sh: executing... ] Configure local

[ /etc/cont-init.d/50-configure_pulseaudio.sh: executing... ] Configure pulseaudio

[ /etc/cont-init.d/60-configure_gpu_driver.sh: executing... ] Found Intel device 'Intel(R) Core(TM) i9-14900K'

[ /etc/cont-init.d/70-configure_desktop.sh: executing... ] Configure Desktop

[ /etc/cont-init.d/70-configure_xorg.sh: executing... ] Generate NVIDIA xorg.conf

WARNING: Unable to locate/open X configuration file.

Option "ProbeAllGpus" "False" added to Screen "Screen0".
Option "BaseMosaic" "False" added to Screen "Screen0".
Option "AllowEmptyInitialConfiguration" "True" added to Screen "Screen0".
New X configuration file written to '/etc/X11/xorg.conf'

DONE

[ /etc/cont-init.d/80-configure_flatpak.sh: executing... ] Configure Flatpak
mount: /proc: cannot mount none read-only.
dmesg(1) may have more information after failed mount system call.

alansari commented 4 months ago

You might want to try again after removing these lines:

-v '/dev/input/':'/dev/input/':'ro'
-v '/run/udev/data/':'/run/udev/data/':'ro'

I believe you also don't need:

-v '/tmp/.X11-unix/':'/tmp/.X11-unix/':'rw'
-v '/tmp/tmp/pulse/':'/tmp/tmp/pulse/':'rw'

unless you are sharing the X11 or pulse socket with other containers.
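
For anyone editing a long parameter string by hand, the removal suggested above can be sketched in a few lines. This is an illustrative helper only: the name `drop_bind_mounts` is hypothetical, and it assumes bind mounts are written as `-v host:container:mode`.

```python
import shlex


def drop_bind_mounts(params, host_paths):
    """Return the parameter tokens without '-v' mounts for the given host paths."""
    unwanted = {path.rstrip("/") for path in host_paths}
    tokens = shlex.split(params)  # strips shell quotes
    kept = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "-v" and i + 1 < len(tokens):
            # The host path is the part before the first colon.
            host = tokens[i + 1].split(":")[0].rstrip("/")
            if host in unwanted:
                i += 2  # skip both "-v" and its value
                continue
        kept.append(tokens[i])
        i += 1
    return kept


if __name__ == "__main__":
    params = (
        "--runtime=nvidia "
        "-v '/dev/input/':'/dev/input/':'ro' "
        "-v '/run/udev/data/':'/run/udev/data/':'ro' "
        "-v '/tmp/.X11-unix/':'/tmp/.X11-unix/':'rw'"
    )
    print(drop_bind_mounts(params, ["/dev/input", "/run/udev/data"]))
```

The function returns a list of tokens, so the remaining flags can be reviewed before pasting them back into the container settings.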

Though my best guess is that you're running on an Ubuntu/Debian host and you need: security_opt: