Closed: sfxworks closed this issue 1 year ago
I've been getting the same issue on a GTX 1050 Ti.
Xorg fails right on startup and keeps trying to start indefinitely. Looking at its log, we can deduce it's a problem with the NVIDIA configuration.
Just for reference, this works just fine with a VM and runc, but I don't think Steam would use the GPU there since it's missing the drivers...
I installed nvidia-container-runtime (https://aur.archlinux.org/packages/nvidia-container-runtime) and tested that; the same issue occurs.
I just did an update on the old machine. With this, I can no longer start X on the old graphics card either, even with runc.
Hmm, Nvidia had some driver updates https://www.nvidia.com/Download/Find.aspx?lang=en-us
I can confirm that rolling back my driver to 530.41.03 works
Build: [2023-06-27 03:02:36] [master] [4896a5bc36f87548bb058bd9b1ed7dd430256a15]
[ /etc/cont-init.d/10-setup_user.sh: executing... ]
**** Configure default user ****
Setting default user uid=1000(default) gid=1000(default)
usermod: no changes
Adding default user to video, audio, input and pulse groups
Adding default user to any additional required device groups
Adding user 'default' to group: 'user-gid-994' for device: /dev/input/event0
Adding user 'default' to group: 'user-gid-985' for device: /dev/dri/card0
Adding user 'default' to group: 'user-gid-989' for device: /dev/dri/renderD128
Setting umask to 000
Create the user XDG_RUNTIME_DIR path '/tmp/.X11-unix/run'
Adding default home directory template
Setting ownership of all log files in '/home/default/.cache/log'
Setting root password
Setting user password
DONE
[ /etc/cont-init.d/20-configre_sshd.sh: executing... ]
**** Configure SSH server ****
Disable SSH server
DONE
[ /etc/cont-init.d/30-configure_dbus.sh: executing... ]
**** Configure container dbus ****
Container configured to run its own dbus
DONE
[ /etc/cont-init.d/30-configure_udev.sh: executing... ]
**** Configure container to run udev management ****
**** Ensure the default user has permission to r/w on input devices ****
DONE
[ /etc/cont-init.d/40-setup_locale.sh: executing... ]
**** Locales already set correctly to en_US.UTF-8 UTF-8 ****
DONE
[ /etc/cont-init.d/50-configure_pulseaudio.sh: executing... ]
**** Configure pulseaudio ****
Configure pulseaudio to pipe audio to a socket
DONE
[ /etc/cont-init.d/60-configure_gpu_driver.sh: executing... ]
**** Found NVIDIA device 'NVIDIA GeForce GTX 1070' ****
Downloading driver v530.41.03
/tmp/NVIDIA.run 100%[===================>] 328.36M 23.7MB/s in 11s
Installing NVIDIA driver v530.41.03 to match what is running on the host
**** No Intel device found ****
**** No Intel device found ****
**** No AMD device found ****
DONE
[ /etc/cont-init.d/70-configure_xorg.sh: executing... ]
**** Generate NVIDIA xorg.conf ****
Configure Xwrapper.config
Configure container as primary the X server
Enabling evdev input class on pointers, keyboards, touchpads, touch screens, etc.
'/usr/share/X11/xorg.conf.d/10-evdev.conf' -> '/etc/X11/xorg.conf.d/10-evdev.conf'
cat: '/sys/class/drm/card*/status': No such file or directory
No monitors connected. Installing dummy xorg.conf
'/templates/xorg/xorg.dummy.conf' -> '/etc/X11/xorg.conf'
Configuring X11 with GPU ID: 'GPU-e529b6c9-77a5-1417-52e4-2a99a613103c'
Configuring X11 with PCI bus ID: 'PCI:36:0:0'
Writing X11 config with Modeline "1600x900R" 97.50 1600 1648 1680 1760 900 903 908 926 +hsync -vsync
WARNING: No Layout specified, constructing implicit layout section using screen "Default Screen".
WARNING: Unable to find CorePointer in X configuration; attempting to add new CorePointer section.
WARNING: The CorePointer device was not specified explicitly in the layout; using the first mouse device.
WARNING: Unable to find CoreKeyboard in X configuration; attempting to add new CoreKeyboard section.
WARNING: The CoreKeyboard device was not specified explicitly in the layout; using the first keyboard device.
Using X configuration file: "/etc/X11/xorg.conf".
Option "ProbeAllGpus" "False" added to Screen "Default Screen".
Option "AllowEmptyInitialConfiguration" "True" added to Screen "Default Screen".
Backed up file '/etc/X11/xorg.conf' as '/etc/X11/xorg.conf.nvidia-xconfig-original'
Backed up file '/etc/X11/xorg.conf' as '/etc/X11/xorg.conf.backup'
New X configuration file written to '/etc/X11/xorg.conf'
DONE
[ /etc/cont-init.d/80-configure-dind.sh: executing... ]
**** Configure Dockerd ****
Enable Dockerd daemon
Add user 'default' to docker group for sudoless execution
DONE
[ /etc/cont-init.d/90-configure_neko.sh: executing... ]
**** Configure Neko ****
Disable Neko server
DONE
[ /etc/cont-init.d/90-configure_sunshine.sh: executing... ]
**** Configure Sunshine ****
Enable Sunshine server
DONE
[ /etc/cont-init.d/90-configure_vnc.sh: executing... ]
**** Configure VNC ****
Configure VNC service port '32036'
Configure noVNC service port '32037'
Configure audio websocket port '32038'
Configure pulseaudio encoded stream port '32039'
Enable VNC server
Patching noVNC with audio websocket
DONE
[ /etc/cont-init.d/95-configure_secondary.sh: executing... ]
DONE
**** Starting supervisord ****
Logging all root services to '/var/log/supervisor/'
Logging all user services to '/home/default/.cache/log/'
2023-06-27 02:51:08,766 INFO Included extra file "/etc/supervisor.d/dbus.ini" during parsing
2023-06-27 02:51:08,766 INFO Included extra file "/etc/supervisor.d/desktop.ini" during parsing
2023-06-27 02:51:08,766 INFO Included extra file "/etc/supervisor.d/dind.ini" during parsing
2023-06-27 02:51:08,766 INFO Included extra file "/etc/supervisor.d/neko.ini" during parsing
2023-06-27 02:51:08,766 INFO Included extra file "/etc/supervisor.d/pulseaudio.ini" during parsing
2023-06-27 02:51:08,766 INFO Included extra file "/etc/supervisor.d/sshd.ini" during parsing
2023-06-27 02:51:08,766 INFO Included extra file "/etc/supervisor.d/steam.ini" during parsing
2023-06-27 02:51:08,766 INFO Included extra file "/etc/supervisor.d/sunshine.ini" during parsing
2023-06-27 02:51:08,766 INFO Included extra file "/etc/supervisor.d/udev.ini" during parsing
2023-06-27 02:51:08,766 INFO Included extra file "/etc/supervisor.d/vnc-audio.ini" during parsing
2023-06-27 02:51:08,766 INFO Included extra file "/etc/supervisor.d/vnc.ini" during parsing
2023-06-27 02:51:08,766 INFO Included extra file "/etc/supervisor.d/xorg.ini" during parsing
2023-06-27 02:51:08,766 INFO Included extra file "/etc/supervisor.d/xvfb.ini" during parsing
2023-06-27 02:51:08,766 INFO Set uid to user 0 succeeded
2023-06-27 02:51:08,768 INFO RPC interface 'supervisor' initialized
2023-06-27 02:51:08,768 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2023-06-27 02:51:08,769 INFO supervisord started with pid 1
2023-06-27 02:51:09,771 INFO spawned: 'dbus' with pid 376
2023-06-27 02:51:09,772 INFO spawned: 'udev' with pid 377
2023-06-27 02:51:09,772 INFO spawned: 'dind' with pid 378
2023-06-27 02:51:09,773 INFO spawned: 'xorg' with pid 379
2023-06-27 02:51:09,774 INFO spawned: 'audiostream' with pid 380
2023-06-27 02:51:09,775 INFO spawned: 'novnc' with pid 381
2023-06-27 02:51:09,776 INFO spawned: 'pulseaudio' with pid 382
2023-06-27 02:51:09,777 INFO spawned: 'vncproxy' with pid 383
2023-06-27 02:51:09,779 INFO spawned: 'x11vnc' with pid 385
2023-06-27 02:51:09,780 INFO spawned: 'audiowebsock' with pid 387
2023-06-27 02:51:09,781 INFO spawned: 'desktop' with pid 388
2023-06-27 02:51:09,782 INFO spawned: 'sunshine' with pid 391
PULSEAUDIO: Starting pulseaudio service
2023-06-27 02:51:09,782 INFO success: vncproxy entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-06-27 02:51:09,815 INFO reaped unknown pid 423 (exit status 0)
2023-06-27 02:51:10,803 INFO success: dbus entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-06-27 02:51:10,803 INFO success: udev entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-06-27 02:51:10,803 INFO success: dind entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-06-27 02:51:10,803 INFO success: xorg entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-06-27 02:51:10,803 INFO success: audiostream entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-06-27 02:51:10,804 INFO success: novnc entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-06-27 02:51:10,804 INFO success: pulseaudio entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-06-27 02:51:10,804 INFO success: x11vnc entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-06-27 02:51:10,804 INFO success: audiowebsock entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-06-27 02:51:10,804 INFO success: desktop entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-06-27 02:51:10,804 INFO success: sunshine entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-06-27 02:51:12,208 INFO reaped unknown pid 942 (exit status 1)
I can confirm that downgrading the Nvidia driver from v535.54.03 to v530.41.03 did the trick for me too with a GTX 1050 Ti.
I feel like it is still just a workaround for some underlying issue.
Downgrading worked for me as well. I had to restrict the NVIDIA driver packages on Arch to prevent upgrades.
I'm still experiencing Xorg failing on startup when using driver v530.41.03 on a GTX 1080 Ti. Xorg and x11vnc just get caught in a constant loop of starting and exiting with error code 1.
Searched everywhere for why this wasn't working! Downgrading to 530.41.03 worked with my GeForce 1660 Ti too.
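For anyone who wants to script the check before starting the container, here is a minimal sketch of comparing the installed driver version against the last version reported as working in this thread. The function names and the `LAST_KNOWN_GOOD` constant are made up for illustration; this is not part of the image's init scripts.

```python
def parse_driver_version(version: str) -> tuple[int, ...]:
    """Turn a string like 'v530.41.03' into (530, 41, 3) for numeric comparison."""
    cleaned = version.strip().lstrip("v")
    return tuple(int(part) for part in cleaned.split("."))


# Assumption: 530.41.03 is the last driver reported as working in this thread.
LAST_KNOWN_GOOD = "530.41.03"


def is_newer_than_known_good(installed: str, known_good: str = LAST_KNOWN_GOOD) -> bool:
    """Return True if the installed driver is newer than the known-good one."""
    return parse_driver_version(installed) > parse_driver_version(known_good)
```

The installed version string could come from something like `nvidia-smi --query-gpu=driver_version --format=csv,noheader` on the host. Comparing tuples of integers avoids the trap of comparing version strings alphabetically.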
Edit: aaaaand, it's stopped working again.
It was super slow, so I restarted the container and now I'm getting the loop of starting and exiting again! So close!
Edit 2: A couple of reboots got it going again, but it's very flaky. I can't set my download location to the array in the Steam settings without it crashing out. Colours also look weirdly washed out and greyish (not that I care about colour in the VNC; it's just something I've noticed, in case it means anything to anyone). So far, although it's working, it's not usable because I can't change the download location, so it would download games into the container and blow up my Docker image.
This should now be fixed in the latest build. Re-open this issue if it is not.
Does work for me. Thanks!
Still failing to start Xorg. Seeing in the Xorg logs under '/home/default/.cache/log' that it's failing because "(EE) no screens found(EE)". Since this is headless, what screen would it be looking for?
(II) NVIDIA(0): NVIDIA GPU NVIDIA GeForce GTX 1080 Ti (GP102-A) at PCI:7:0:0
(II) NVIDIA(0):     (GPU-0)
(--) NVIDIA(0): Memory: 11534336 kBytes
(--) NVIDIA(0): VideoBIOS: 86.02.40.00.1a
(II) NVIDIA(0): Detected PCI Express Link width: 16X
(EE) NVIDIA(GPU-0): Failed to acquire modesetting permission.
(EE) NVIDIA(0): Failing initialization of X screen
(II) Unloading glxserver_nvidia
(EE) Screen(s) found, but none have a usable configuration.
(EE) Fatal server error:
(EE) no screens found
(EE)
(EE) Please consult the The X.Org Foundation support at http://wiki.x.org for help.
(EE) Please also check the log file at "/var/log/Xorg.55.log" for additional information.
(EE)
(EE) Server terminated with error (1). Closing log file.
I've tested this on Unraid v6.11.5 with NVIDIA Driver v535.54.03, Intel 11th Gen and an RTX 2060 with no monitor or dummy plug connected, and I'm getting fine results. I wonder if the older GTX cards work a different way.
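If it helps anyone else digging through these logs, here is a rough sketch of pulling only the `(EE)` messages out of an Xorg log so the actual failure is easier to spot. `xorg_errors` is a hypothetical helper name; the only thing it assumes about the log format is the `(EE)` marker seen above, optionally preceded by a timestamp.

```python
def xorg_errors(log_text: str) -> list[str]:
    """Return the message text of every (EE) line in an Xorg log."""
    errors = []
    for line in log_text.splitlines():
        line = line.strip()
        # The marker may follow a timestamp such as "[   12.345] ".
        idx = line.find("(EE)")
        if idx == -1:
            continue
        message = line[idx + len("(EE)"):].strip()
        # Xorg sometimes glues a trailing "(EE)" onto the message itself.
        message = message.removesuffix("(EE)").strip()
        if message:  # skip bare "(EE)" separator lines
            errors.append(message)
    return errors
```

In the log above, this would surface "Failed to acquire modesetting permission." ahead of the generic "no screens found", which is the more useful line to search for.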
@cravev in your .env file try adding the following and see if it fixes it for you:
DISPLAY_VIDEO_PORT='dp-0'
I may just be dumb. I had an Xorg server running on the base Arch install for a GUI desktop, and I believe that was preventing the Xorg inside the Docker container from starting. I have it working now, albeit without a desktop on the host, but I was barely using that anyway. Thanks @HarmonyTechLabs !!
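A quick way to look for another X server on the host is to list the X sockets. The sketch below assumes the conventional `/tmp/.X11-unix/X<N>` socket naming; `active_x_displays` is a made-up helper name, and a leftover socket file can be stale, so treat a hit as a hint rather than proof that a server is running.

```python
import re
from pathlib import Path


def active_x_displays(socket_dir: str = "/tmp/.X11-unix") -> list[int]:
    """Return the display numbers that have an X<N> entry in socket_dir."""
    directory = Path(socket_dir)
    if not directory.is_dir():
        return []
    displays = []
    for entry in directory.iterdir():
        # Only names like X0 or X55 count; other entries (e.g. 'run') are ignored.
        match = re.fullmatch(r"X(\d+)", entry.name)
        if match:
            displays.append(int(match.group(1)))
    return sorted(displays)
```

If this returns a display number on the host before the container has started, something else already owns an X server there, which matches what I ran into.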
Describe the Bug
I've switched Steam Headless from some old NVIDIA 1070s to newer NVIDIA A2000s, and I'm getting this error now. I've tried plugging in displays as a test. Honestly, one cable may be bad, so I'm waiting on dummy dongles. I thought it could run headless, though?
Steps to Reproduce
Expected Behavior
X starts
Screenshots
No response
Relevant Settings
No response
Version
Build: [2023-06-24 03:10:18] [master] [6a2d55196b5253fef520ca883391d783716b4d06]
Platform
docker 6.1.35-1-lts 530 k8s with nvidia container toolkit
Relevant log output