ValveSoftware / SteamVR-for-Linux

Issue tracker for the Linux port of SteamVR
918 stars 45 forks

[BUG] SteamVR error causes my entire Desktop Environment and GPU to crash (libcef.so error 4 & VRMonitor cannot find libraries) #489

Open Incuh opened 2 years ago

Incuh commented 2 years ago

For a couple of weeks now, I have been experiencing an issue with SteamVR. Whenever I launch SteamVR, my entire screen goes black, I get a no signal error on my monitor for a few seconds, then my screens go back to black with a blinking cursor until I finally get booted back to my display manager's login screen, as if the session had reset.

This happens with every version of SteamVR I try. It happens with both the latest new feature branch and the latest stable branch of the NVIDIA driver. I am using ALVR; the issue is consistent across every version of ALVR I've tried and, from what I can find from other users, is most likely not an ALVR issue.

I've gone as far as re-installing Arch Linux as a troubleshooting measure. The issue still persists.

After a crash, I took a look at journalctl and found some errors. vrmonitor is the first application to crash. It gives this error:

SteamVR/bin/linux64/vrmonitor: error while loading shared libraries: libQt5Multimedia.so.5: cannot open shared object file: No such file or directory

This is strange, as that file seems to exist here: ~/.local/share/Steam/steamapps/common/SteamVR/bin/linux64/qt/lib/libQt5Multimedia.so.5
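
One way to narrow down a "cannot open shared object file" error when the file exists on disk is to check whether its directory is on the search path the process was started with. The following is a minimal sketch, not part of SteamVR: `find_library_in_path` is a hypothetical helper, and it only checks a colon-separated list of directories in the style of LD_LIBRARY_PATH. It does not reproduce the dynamic loader's full lookup (rpath, ld.so.cache and so on).

```python
import os
from pathlib import Path


def find_library_in_path(lib_name, search_path):
    """Return the directories in a colon-separated search path that contain lib_name."""
    hits = []
    for entry in search_path.split(":"):
        if not entry:
            continue  # skip empty entries
        candidate = Path(entry).expanduser() / lib_name
        if candidate.exists():
            hits.append(str(candidate.parent))
    return hits


if __name__ == "__main__":
    lib = "libQt5Multimedia.so.5"
    # Directory taken from the report above.
    qt_dir = "~/.local/share/Steam/steamapps/common/SteamVR/bin/linux64/qt/lib"
    # LD_LIBRARY_PATH of the current shell, which may differ from the one vrmonitor sees.
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print("found via LD_LIBRARY_PATH:", find_library_in_path(lib, current))
    print("found in bundled qt/lib:", find_library_in_path(lib, qt_dir))
```

If the second list is non-empty and the first is empty, the file exists but its directory is not on the search path of the shell the script runs in, which would be consistent with the error above.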

The next application to fully crash is VR Web Helper. Its stack trace is here:

Stack trace of thread 5291:
                                               #0  0x00007f93a82e1694 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/bin/vrwebhelper/linux64/libcef.so + 0x31f6694)
                                               #1  0x00007f93a82a2c34 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/bin/vrwebhelper/linux64/libcef.so + 0x31b7c34)
                                               #2  0x00007f93a9586541 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/bin/vrwebhelper/linux64/libcef.so + 0x449b541)
                                               #3  0x00007f93a9586647 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/bin/vrwebhelper/linux64/libcef.so + 0x449b647)
                                               #4  0x00007f93a94127b0 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/bin/vrwebhelper/linux64/libcef.so + 0x43277b0)
                                               #5  0x00007f93a9412513 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/bin/vrwebhelper/linux64/libcef.so + 0x4327513)
                                               #6  0x00007f93a74a0450 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/bin/vrwebhelper/linux64/libcef.so + 0x23b5450)
                                               #7  0x0000000000524702 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/bin/vrwebhelper/linux64/vrwebhelper + 0x124702)
                                               ELF object binary architecture: AMD x86-64
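
Since none of the frames above resolve to symbols, about the only thing that can be read off the trace is which module each frame belongs to. A rough sketch for tallying that from pasted journalctl text follows. The regular expression is fitted to the frame lines quoted in this report, and `frames_by_module` is a hypothetical helper name, not an existing tool.

```python
import re
from collections import Counter

# Matches frames such as:
#   #0  0x00007f93a82e1694 n/a (/path/to/libcef.so + 0x31f6694)
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+0x(?P<addr>[0-9a-fA-F]+)\s+(?P<symbol>\S+)\s+"
    r"\((?P<module>.+?) \+ 0x(?P<offset>[0-9a-fA-F]+)\)"
)


def frames_by_module(trace_text):
    """Count stack frames per module basename in a journalctl-style trace."""
    counts = Counter()
    for match in FRAME_RE.finditer(trace_text):
        # Keep only the file name, e.g. "libcef.so" instead of the full path.
        module = match.group("module").rsplit("/", 1)[-1]
        counts[module] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: journalctl -b | python this_script.py
    for module, count in frames_by_module(sys.stdin.read()).most_common():
        print(f"{count:3d}  {module}")
```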

At one point, I noticed an error along the lines of "libcef.so error 4", but I can't find it anymore.

Steamtours dumps its core as the next entry in journalctl after vrwebhelper:

#0  0x00007f9c12aa053c n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/tools/steamvr_environments/game/bin/linuxsteamrt64/libtier0.so + 0x4553c)
                                               #1  0x00007f9c12aa1074 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/tools/steamvr_environments/game/bin/linuxsteamrt64/libtier0.so + 0x46074)
                                               #2  0x00007f9c12aa13c0 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/tools/steamvr_environments/game/bin/linuxsteamrt64/libtier0.so + 0x463c0)
                                               #3  0x00007f9c121f0c8c n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/tools/steamvr_environments/game/bin/linuxsteamrt64/libengine2.so + 0x58ac8c)
                                               #4  0x00007f9c121f13e5 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/tools/steamvr_environments/game/bin/linuxsteamrt64/libengine2.so + 0x58b3e5)
                                               #5  0x00007f9c11f8d25b n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/tools/steamvr_environments/game/bin/linuxsteamrt64/libengine2.so + 0x32725b)
                                               #6  0x00007f9c11f8fda6 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/tools/steamvr_environments/game/bin/linuxsteamrt64/libengine2.so + 0x329da6)
                                               #7  0x00007f9c11f901a7 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/tools/steamvr_environments/game/bin/linuxsteamrt64/libengine2.so + 0x32a1a7)
                                               #8  0x00007f9c11ec6deb n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/tools/steamvr_environments/game/bin/linuxsteamrt64/libengine2.so + 0x260deb)
                                               #9  0x00007f9c11ec7313 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/tools/steamvr_environments/game/bin/linuxsteamrt64/libengine2.so + 0x261313)
                                               #10 0x0000563829a03080 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/tools/steamvr_environments/game/bin/linuxsteamrt64/steamtours + 0x3080)
                                               #11 0x0000563829a02b0a n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/tools/steamvr_environments/game/bin/linuxsteamrt64/steamtours + 0x2b0a)
                                               #12 0x00007f9c128b4b25 __libc_start_main (libc.so.6 + 0x27b25)
                                               #13 0x0000563829a02d39 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/tools/steamvr_environments/game/bin/linuxsteamrt64/steamtours + 0x2d39)
                                               ELF object binary architecture: AMD x86-64

This error comes right after: ....../game/steamtours.sh: line 94: 5301 Segmentation fault (core dumped) ${STEAM_RUNTIME_PREFIX} ${GAME_DEBUGGER} "${GAMEROOT}"/${GAMEEXE} "$@"

The next errors are copied directly from journalctl, in order. I no longer have access to these dumps, sorry.

Dec 27 17:13:21 UwUnix steam-native.desktop[5366]: crash_20211227171320_2.dmp[5366]: Finished uploading minidump (out-of-process): success = yes
Dec 27 17:13:21 UwUnix steam-native.desktop[5366]: crash_20211227171320_2.dmp[5366]: response: CrashID=bp-80ad5e3f-30d1-49f2-b93c-2d1182211227
Dec 27 17:13:21 UwUnix steam-native.desktop[5366]: crash_20211227171320_2.dmp[5366]: file ''/tmp/dumps/crash_20211227171320_2.dmp'', upload yes: ''CrashID=bp-80ad5e3f-30d1-49f2-b93c-2d1182211227''
Dec 27 17:13:21 UwUnix crash_20211227171320_2.dmp[5366]: Finished uploading minidump (out-of-process): success = yes
Dec 27 17:13:21 UwUnix crash_20211227171320_2.dmp[5366]: response: CrashID=bp-80ad5e3f-30d1-49f2-b93c-2d1182211227
Dec 27 17:13:21 UwUnix crash_20211227171320_2.dmp[5366]: file ''/tmp/dumps/crash_20211227171320_2.dmp'', upload yes: ''CrashID=bp-80ad5e3f-30d1-49f2-b93c-2d1182211227''
Dec 27 17:13:24 UwUnix steam-native.desktop[5397]: assert_20211227171324_4.dmp[5397]: Uploading dump (out-of-process)
Dec 27 17:13:24 UwUnix steam-native.desktop[5397]: /tmp/dumps/assert_20211227171324_4.dmp
Dec 27 17:13:24 UwUnix assert_20211227171324_4.dmp[5397]: Uploading dump (out-of-process)
                                                          /tmp/dumps/assert_20211227171324_4.dmp
Dec 27 17:13:24 UwUnix steam-native.desktop[4353]: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Dec 27 17:13:24 UwUnix discord.desktop[1260]: X connection to :0 broken (explicit kill or server shutdown).
Dec 27 17:13:24 UwUnix at-spi-bus-launcher[905]: X connection to :0 broken (explicit kill or server shutdown).
Dec 27 17:13:24 UwUnix systemd[697]: org.gnome.SettingsDaemon.Wacom.service: Main process exited, code=exited, status=1/FAILURE
Dec 27 17:13:24 UwUnix su[3788]: pam_unix(su:session): session closed for user root
Dec 27 17:13:24 UwUnix systemd[697]: org.gnome.SettingsDaemon.MediaKeys.service: Main process exited, code=exited, status=1/FAILURE
Dec 27 17:13:24 UwUnix systemd[697]: vte-spawn-815b308d-15ad-4344-bcb5-6acf91548332.scope: Consumed 1.915s CPU time.
Dec 27 17:13:24 UwUnix steam-native.desktop[5024]: X connection to :0 broken (explicit kill or server shutdown).
Dec 27 17:13:24 UwUnix steam-native.desktop[5024]: terminate called without an active exception
Dec 27 17:13:24 UwUnix systemd[697]: org.gnome.SettingsDaemon.Wacom.service: Failed with result 'exit-code'.
Dec 27 17:13:24 UwUnix systemd[697]: org.gnome.SettingsDaemon.Keyboard.service: Main process exited, code=exited, status=1/FAILURE
Dec 27 17:13:24 UwUnix systemd[697]: gnome-terminal-server.service: Main process exited, code=exited, status=1/FAILURE
Dec 27 17:13:24 UwUnix systemd[697]: gnome-terminal-server.service: Failed with result 'exit-code'.
Dec 27 17:13:24 UwUnix systemd[697]: org.gnome.SettingsDaemon.MediaKeys.service: Failed with result 'exit-code'.
Dec 27 17:13:24 UwUnix systemd[697]: org.gnome.SettingsDaemon.Keyboard.service: Failed with result 'exit-code'.
Dec 27 17:13:24 UwUnix systemd[697]: org.gnome.SettingsDaemon.XSettings.service: Main process exited, code=exited, status=1/FAILURE
Dec 27 17:13:24 UwUnix systemd[697]: org.gnome.SettingsDaemon.Color.service: Main process exited, code=exited, status=1/FAILURE
Dec 27 17:13:24 UwUnix systemd[697]: org.gnome.SettingsDaemon.Color.service: Failed with result 'exit-code'.
Dec 27 17:13:24 UwUnix systemd[697]: org.gnome.SettingsDaemon.XSettings.service: Failed with result 'exit-code'.
Dec 27 17:13:24 UwUnix steam-native.desktop[5024]: 17:13:24.440910268 [INFO] Sending shutdown signal to vrserver.
Dec 27 17:13:24 UwUnix steam-native.desktop[5024]: Mon Dec 27 2021 17:13:24.440993 - alvr_server: Sending shutdown signal to vrserver.
Dec 27 17:13:24 UwUnix systemd[697]: org.gnome.SettingsDaemon.Power.service: Main process exited, code=exited, status=1/FAILURE
Dec 27 17:13:24 UwUnix systemd[697]: org.gnome.SettingsDaemon.Power.service: Failed with result 'exit-code'.
Dec 27 17:13:24 UwUnix discord.desktop[1260]: Parent failed to complete crash dump.
Dec 27 17:13:24 UwUnix steam-native.desktop[5239]: X connection to :0 broken (explicit kill or server shutdown).
Dec 27 17:13:24 UwUnix steam-native.desktop[5239]: X connection to :0 broken (explicit kill or server shutdown).
Dec 27 17:13:24 UwUnix steam-native.desktop[5239]: free(): corrupted unsorted chunks
Dec 27 17:13:24 UwUnix steam-native.desktop[5415]: crash_20211227171324_5.dmp[5415]: Uploading dump (out-of-process)
Dec 27 17:13:24 UwUnix steam-native.desktop[5415]: /tmp/dumps/crash_20211227171324_5.dmp
Dec 27 17:13:24 UwUnix crash_20211227171324_5.dmp[5415]: Uploading dump (out-of-process)
                                                         /tmp/dumps/crash_20211227171324_5.dmp

After this, the main thread proceeds to crash

#0  0x00007fa780dedd22 raise (libc.so.6 + 0x3cd22)
                                               #1  0x00007fa780dd7862 abort (libc.so.6 + 0x26862)
                                               #2  0x00007fa73b6051c2 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/drivers/gamepad/bin/linux64/driver_gamepad.so + 0x171c2)
                                               #3  0x00007fa780df04a7 __run_exit_handlers (libc.so.6 + 0x3f4a7)
                                               #4  0x00007fa780df064e exit (libc.so.6 + 0x3f64e)
                                               #5  0x00007fa77d3428f2 _XDefaultIOError (libX11.so.6 + 0x438f2)
                                               #6  0x00007fa77d342bff _XIOError (libX11.so.6 + 0x43bff)
                                               #7  0x00007fa77d3402a7 _XEventsQueued (libX11.so.6 + 0x412a7)
                                               #8  0x00007fa77d321beb XFlush (libX11.so.6 + 0x22beb)
                                               #9  0x00007fa781452252 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/bin/linux64/libSDL2-2.0.so.0 + 0xda252)
                                               ELF object binary architecture: AMD x86-64

Finally, vrcompositor crashes:

Stack trace of thread 5281:
                                               #0  0x00007fc4e62a3d22 raise (libc.so.6 + 0x3cd22)
                                               #1  0x00007fc4e628d862 abort (libc.so.6 + 0x26862)
                                               #2  0x0000555f3f152233 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/bin/linux64/vrcompositor.real + 0x53233)
                                               #3  0x0000555f3f230495 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/bin/linux64/vrcompositor.real + 0x131495)
                                               #4  0x0000555f3f45b1e0 n/a (/home/Incuh/.local/share/Steam/steamapps/common/SteamVR/bin/linux64/vrcompositor.real + 0x35c1e0)
                                               ELF object binary architecture: AMD x86-64

This was all I was able to dig up

How to reproduce:
1. Install Arch Linux on a machine with an NVIDIA 1660 Ti.
2. Install ALVR.
3. Install Steam and SteamVR.
4. Start ALVR (which will cause SteamVR to launch).
5. Watch as ALVR starts fine, but then a mystery box appears and everything goes black and crashes.

Expected behavior: Normally, the ALVR window opens fine with no artifacts on the desktop.

System Information:

Something to note: I am under Xorg. However, as a troubleshooting measure, I tried it under GNOME Wayland. Although this didn't make it work, it did stop the entire desktop environment from crashing.

vulkaninfo and vkcube are fully functional. A vulkaninfo output is attached below.

I am using Steam native, but this issue still happens on Steam Runtime.

maxconcern.txt

Edit: The artifacts on the desktops were unrelated

kiosion commented 2 years ago

Just started experiencing this same issue today. I'm also running Arch with Xorg, however I'm not using a DE or greeter so I simply get thrown back to the TTY login whenever I try to launch SteamVR. I'm really out of ideas as to what could be causing it as I started experiencing this issue randomly, not after updating anything. After looking through my logs it does seem to be the same issue - vrmonitor appears to crash first, also complaining about a "missing" qt library that I do have installed.

Supreeeme commented 2 years ago

Did it work in the past for you? Might be irrelevant to your issue but I noticed (perhaps recently) SteamVR would take out Xorg for me if I didn't have my DisplayPort (connected to my link box, HTC Vive) properly plugged in.

kiosion commented 2 years ago

It worked perfectly in the past, never ran into this issue before. I did double-check my physical connections although again, this didn't start happening after a hardware/software upgrade. I found starting SteamVR with the HMD physically disconnected, then plugging it in and clicking 'Restart Headset' has it launch correctly - What this actually does and why SteamVR crashes Xorg if this isn't done, I have no clue.

Incuh commented 2 years ago

Hello, I have made some progress.

First of all, after switching to the beta, I don't get library loading issues. Yay.

Xorg still crashes

I could only find 2 errors in journalctl that looked interesting

Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-0: disconnected
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-0: Internal DisplayPort
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-0: 2660.0 MHz maximum pixel clock
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0):
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): Acer H203H (DFP-1): connected
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): Acer H203H (DFP-1): Internal TMDS
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): Acer H203H (DFP-1): 165.0 MHz maximum pixel clock
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0):
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (EE) event7  - Logitech M510: client bug: event processing lagging behind by 22ms, your system is too slow
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-2: disconnected
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-2: Internal DisplayPort
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-2: 2660.0 MHz maximum pixel clock
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0):
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-3: disconnected
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-3: Internal TMDS
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-3: 165.0 MHz maximum pixel clock
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0):
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): Dell SE2717H/HX (DFP-4): connected
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): Dell SE2717H/HX (DFP-4): Internal TMDS
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): Dell SE2717H/HX (DFP-4): 600.0 MHz maximum pixel clock
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0):
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-5: disconnected
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-5: Internal DisplayPort
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-5: 2660.0 MHz maximum pixel clock
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0):
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-6: disconnected
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-6: Internal TMDS
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0): DFP-6: 165.0 MHz maximum pixel clock
Jan 12 02:11:26 archlinux /usr/lib/gdm-x-session[1060]: (--) NVIDIA(GPU-0):

followed later by

Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: (EE)
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: (EE) Backtrace:
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: (EE) 0: /usr/lib/Xorg (xorg_backtrace+0x89) [0x55f8864b8049]
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: (EE) 1: /usr/lib/Xorg (0x55f886368000+0x15ae69) [0x55f8864c2e69]
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: (EE) 2: /usr/lib/libpthread.so.0 (0x7fca745fc000+0x13870) [0x7fca7460f870]
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: (EE)
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: (EE) Segmentation fault at address 0x0
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: (EE)
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: Fatal server error:
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: (EE) Caught signal 11 (Segmentation fault). Server aborting
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: (EE)
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: (EE)
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: Please consult the The X.Org Foundation support
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]:          at http://wiki.x.org
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]:  for help.
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: (EE) Please also check the log file at "/var/log/Xorg.1.log" for additional information.
Jan 12 02:11:28 archlinux /usr/lib/gdm-x-session[1060]: (EE)

I'm not even sure anymore

okawo80085 commented 2 years ago

Can you check if vrcompositor crashed with a SIGSEGV prior to your DE's crash? On the rare occasions when vrcompositor crashes with a SIGSEGV, it takes my DE session with it.

Bitwolfies commented 2 years ago

Same here, on an Arch-based distro, no idea what's causing this. I have no idea what ALVR is, so it's not the issue. I can't even make it boot with the headset unplugged.

Bitwolfies commented 2 years ago

Attempted moving between Steam's runtime and native, no change. Kinda upsetting that one major issue gets fixed (in theory) just to be smacked down with this one.

kiosion commented 2 years ago

Interesting, I don't get the Xorg segmentation fault backtrace that Incuh posted above when Xorg crashes. If I remember correctly, mine was "Fatal IO error 6: No such device or address on X server ':0'", followed by what looked like a normal log of X quitting. My journalctl looks nearly identical in terms of display connects/disconnects, though. I'll look through and post my logs here when I have the chance; hopefully there's something useful in them.

jorams commented 2 years ago

I ran into this after updating to the newest beta. Reverting to the latest release did not fix the issue, but unplugging and re-plugging the headset (fully, including power) before starting SteamVR did. I can now run the newest beta without crashes.

The crashes showed the following warnings in journalctl, followed by vrmonitor and every other process under X crashing and dumping core:

kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=773, Graphics Exception on GPC 0: 3D HEIGHT CT Violation. Coordinates: (0xf6, 0x590)
kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=773, Graphics Exception: ESR 0x500420=0x80000020 0x500434=0x59000f6 0x500438=0x1100 0x50043c=0x0
kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=773, Graphics Exception on GPC 1: 3D HEIGHT CT Violation. Coordinates: (0xd4, 0x570)
kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=773, Graphics Exception: ESR 0x508420=0x80000020 0x508434=0x57000d4 0x508438=0x1100 0x50843c=0x0
kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=773, Graphics Exception on GPC 2: 3D HEIGHT CT Violation. Coordinates: (0xf2, 0x58c)
kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=773, Graphics Exception: ESR 0x510420=0x80000020 0x510434=0x58c00f2 0x510438=0x1100 0x51043c=0x0
kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=773, Graphics Exception on GPC 3: 3D HEIGHT CT Violation. Coordinates: (0xf0, 0x578)
kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=773, Graphics Exception: ESR 0x518420=0x80000020 0x518434=0x57800f0 0x518438=0x1100 0x51843c=0x0
kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=1671, Graphics Exception: ChID 0036, Class 0000b197, Offset 000034a8, Data 80000000
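
For anyone comparing logs across machines, the Xid number and pid can be pulled out of lines like these with a small script. This is only a sketch: the pattern is fitted to the kernel lines quoted here, other driver versions may format them differently, and `parse_xid_lines` is a hypothetical helper name.

```python
import re

# Fitted to lines of the form:
#   kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=773, Graphics Exception ...
XID_RE = re.compile(
    r"NVRM: Xid \((?P<device>PCI:[0-9a-fA-F:.]+)\): "
    r"(?P<xid>\d+), pid=(?P<pid>\d+), (?P<message>.*)"
)


def parse_xid_lines(log_text):
    """Extract device, Xid number, pid and message from NVRM Xid kernel log lines."""
    events = []
    for line in log_text.splitlines():
        match = XID_RE.search(line)
        if match:
            events.append(
                {
                    "device": match.group("device"),
                    "xid": int(match.group("xid")),
                    "pid": int(match.group("pid")),
                    "message": match.group("message").strip(),
                }
            )
    return events


if __name__ == "__main__":
    import sys

    # Usage: journalctl -k | python this_script.py
    for event in parse_xid_lines(sys.stdin.read()):
        print(event["device"], "Xid", event["xid"], "pid", event["pid"])
```

Grouping the result by pid shows which process each exception was attributed to (two different pids appear in the lines above).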

Bitwolfies commented 2 years ago

Replying to https://github.com/ValveSoftware/SteamVR-for-Linux/issues/489#issuecomment-1018952762

Yep, that fixed it alright, thanks!

Now if only SteamVR for Linux wasn't a trashpile in general we'd get somewhere

kiosion commented 2 years ago

unplugging and re-plugging the headset (fully, including power)

Unfortunately this is only a temporary fix for me; as soon as I restart Steam or SteamVR the issue returns... I'd love to know what the root of this issue is, as even the old 1.14 SteamVR beta has this issue, which it didn't use to...

Edit: Dug a bit deeper into my logs last night, and checked out my journalctl. I can confirm that vrmonitor is the first to crash for me, followed by vrwebhelper, vrdashboard, and then Xorg and every process under it. In the past, I had quite a few issues with vrdashboard/vrwebhelper simply not launching at all; I wonder if this issue is somehow related to #465?

snickell commented 2 years ago

I'm seeing this too, on Fedora 35

Incuh commented 2 years ago

I've noticed this issue also happens with some normal applications on Proton. Specifically it happened with an electron based steam app over Proton Experimental.

This may go beyond SteamVR and be more of an issue for people like us with this hardware config.

snickell commented 2 years ago

@lncuh do you have a sense of what it is about our hardware config that's making us prone to this? "nvidia on linux" seems like a pretty broad hardware config

Incuh commented 2 years ago

So, uh, SteamVR began to randomly work with no crashes for no reason. Amazing

Arch Linux + GNOME + X11, on NVIDIA 510.54, SteamVR stable

I can't tell you much, except that disabling Async Reprojection & Linux Vulkan Async MAY fix it (k_pch_SteamVR_DisableAsyncReprojection_Bool: True) (k_pch_SteamVR_EnableLinuxVulkanAsync_Bool: False)
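
For anyone who wants to try those two settings, here is a hedged sketch of applying them to a parsed settings dictionary. It assumes SteamVR's settings are stored as JSON with a "steamvr" section, and that the two constants above correspond to the keys "disableAsync" and "enableLinuxVulkanAsync". Those key strings are an assumption based on the OpenVR header and should be checked before use. `with_async_disabled` is a hypothetical helper; it does not touch any file.

```python
import json

# Assumed mapping from the OpenVR constant names quoted above to JSON keys.
# Verify these strings against openvr.h before relying on them.
ASSUMED_SECTION = "steamvr"
ASSUMED_OVERRIDES = {
    "disableAsync": True,             # k_pch_SteamVR_DisableAsyncReprojection_Bool
    "enableLinuxVulkanAsync": False,  # k_pch_SteamVR_EnableLinuxVulkanAsync_Bool
}


def with_async_disabled(settings):
    """Return a copy of a parsed settings dict with the two overrides applied."""
    updated = json.loads(json.dumps(settings))  # cheap deep copy of JSON-like data
    section = updated.setdefault(ASSUMED_SECTION, {})
    section.update(ASSUMED_OVERRIDES)
    return updated


if __name__ == "__main__":
    # "someExistingKey" is a made-up placeholder for whatever is already in the file.
    example = {"steamvr": {"someExistingKey": 1}}
    print(json.dumps(with_async_disabled(example), indent=2))
```

Working on a copy keeps the original data untouched, so the result can be reviewed before anything is written back to disk while SteamVR is closed.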

kiosion commented 2 years ago

Same here; not sure when exactly it started working but it's been several days now. Still other issues with vrdash etc., but it crashing xorg seems to have stopped. I haven't changed anything about my setup except updating my video drivers and kernel a few days back - Maybe this wasn't an issue with SteamVR to begin with?

FWIW, I'm also on Arch, with bspwm, X11, and xanmod kernel 5.15.25; NVIDIA driver version 510.54-3; latest SteamVR beta.

Incuh commented 2 years ago

It seems that the crashing fix was not related to any change in ALVR, as old versions do not have the issue anymore.

The cause likely isn't Async Reprojection or ALVR.

DASPRiD commented 2 years ago

Just noting down that the same issue started for me yesterday. I'm running NVIDIA driver 510.47.03, which was not updated before the issue started.

The only correlating thing is that SteamVR Hotfix 1.21.12 was released on Wednesday, after which I started to experience the crashes.

What's really weird though is that after the crash happened today and I wanted to restart my PC, it didn't manage to get any further than the UEFI splash screen. I had to turn off the power supply for several seconds to get it to boot normally again. This makes me think that something is setting the GPU into a weird state, which only a complete power-off can resolve.

snickell commented 2 years ago

That caught my eye too @DASPRiD but I didn't put the pieces together, I think you're onto the scent 🙇🏾‍♂️, with the thinking that something is popping a really not-compatible-with-nvidia-driver-510.~50 register that flips the GPU out. Perhaps we should list what GPUs we're using that are seeing this?

I'm using a 1080ti with an i7 processor. For me this issue is so recurrent, and the need-to-reboot-by-hard-power-cycling annoying enough that it has seriously deterred me from continuing VR development on steam for linux.... I just can't have the risk of random 5 minute to reboot crashes that make me get up phhhhhysically and unnnnplug the pcccc 😩😢🥱🤣 And then there's the disk scans, and it's a big array 😱 It's basically stalled dev for me on steamvr for linux 🤔

Another common factor seems to be: NVIDIA driver == 510.[4|5]x 🤔
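
Reading "510.[4|5]x" as "driver 510 with a minor version starting with 4 or 5", the driver versions reported in this thread can be checked against that range with a few lines. This is only an illustration of the pattern; `in_suspect_range` is a hypothetical name, and the regular expression is one interpretation of the shorthand above.

```python
import re

# "510.[4|5]x" interpreted as: major version 510, minor version 4x or 5x.
SUSPECT_DRIVER_RE = re.compile(r"^510\.[45]\d")


def in_suspect_range(version):
    """Return True if an NVIDIA driver version string falls in the 510.4x/510.5x range."""
    return SUSPECT_DRIVER_RE.match(version) is not None


if __name__ == "__main__":
    # Driver versions mentioned by reporters in this thread.
    for version in ("510.54", "510.54-3", "510.47.03", "515.48.07"):
        print(version, in_suspect_range(version))
```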

snickell commented 2 years ago

@kisak-valve is there an appropriate way to flag this issue for a quick 👁 by a steamvr for linux developer w/ nvidia gpu driver knowledge? I believe there is finally enough info gathered in this issue to isolate the problem with some knowledge of the code on the other side 🤙🏽

kiosion commented 2 years ago

Perhaps we should list what GPUs we're using that are seeing this?

I'm running a 2070 with an i7 8700k.

Haven't had this issue for a few weeks now, but would love to know what was/is causing it.

yasarabio commented 2 years ago

Hi! In case it helps, we got exactly the same X-server crash and backtrace involving libpthread and the nVIDIA driver (see below). This happened reproducibly after booting the machine to Windows, using SteamVR there, and then going back to Linux, and starting SteamVR. Interestingly, the problem could be solved reproducibly by reinstalling the nVIDIA driver and not touching Steam. Since nVIDIA's driver is in the backtrace, this should probably be reported to them... This is CentOS 7 with a Geforce RTX 2080 and driver 515.48.07. Best regards, Elmar

[ 20.699] (EE) Backtrace:
[ 20.699] (EE) 0: /usr/bin/X (xorg_backtrace+0x55) [0x55fdd6d01555]
[ 20.699] (EE) 1: /usr/bin/X (0x55fdd6b50000+0x1b51d9) [0x55fdd6d051d9]
[ 20.699] (EE) 2: /lib64/libpthread.so.0 (0x7fb8a43e9000+0xf630) [0x7fb8a43f8630]
[ 20.699] (EE) 3: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7fb89fdf7000+0x5bbc5c) [0x7fb8a03b2c5c]
[ 20.699] (EE) 4: /usr/bin/X (0x55fdd6b50000+0x8dea9) [0x55fdd6bddea9]
[ 20.699] (EE) 5: /usr/bin/X (xf86LoadModules+0xa8) [0x55fdd6bed178]
[ 20.699] (EE) 6: /usr/bin/X (InitOutput+0x79f) [0x55fdd6bed98f]
[ 20.699] (EE) 7: /usr/bin/X (0x55fdd6b50000+0x60e30) [0x55fdd6bb0e30]
[ 20.699] (EE) 8: /lib64/libc.so.6 (__libc_start_main+0xf5) [0x7fb8a403d555]
[ 20.699] (EE) 9: /usr/bin/X (0x55fdd6b50000+0x4b11e) [0x55fdd6b9b11e]
[ 20.699] (EE)
[ 20.699] (EE) Segmentation fault at address 0x0
[ 20.700] (EE) Fatal server error:
[ 20.700] (EE) Caught signal 11 (Segmentation fault). Server aborting
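
To see at a glance which modules appear in an Xorg backtrace like this one, the frames can be parsed into module, base (a symbol name or a load address), offset and absolute address. The sketch below is fitted to the frame format quoted in this thread; `parse_xorg_backtrace` and `modules_in_order` are hypothetical helper names.

```python
import re

# Matches frames such as:
#   (EE) 3: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7fb89fdf7000+0x5bbc5c) [0x7fb8a03b2c5c]
#   (EE) 0: /usr/bin/X (xorg_backtrace+0x55) [0x55fdd6d01555]
FRAME_RE = re.compile(
    r"\(EE\) (?P<index>\d+): (?P<module>\S+) "
    r"\((?P<base>[^+()]+)\+0x(?P<offset>[0-9a-fA-F]+)\) "
    r"\[0x(?P<address>[0-9a-fA-F]+)\]"
)


def parse_xorg_backtrace(log_text):
    """Parse Xorg backtrace frames from log text, whether split per line or flattened."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        frames.append(
            {
                "index": int(match.group("index")),
                "module": match.group("module"),
                "base": match.group("base"),  # symbol name or hex load address
                "offset": int(match.group("offset"), 16),
                "address": int(match.group("address"), 16),
            }
        )
    return frames


def modules_in_order(frames):
    """Return the distinct modules in the order they first appear in the backtrace."""
    seen = []
    for frame in frames:
        if frame["module"] not in seen:
            seen.append(frame["module"])
    return seen


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py < /var/log/Xorg.0.log
    for module in modules_in_order(parse_xorg_backtrace(sys.stdin.read())):
        print(module)
```

For frames whose base is a hex load address, base plus offset equals the bracketed address, which is a quick sanity check that a pasted backtrace was not garbled.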

NoSadBeHappy commented 1 week ago

I tried it under GNOME Wayland. Although this didn't cause it to work, it did stop the entire desktop environment from crashing

I also have this issue on Fedora 40.