ValveSoftware / SteamVR-for-Linux

Issue tracker for the Linux port of SteamVR
918 stars 45 forks source link

[BUG] OpenXR: Applications hang instead of quit #422

Open ChristophHaag opened 3 years ago

ChristophHaag commented 3 years ago

SteamVR 1.16.4, but I believe all 1.15.x versions have been affected too

For example hello_xr

[Thread 0x7ffff7657640 (LWP 3102287) exited]
Tue Jan 26 2021 23:55:18.829998 - Changing app type from VRApplication_OpenXRScene to VRApplication_OpenXRInstance
Tue Jan 26 2021 23:55:18.830208 - Successfully changed app type
^C
Thread 1 "hello_xr" received signal SIGINT, Interrupt.
--Type <RET> for more, q to quit, c to continue without paging--
0x00007ffff782709e in pthread_cond_destroy@@GLIBC_2.3.2 () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff782709e in pthread_cond_destroy@@GLIBC_2.3.2 () from /usr/lib/libc.so.6
#1  0x00007ffff6999db9 in SteamThreadTools::CThreadSyncObject::~CThreadSyncObject() [clone .part.0] () from /home/haagch-collabora/.local/share/Steam/steamapps/common/SteamVR/bin/linux64/vrclient.so
#2  0x00007ffff6926e14 in CHmdSystemLatest::~CHmdSystemLatest() () from /home/haagch-collabora/.local/share/Steam/steamapps/common/SteamVR/bin/linux64/vrclient.so
#3  0x00007ffff77e13ae in __cxa_finalize () from /usr/lib/libc.so.6
#4  0x00007ffff686e273 in __do_global_dtors_aux () from /home/haagch-collabora/.local/share/Steam/steamapps/common/SteamVR/bin/linux64/vrclient.so
#5  0x00007ffff6e1f3e0 in __gnu_cxx::category_names () from /home/haagch-collabora/.local/share/Steam/steamapps/common/SteamVR/bin/linux64/vrclient.so
#6  0x00007ffff7fe5e52 in call_destructors () from /lib64/ld-linux-x86-64.so.2
#7  0x00007ffff78dc0e5 in _dl_catch_exception () from /usr/lib/libc.so.6
#8  0x00007ffff7fe63de in _dl_close_worker () from /lib64/ld-linux-x86-64.so.2
#9  0x00007ffff7fe70a2 in _dl_close () from /lib64/ld-linux-x86-64.so.2
#10 0x00007ffff78dc088 in _dl_catch_exception () from /usr/lib/libc.so.6
#11 0x00007ffff78dc153 in _dl_catch_error () from /usr/lib/libc.so.6
#12 0x00007ffff779cb89 in ?? () from /usr/lib/libdl.so.2
#13 0x00007ffff779c478 in dlclose () from /usr/lib/libdl.so.2
#14 0x00007ffff7f0f3bf in ?? () from /usr/lib/libopenxr_loader.so.1
#15 0x00007ffff7f1007c in ?? () from /usr/lib/libopenxr_loader.so.1
#16 0x00007ffff7ef833a in xrDestroyInstance () from /usr/lib/libopenxr_loader.so.1
#17 0x000055555557313f in ?? ()
#18 0x00005555555632f7 in main ()
kwahoo2 commented 3 years ago

I can confirm this in an another demo https://github.com/kwahoo2/sdl2oxr

./sdl2oxr 
2021-02-21 20:24:16.288 DEBUG: xrCreateDebugUtilsMessengerEXT: Completed loader terminator
2021-02-21 20:24:16.288 DEBUG: xrCreateDebugUtilsMessengerEXT: Completed loader trampoline
2021-02-21 20:24:16.288 DEBUG: xrCreateInstance: Completed loader trampoline
2021-02-21 20:24:16.288 DEBUG: xrCreateDebugUtilsMessengerEXT: Entering loader trampoline
2021-02-21 20:24:16.288 DEBUG: xrCreateDebugUtilsMessengerEXT: Entering loader terminator
2021-02-21 20:24:16.288 DEBUG: xrCreateDebugUtilsMessengerEXT: Completed loader terminator
2021-02-21 20:24:16.288 DEBUG: xrCreateDebugUtilsMessengerEXT: Completed loader terminator
2021-02-21 20:24:16.288 DEBUG: xrCreateDebugUtilsMessengerEXT: Completed loader trampoline
2021-02-21 20:24:16.288 DEBUG: xrCreateDebugUtilsMessengerEXT: Completed loader trampoline
2021-02-21 20:24:16.288 INFO: OpenXR Runtime SteamVR/OpenXR version 0.1.0
2021-02-21 20:24:16.288 INFO: OpenXR System SteamVR/OpenXR : lighthouse max layers 16 max swapchain image size 2016x2240
2021-02-21 20:24:16.469 INFO: Visual id: 1233
2021-02-21 20:24:17.015 INFO:   RGBA16
2021-02-21 20:24:17.015 INFO:   RGBA16F
2021-02-21 20:24:17.015 INFO:   RGB16F
2021-02-21 20:24:17.015 INFO:   SRGB8
2021-02-21 20:24:17.015 INFO:   SRGB8_ALPHA8
Shader Stats: SGPRS: 8 VGPRS: 28 Code Size: 52 LDS: 0 Scratch: 0 Max Waves: 18 Spilled SGPRs: 0 Spilled VGPRs: 0 PrivMem VGPRs: 0
Shader Stats: SGPRS: 40 VGPRS: 12 Code Size: 332 LDS: 0 Scratch: 0 Max Waves: 20 Spilled SGPRs: 0 Spilled VGPRs: 0 PrivMem VGPRs: 0
Shader Stats: SGPRS: 32 VGPRS: 12 Code Size: 160 LDS: 0 Scratch: 0 Max Waves: 20 Spilled SGPRs: 0 Spilled VGPRs: 0 PrivMem VGPRs: 0
Shader Stats: SGPRS: 8 VGPRS: 28 Code Size: 20 LDS: 0 Scratch: 0 Max Waves: 18 Spilled SGPRs: 0 Spilled VGPRs: 0 PrivMem VGPRs: 0
Shader Stats: SGPRS: 24 VGPRS: 12 Code Size: 224 LDS: 0 Scratch: 0 Max Waves: 20 Spilled SGPRs: 0 Spilled VGPRs: 0 PrivMem VGPRs: 0
Shader Stats: SGPRS: 16 VGPRS: 28 Code Size: 64 LDS: 0 Scratch: 0 Max Waves: 18 Spilled SGPRs: 0 Spilled VGPRs: 0 PrivMem VGPRs: 0
Shader Stats: SGPRS: 16 VGPRS: 28 Code Size: 60 LDS: 0 Scratch: 0 Max Waves: 18 Spilled SGPRs: 0 Spilled VGPRs: 0 PrivMem VGPRs: 0
2021-02-21 20:24:22.192 DEBUG: xrDestroyDebugUtilsMessengerEXT: Entering loader trampoline
2021-02-21 20:24:22.192 DEBUG: xrDestroyDebugUtilsMessengerEXT: Entering loader trampoline
2021-02-21 20:24:22.192 DEBUG: xrDestroyDebugUtilsMessengerEXT: Entering loader terminator
2021-02-21 20:24:22.192 DEBUG: xrDestroyDebugUtilsMessengerEXT: Entering loader terminator
2021-02-21 20:24:22.192 DEBUG: xrDestroyDebugUtilsMessengerEXT: Completed loader terminator
2021-02-21 20:24:22.193 DEBUG: xrDestroyDebugUtilsMessengerEXT: Completed loader trampoline
2021-02-21 20:24:22.193 DEBUG: xrDestroyInstance: Entering loader trampoline
2021-02-21 20:24:22.193 DEBUG: xrDestroyDebugUtilsMessengerEXT: Entering loader trampoline
2021-02-21 20:24:22.193 DEBUG: xrDestroyDebugUtilsMessengerEXT: Entering loader terminator

Hangs at the same point xrDestroyInstance.

SteamVR Version 1.15.19 (1610416788)

Breush commented 3 years ago

Having the same issue SteamVR 1.16.10.

amalon commented 3 years ago

Anybody know of a workaround for this bug, as I can't find one? (are any known OpenXR apps actually capable of shutdown without hitting this?)

kwahoo2 commented 3 years ago

Anybody know of a workaround for this bug,

Try SteamVR linux_v1.14 from the beta tab.

amalon commented 3 years ago

Anybody know of a workaround for this bug,

Try SteamVR linux_v1.14 from the beta tab.

Thanks, but i haven't got anything to display to the HMD with that version. Just lots of: Thu Aug 19 2021 16:42:33.967113 - xrWaitFrame: would have blocked, skipping frame instead and shouldRender=false

Shutdown does at least not hang though (a shallow victory :) )

kwahoo2 commented 3 years ago

xrWaitFrame: would have blocked, skipping frame instead

I don't know what application you use for testing, but if you have access to the source code, check if there is Ready session state handling, eg.:

    switch (sessionState) {
    case xr::SessionState::Ready: //this is not necessary for Monado, but SteamVR would stuck at Ready state if xrBeginFrame wasn't called
    case xr::SessionState::Focused:
    case xr::SessionState::Synchronized:
    case xr::SessionState::Visible:
        session.waitFrame(xr::FrameWaitInfo{}, frameState);
        return xr::Result::Success == session.beginFrame(xr::FrameBeginInfo{});
    default:
        break;
    }
amalon commented 3 years ago

xrWaitFrame: would have blocked, skipping frame instead

I don't know what application you use for testing

NVIDIA 470.63.01 driver

but if you have access to the source code, check if there is Ready session state handling, eg.:

Yeh I'd already noticed SteamVR wouldn't move on from ready without frames being processed. that doesn't seem to be the problem here. The message definitely seems to come from xrWaitFrame itself (the steamvr log only seems to show when its being ptraced, e.g. by gdb or ltrace), & its after the state becomes focused.

Does linux_v1.14's OpenXR still work for others?

ChristophHaag commented 3 years ago

Only SteamVR versions that are new enough to have async reprojection for nvidia enabled will have working openxr for nvidia https://steamcommunity.com/app/250820/discussions/8/2448217763555811264/

amalon commented 3 years ago

Only SteamVR versions that are new enough to have async reprojection for nvidia enabled will have working openxr for nvidia

thank for confirmation of that. guess I''m stuck waiting on valve or switching to amd gpu to fix this then (I can't even run monado with latest proprietary nvidia due to driver crashes) :-(

LordOfDragons commented 2 years ago

I'm seeing the same problem. I used xrRequestExitSession and polled events until I'm at XR_SESSION_STATE_EXITING before going home. Still SteamVR hangs with a similar error as mentioned by the OP:

https://pastebin.com/2ej4BPAV

This is the latest beta version of SteamVR.

pavelk0508 commented 1 year ago

This bug has been around for two years now and I would like to know when it will go to work, but now at least some solution is needed

pavelk0508 commented 1 year ago

Is there a rational way to delete a session with the possibility of re-creation if there is no way to fix this critical bug?

pavelk0508 commented 1 year ago

My research led to the fact that in vrclient.so a destructor is called, which actually hangs the process, and in the presence of a picture in the VR headset

pavelk0508 commented 1 year ago

Also found out that it is unloading vrclient.so that removes XrInstance. While calling XrDestroyInstance does nothing

pavelk0508 commented 1 year ago

@kisak-valve The problem is in the destructor of the vrclient.so library itself, which is causing the hang. Deleting the dlclose method inside the OpenXRLoader does not reproduce the problem, but the XrInstance is not destroyed either. Please fix it in the next patch, as it is very necessary. Most likely there is some kind of check inside the destructor, on which everything hangs. If there is no way to fix it now, then I would like to know what checks are there in order to remove everything that may cause a hang.

Perhaps this is due to the created threads that are not unloaded on the dlclose SteamThreadTools::CThreadSyncObject::~CThreadSyncObject() CHmdSystemLastest::~CHmdSystemLatest() SteamThreadTools::CThreadSyncObject::Wait(unsigned int) CSyncLockThread::Run() SteamThreadTools::CThread::ThreadProc(void*)

LordOfDragons commented 1 year ago

Threads not closing typically means a thread-join never succeeding since the thread does not want to exit. And since a sync object is involved I would say a semaphore or barrier is waited upon by the thread but never notified.

pavelk0508 commented 1 year ago

@LordOfDragons based on what I saw, most likely these are synchronization threads with SteatVR. Ideally, they should be destroyed at the moment when either the session breaks (XrDestroySession) or when the instance is destroyed (XrDestroyInstance)

pavelk0508 commented 1 year ago

Perhaps there is a way to destroy threads inside vrclient.so, but I didn’t find such a way during reverse engineering (perhaps not everything was deployed correctly). You need to look in the original code, which, unfortunately, is not available

LordOfDragons commented 1 year ago

@pavelk0508 That's for Valve to fix. But this bug does not seem to have a high enough ranking in their books it seems.

pavelk0508 commented 1 year ago

@LordOfDragons based on how long this bug has been here, they don’t care at all)

sdegrande commented 1 year ago

Same annoying issue for my application. Avoiding to call xrDestroyInstance() on Linux is not enough. It hangs at code exit, and so I need to kill the app (ctrl-c)...

pthread_cond_destroy@@GLIBC_2.3.2() at 0x7ffff48accd5   
SteamThreadTools::CThreadSyncObject::~CThreadSyncObject() at 0x7fffeaa226c1 
CVRCompositorClientBase::~CVRCompositorClientBase() at 0x7fffea992daa   
CHmdSystemLatest::~CHmdSystemLatest() at 0x7fffea986749 
__run_exit_handlers() at 0x7ffff48620b5 
exit() at 0x7ffff4862230    
pavelk0508 commented 1 year ago

@sdegrande Yes, of course, because of this, but I have already unsubscribed everywhere, but on Steam they don’t hear me and they obviously don’t want to fix it

pavelk0508 commented 1 year ago

New data - probably threads are not being killed due to an internal SteamVR error that pops up in debug mode


Wed Apr 05 2023 15:23:24.833485 - Transitioning XrSession from XR_SESSION_STATE_VISIBLE to XR_SESSION_STATE_SYNCHRONIZED
Wed Apr 05 2023 15:23:24.833516 - Transitioning XrSession from XR_SESSION_STATE_SYNCHRONIZED to XR_SESSION_STATE_STOPPING
Wed Apr 05 2023 15:23:24.833544 - Transitioning XrSession from XR_SESSION_STATE_STOPPING to XR_SESSION_STATE_IDLE
Wed Apr 05 2023 15:23:24.833567 - Transitioning XrSession from XR_SESSION_STATE_IDLE to XR_SESSION_STATE_EXITING
Wed Apr 05 2023 15:23:29.006975 - Changing app type from VRApplication_OpenXRScene to VRApplication_OpenXRInstance
Wed Apr 05 2023 15:23:29.007085 - Received message of type 1107 when 121 was expected on pipe VR_ServerPipe_57517
Wed Apr 05 2023 15:23:29.007112 - IPC failure changing app type
Wed Apr 05 2023 15:23:29.007133 - Unexpected error while trying to change the OpenXR application type: 
[VRInitError_Init_Internal]```
pavelk0508 commented 1 year ago

I also noticed that after disconnecting from SteamVR, for some reason it connects again

Wed Apr 05 2023 16:26:15.233108 - AppInfoManager.ProcessQuit processid=88060 eLaunchingApp=LaunchingApp_None
Wed Apr 05 2023 16:26:15.233161 - AppInfoManager.ProcessQuit: Clearing application system.generated.openxr.helloxr.hello_xr PID was 88060
Wed Apr 05 2023 16:26:15.233184 - AppInfoManager.ProcessQuit: Clearing application system.generated.openxr.helloxr.hello_xr PID because 88060 has exited
Wed Apr 05 2023 16:26:15.233265 - AppInfoManager.ProcessConnected BEGIN 88060 /home/pad/Документы/OpenXR-SDK-Source-main/build/src/tests/hello_xr/hello_xr 9 system.generated.openxr.helloxr.hello_xr
Wed Apr 05 2023 16:26:15.233291 - SetApplicationPid appkey=system.generated.openxr.helloxr.hello_xr pid=88060, external transition
Wed Apr 05 2023 16:26:15.233308 - SetApplicationPid: Setting app system.generated.openxr.helloxr.hello_xr PID to 88060
Wed Apr 05 2023 16:26:15.233389 - AppInfoManager.ProcessConnected END
Ozzadar commented 1 year ago

Just thought I'd come in here and share that I'm getting the same error as @pavelk0508, it's quite annoying. I can't even seem to get the application to close nicely by yeeting it into a thread and detaching.

These are the debug logs from my application:

[2023-05-20 17:54:45.897] [trace] Destroyed OpenXR Swapchain.
[2023-05-20 17:54:45.897] [trace] Destroyed OpenXR Application Space.
Sat May 20 2023 17:54:45.901626 - Changing app type from VRApplication_OpenXRScene to VRApplication_OpenXRInstance
Sat May 20 2023 17:54:45.901718 - Received message of type 1107 when 121 was expected on pipe VR_ServerPipe_18797
Sat May 20 2023 17:54:45.901750 - IPC failure changing app type
Sat May 20 2023 17:54:45.901774 - Unexpected error while trying to change the OpenXR application type: [VRInitError_Init_Internal]
[2023-05-20 17:54:45.901] [trace] Destroyed OpenXR Session.
[2023-05-20 17:54:45.901] [trace] Destroyed VMA Allocator.
[2023-05-20 17:54:45.951] [trace] Destroyed Vulkan Device.
[2023-05-20 17:54:45.953] [trace] Destroying Vulkan Debug Messenger.
[2023-05-20 17:54:46.024] [trace] Destroyed Vulkan Instance.
[2023-05-20 17:54:46.024] [trace] Destroying OpenXR Instance.
Kagukara commented 11 months ago

Getting the same problem running xr-video-player via SteamVR.

To close xr-video-player --video ~/videos/VR/playlist.m3u when pressing either "Esc" on the window or "ctrl+c" it shows Requesting exit.. in the terminal and the window is closed, but still active in the terminal until its killed inside btop. Once killed in btop it outputs Killed in the terminal and is no longer active.

More on the issue here: https://codeberg.org/yoshino/xr-video-player/issues/3

kwahoo2 commented 11 months ago

The issue still exists in the SteamVR 2.0.2 beta.

JoeRosselli commented 3 weeks ago

Reporting in 2024, SteamVR/OpenXR application on Linux hangs forever on destruction and requires a kill to exit.

Calling thread hangs forever when calling xrDestroyInstance. Stack trace where it's stuck:

[libc.so.6] futex_wait 0x00007ffff529af35 [libc.so.6] futex_wait_simple 0x00007ffff529af35 [libc.so.6] GI_pthread_cond_destroy 0x00007ffff529af35 [vrclient.so] CThreadSyncObject::~CThreadSyncObject() [clone .part.0] 0x00007fffbfa5b839 [vrclient.so] CHmdSystemLatest::~CHmdSystemLatest() 0x00007fffbf9b23bc [libc.so.6] cxa_finalize 0x00007ffff5247372 [vrclient.so] do_global_dtors_aux 0x00007fffbf8d5b03 [APP_NAME] 0x00007fffc49fed50 [ld-linux-x86-64.so.2] _dl_call_fini 0x00007ffff7fc60f2

Ubuntu 24.04, Vulkan 1.2.X, SteamVR build id 15141130, SteamVR runtime version 2.7.4

[Edit] Have verified that the same exact codebase on Windows does not hang forever on exit and does exit gracefully.