ValveSoftware / SteamVR-for-Linux

Issue tracker for the Linux port of SteamVR
916 stars 45 forks

[BUG] vrcompositor double vision identified, stutter identified #633

Open SpookySkeletons opened 10 months ago

SpookySkeletons commented 10 months ago

I believe I've discovered the source of the SteamVR Vulkan-specific compositor woes on Linux, using the Monado source code and a little induction.

The persistent vision splitting that pops up at lower framerates appears to be the result of SteamVR using Vulkan's GRAPHICS queues instead of COMPUTE queues for the reprojection shaders. I don't understand why this is unique to the Vulkan API as opposed to, say, the DirectX compositor paths, but I can reproduce it.

When Monado is run with https://gitlab.freedesktop.org/monado/monado/-/blob/main/src/xrt/compositor/util/comp_vulkan.c#L274 modified to false and the environment variable set to activate the shader compositor, I get eerily similar behavior to SteamVR's half-rate and lower-framerate behavior, where the vision starts to split with head rotation speed. Changed back to COMPUTE, it actually offers a pretty spotless experience until the GPU core is saturated; it then begins to stall out and is no longer able to meet the VK_QUEUE_GLOBAL_PRIORITY_REALTIME_EXT and QUEUE_GLOBAL_PRIORITY_REALTIME demands.
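For readers unfamiliar with the distinction, here is a minimal sketch of the queue-family choice being discussed, written in Python with stand-in values rather than real Vulkan calls. The flag bits match the Vulkan headers, but the helper name `pick_queue_family` and the sample family layout are hypothetical; this is not Monado's or SteamVR's actual selection code.

```python
# Stand-ins for VkQueueFlagBits; the numeric values match vulkan_core.h.
VK_QUEUE_GRAPHICS_BIT = 0x1
VK_QUEUE_COMPUTE_BIT = 0x2
VK_QUEUE_TRANSFER_BIT = 0x4


def pick_queue_family(family_flags, prefer_compute_only=True):
    """Return the index of the queue family to submit reprojection work to.

    family_flags is a list of queueFlags bitmasks, one per queue family, in
    the order vkGetPhysicalDeviceQueueFamilyProperties would report them.
    Hypothetical helper, for illustration only.
    """
    if prefer_compute_only:
        # Look for a family that can run compute but not graphics.
        for index, flags in enumerate(family_flags):
            has_compute = flags & VK_QUEUE_COMPUTE_BIT
            has_graphics = flags & VK_QUEUE_GRAPHICS_BIT
            if has_compute and not has_graphics:
                return index
    # Otherwise fall back to the first graphics-capable family.
    for index, flags in enumerate(family_flags):
        if flags & VK_QUEUE_GRAPHICS_BIT:
            return index
    return None


# Assumed example layout: family 0 is graphics+compute+transfer,
# family 1 is compute+transfer only.
families = [
    VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT | VK_QUEUE_TRANSFER_BIT,
    VK_QUEUE_COMPUTE_BIT | VK_QUEUE_TRANSFER_BIT,
]
```

With `prefer_compute_only=True` the sketch picks the dedicated compute family; with `False` it picks the graphics family, which corresponds to the configuration that reproduced the vision splitting.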

I was able to force a CPU bottleneck by turning on the Proton OpenVR logging frameworks, which made it almost impossible to saturate the GPU on Monado with OpenComposite. In a session of VRChat in a 60-pop room, in front of a full-quality mirror and with frame lows of 10, the compositor never suffered a missed frame and the view projection was as smooth as the timewarp could offer.

I removed the bottleneck by turning off the logging framework and boom, we're back to Monado's unique jittery stutter at even a modest population of 40 in the room.

The Vulkan drivers' GRAPHICS queues are not a good fit for realtime work, if these are in use. COMPUTE queues seem to offer a much better experience so long as the GPU is not put to full load, but I am hoping the underlying issue of GPU load tamping down REALTIME queues can then be identified and addressed separately (Mesa?) on top of this change, to iron out all the SteamVR vrcompositor issues fully.

Slight TL;DR edit: The GRAPHICS queue is FUBAR for realtime operations. The COMPUTE queue is usable but stalls under full device load.

Edit 2: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10101 Upstream issue

NewtSoup commented 10 months ago

Oh you star! I really hope this is the cause. I've had most VR games unplayable since late August because of this.

Joshua-Ashton commented 10 months ago

What you are saying here isn't true. SteamVR uses realtime async compute, not graphics queues for async reprojection when available.

SpookySkeletons commented 10 months ago

@Joshua-Ashton An ALVR dev pulled the debug layer, so CAP_SYS_NICE must not have been set. That means we're suffering the exact same issue of the realtime compute shader being scheduled late.
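One way to check whether a running process actually holds CAP_SYS_NICE is to read the effective capability mask from `/proc/<pid>/status`. The sketch below assumes a Linux system with procfs; the function names are hypothetical, and bit 23 is the CAP_SYS_NICE number from `linux/capability.h`.

```python
CAP_SYS_NICE = 23  # capability bit number from linux/capability.h


def has_cap_sys_nice(status_text):
    """Return True if the CapEff mask in /proc/<pid>/status text includes CAP_SYS_NICE."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The mask is printed as a hexadecimal string.
            mask = int(line.split(":", 1)[1].strip(), 16)
            return bool(mask & (1 << CAP_SYS_NICE))
    return False


def process_has_cap_sys_nice(pid="self"):
    """Read the status file of the given PID (default: this process) and test the mask."""
    with open(f"/proc/{pid}/status") as status_file:
        return has_cap_sys_nice(status_file.read())
```

Calling `process_has_cap_sys_nice(pid)` with the PID of the compositor process reports whether the capability is in its effective set at that moment.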

I'm experimenting with some of the amdgpu kernel parameters around their new GPU scheduler work tonight, to see if it can better meet the deadlines.

NewtSoup commented 10 months ago

@SpookySkeletons do keep us updated. With MS pulling Win10 support in the near future, it's highly likely there will be an upturn in users wanting to try Linux as a gaming platform, and some of those may well want to use VR. Those with older midrange machines are not going to want to buy new hardware just to use Win 11. I'm on 7th Gen Intel and even though it has TPM 2.0 it's not Win11 compatible (not that I have any intention of installing it anyway).

SpookySkeletons commented 10 months ago

No luck messing with recent kernel flags.

Relevant mailing list. https://lists.freedesktop.org/archives/amd-gfx/2016-December/004122.html

"The only risk is the situation when graphics will take all needed CUs. But in any case it should be very good test."

DRM scheduling for realtime compute shaders appears to have stopped here. The moment the CUs are fully saturated, no other task can be performed. Mesa devs suggest that hardware/firmware cooperation with the GPU driver to preempt these situations is the only workable solution.
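The saturation problem can be illustrated with a toy timing model. This is only a sketch of the reasoning above, not a model of how amdgpu or the DRM scheduler actually behaves; the function names and all the millisecond figures are made up for illustration.

```python
def realtime_start_delay(busy_until_ms, arrival_ms, preemptible):
    """How long a realtime job waits before it can start on a fully occupied device."""
    if preemptible:
        # Running work can be interrupted, so the job starts immediately.
        return 0.0
    # Otherwise the job waits until the occupying work finishes.
    return max(0.0, busy_until_ms - arrival_ms)


def misses_deadline(busy_until_ms, arrival_ms, run_ms, deadline_ms, preemptible):
    """Return True if the realtime job finishes after its deadline."""
    start = arrival_ms + realtime_start_delay(busy_until_ms, arrival_ms, preemptible)
    return start + run_ms > deadline_ms


# Hypothetical 90 Hz frame (about 11.1 ms): graphics work holds every CU until
# 14 ms, and a 1 ms reprojection job arrives at 9 ms.
without_preemption = misses_deadline(14.0, 9.0, 1.0, 11.1, preemptible=False)
with_preemption = misses_deadline(14.0, 9.0, 1.0, 11.1, preemptible=True)
```

In this toy case the job misses its deadline without preemption and meets it with preemption, which is the gap the quoted mailing list post leaves open.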

Vixea commented 7 months ago

It's nice to see someone trying to get ALVR happy too. I've asked Nowrep as he's a little more informed about graphics than I am... no luck.