ValveSoftware / gamescope

SteamOS session compositing window manager

Hang on startup after bd722f786f5a74f8282d903bd5eb3d331c4cd920 on NV #948

Open misyltoad opened 1 year ago

misyltoad commented 1 year ago

Hangs on startup on NVIDIA after https://github.com/ValveSoftware/gamescope/commit/bd722f786f5a74f8282d903bd5eb3d331c4cd920

Note that this is not with any of the reshade stuff enabled.

Not sure what is bad about this commit. The log says it's grabbing 2 as the async compute queue and 0 as the general one.

Seems like we hang in a CVulkanDevice::wait, despite not ever using the general queue anywhere at all. Merely grabbing both queues here seems to break vkWaitSemaphores.

misyltoad commented 1 year ago

Nothing interesting in validation...

misyltoad commented 1 year ago

cc: @erik-kz given you have been looking at stuff recently.

cubanismo commented 1 year ago

I've filed internal bug 4330963 to track investigation within NVIDIA.

r3k2 commented 1 year ago

same issue here with wayland+wlroots+sway+nvidia on Arch GNU/Linux

 gamescope 
wlserver: [backend/headless/backend.c:68] Creating headless backend
vulkan: physical device 10de:1b06 compute queue doesn't support presenting on our surface, using graphics queue
vulkan: selecting physical device 'NVIDIA GeForce GTX 1080 Ti': queue family 0 (general queue family 0)
vulkan: physical device supports DRM format modifiers
Segmentation fault

this was working fine before.

HWG90 commented 12 months ago

Issue continues in 3.13.8

sharkautarch commented 11 months ago

Issue continues in 3.13.8

@HWG90 I've just updated my nvidia-fix branch, which has a workaround for the current nvidia driver bug w/ gamescope, to the latest git revision of gamescope. I've tested building and running gamescope from my branch w/ nvidia, and it seems to work as long as you set ENABLE_GAMESCOPE_WSI=0 (i.e. ENABLE_GAMESCOPE_WSI=0 gamescope <params> -- <game>). If you're using hybrid graphics, you'll have to run gamescope like so: ENABLE_GAMESCOPE_WSI=0 gamescope --prefer-vk-device $(env MESA_VK_DEVICE_SELECT=list vulkaninfo) <params> -- <game>

to build from my branch:

git clone https://github.com/sharkautarch/gamescope.git
cd gamescope
git checkout nvidia-fix
meson setup -Dc_args="-Wno-uninitialized -Wno-maybe-uninitialized" -Dcpp_args="-Wno-uninitialized -Wno-maybe-uninitialized" --reconfigure build
ninja -C build
meson install -C build

ReillyBrogan commented 11 months ago

I would note that that revert breaks gamescope on AMD cards at least, and likely Intel ones as well. In Solus we handled this by building 3.13.8 twice, once with the Nvidia revert applied and once without. The Nvidia revert build is installed at /usr/bin/gamescope-nvidia while the unpatched build is at /usr/bin/gamescope. Just a thought on how to handle this if there are any other distribution maintainers watching this thread.

akay commented 11 months ago

to build from my branch:

Thank you for providing this. vkcube now starts, but it stops after a few seconds; then, exactly 30 seconds later, I get the following spammed forever in the terminal until I SIGKILL gamescope-wl:

vblankmanager: write failed: Resource temporarily unavailable

ElecTwix commented 11 months ago

to build from my branch:

Thank you for providing this. vkcube now starts, but it stops after a few seconds; then, exactly 30 seconds later, I get the following spammed forever in the terminal until I SIGKILL gamescope-wl:

vblankmanager: write failed: Resource temporarily unavailable

Same behavior here too.


OS: Arch Linux x86_64
NV Driver Version: 535.129.03

sharkautarch commented 11 months ago

Replying to https://github.com/ValveSoftware/gamescope/issues/948#issuecomment-1826894410

Try running it with: ENABLE_GAMESCOPE_WSI=0 gamescope --prefer-vk-device “10de:$(vulkaninfo -o /dev/stdout | grep -i 10de -A1 | tail -n 1 | tr ‘x’ ‘ ‘ | awk ‘{print $4}’)” <params> — <game>

You could also try building from my nvidia-fix-and-vblank-debug-extra-experimental-v4.5 branch, though idk if that branch would work if the nvidia-fix branch doesn't for you

akay commented 11 months ago

Try running it with:

I have tried using the --prefer-vk-device flag before, sadly it still doesn't work.

nvidia-fix-and-vblank-debug-extra-experimental-v4.5

The new branch doesn't seem to work either:

```
$ ENABLE_GAMESCOPE_WSI=0 gamescope -- vkcube
wlserver: [backend/headless/backend.c:67] Creating headless backend
vulkan: physical device 10de:1f07 compute queue doesn't support presenting on our surface, using graphics queue
vulkan: selecting physical device 'NVIDIA GeForce RTX 2070': queue family 0
vulkan: physical device supports DRM format modifiers
vulkan: vkGetPhysicalDeviceFormatProperties2 returned zero modifiers for DRM format 0x3231564E (VkResult: 0)
vulkan: vkGetPhysicalDeviceFormatProperties2 returned zero modifiers for DRM format 0x38344241 (VkResult: 0)
vulkan: vkGetPhysicalDeviceFormatProperties2 returned zero modifiers for DRM format 0x38344258 (VkResult: 0)
vulkan: supported DRM formats for sampling usage:
vulkan: AR24 (0x34325241)
vulkan: XR24 (0x34325258)
vulkan: AB24 (0x34324241)
vulkan: XB24 (0x34324258)
vulkan: RG16 (0x36314752)
vulkan: AB4H (0x48344241)
vulkan: XB4H (0x48344258)
vulkan: AB30 (0x30334241)
vulkan: XB30 (0x30334258)
vulkan: AR30 (0x30335241)
vulkan: XR30 (0x30335258)
vulkan: Creating Gamescope nested swapchain with format 64 and colorspace 0
wlserver: Running compositor on wayland display 'gamescope-0'
wlserver: [backend/headless/backend.c:17] Starting headless backend
wlserver: [util/env.c:9] Loading WLR_NO_HARDWARE_CURSORS option: 1
wlserver: [types/output/output.c:435] WLR_NO_HARDWARE_CURSORS set, forcing software cursors
wlserver: [xwayland/server.c:108] Starting Xwayland on :1
The XKEYBOARD keymap compiler (xkbcomp) reports:
...
Errors from xkbcomp are not fatal to the X server
wlserver: [types/wlr_compositor.c:692] New wlr_surface 0x5638ff3fe510 (res 0x5638ff3d93a0)
wlserver: [xwayland/server.c:273] Xserver is ready
pipewire: stream state changed: connecting
pipewire: stream state changed: paused
pipewire: stream available on node ID: 70
x86_64 processor:
Brand: GenuineIntel
Model: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
Invariant TSC: True
cpuid leaf 15H does not give frequency
From model name string frequency 3.60 GHz => 277.78 ps
Sanity check against std::chrono::steady_clock gives frequency 3.60 GHz => 277.78 ps
Measured granularity = 22 ticks => 163.64 MHz, 6.11 ns
nsPerTick: 0.000277778
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 8.22667e+06
ratio= 1
rollingMaxDrawTime after using fmin: 8333333
rollingMaxDrawTime after using std::clamp: 8333333
offset: 9983333
sleep_cycle: vblank cycle time before first sleep: 0.023248ms
rollingMaxDrawTime after using std::clamp: 8226666
offset: 9876666
sleep_cycle: vblank cycle time before second wait: 7.26843ms
vblank cycle time before write(): 11.0171ms
vblank cycle time after write(): 11.0278ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 10.02ms
vulkan: Creating Gamescope nested swapchain with format 64 and colorspace 0
pipewire: renegotiating stream params (size: 1280x720)
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.01ms
Selected GPU 0: NVIDIA GeForce RTX 2070, type: DiscreteGpu
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.01ms
wlserver: [types/wlr_compositor.c:692] New wlr_surface 0x5638ff3aeaf0 (res 0x5638ff3dc9b0)
xwm: got the same buffer committed twice, ignoring.
The XKEYBOARD keymap compiler (xkbcomp) reports:
...
Errors from xkbcomp are not fatal to the X server
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 4.65ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.04ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 7.46ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.53ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.24ms
vulkan: Creating Gamescope nested swapchain with format 64 and colorspace 0
pipewire: renegotiating stream params (size: 852x1438)
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 9.71ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 2.78ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.01ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.01ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.04ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.04ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.05ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.03ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.01ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.04ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.04ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.04ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.08ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.12ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.09ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.17ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.05ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.05ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.05ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.04ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.01ms
dispatch_vblank(int): VBlankTimeInfo_t receive latency: 0.04ms
post-vblank TPAUSE wait loop duration: 1.03451ms
total vblank period: 6.96097ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.81189e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4861110
rollingMaxDrawTime after using std::clamp: 4861110
offset: 6511110
sleep_cycle: vblank cycle time before first sleep: 0.011202ms
rollingMaxDrawTime after using std::clamp: 4779762
offset: 6429762
sleep_cycle: vblank cycle time before second wait: 3.66527ms
vblank cycle time before write(): 5.91367ms
vblank cycle time after write(): 5.91687ms
post-vblank TPAUSE wait loop duration: 1.04421ms
total vblank period: 6.9646ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.59212e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4636862
rollingMaxDrawTime after using std::clamp: 4636862
offset: 6286862
sleep_cycle: vblank cycle time before first sleep: 0.013613ms
rollingMaxDrawTime after using std::clamp: 4559999
offset: 6209999
sleep_cycle: vblank cycle time before second wait: 3.6209ms
vblank cycle time before write(): 5.9205ms
vblank cycle time after write(): 5.92418ms
858 vblanks sent in 6 seconds
post-vblank TPAUSE wait loop duration: 1.01569ms
total vblank period: 6.95467ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.58548e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4630081
rollingMaxDrawTime after using std::clamp: 4630081
offset: 6280081
sleep_cycle: vblank cycle time before first sleep: 0.018222ms
rollingMaxDrawTime after using std::clamp: 4553354
offset: 6203354
sleep_cycle: vblank cycle time before second wait: 3.77729ms
vblank cycle time before write(): 5.97525ms
vblank cycle time after write(): 5.99444ms
post-vblank TPAUSE wait loop duration: 1.03709ms
total vblank period: 6.96242ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.58505e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4629643
rollingMaxDrawTime after using std::clamp: 4629643
offset: 6279643
sleep_cycle: vblank cycle time before first sleep: 0.015838ms
rollingMaxDrawTime after using std::clamp: 4552925
offset: 6202925
sleep_cycle: vblank cycle time before second wait: 3.74267ms
vblank cycle time before write(): 5.92019ms
vblank cycle time after write(): 5.92566ms
post-vblank TPAUSE wait loop duration: 0.746925ms
total vblank period: 6.9676ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.58504e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4629629
rollingMaxDrawTime after using std::clamp: 4629629
offset: 6279629
sleep_cycle: vblank cycle time before first sleep: 0.020437ms
rollingMaxDrawTime after using std::clamp: 4552911
offset: 6202911
sleep_cycle: vblank cycle time before second wait: 3.66219ms
vblank cycle time before write(): 6.15487ms
vblank cycle time after write(): 6.16074ms
863 vblanks sent in 6 seconds
post-vblank TPAUSE wait loop duration: 1.00171ms
total vblank period: 6.97308ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.58504e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4629629
rollingMaxDrawTime after using std::clamp: 4629629
offset: 6279629
sleep_cycle: vblank cycle time before first sleep: 0.042852ms
rollingMaxDrawTime after using std::clamp: 4552911
offset: 6202911
sleep_cycle: vblank cycle time before second wait: 3.72778ms
vblank cycle time before write(): 5.94926ms
vblank cycle time after write(): 5.97025ms
post-vblank TPAUSE wait loop duration: 0.674294ms
total vblank period: 6.9847ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.58504e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4629629
rollingMaxDrawTime after using std::clamp: 4629629
offset: 6279629
sleep_cycle: vblank cycle time before first sleep: 0.062297ms
rollingMaxDrawTime after using std::clamp: 4552911
offset: 6202911
sleep_cycle: vblank cycle time before second wait: 3.76651ms
vblank cycle time before write(): 6.32144ms
vblank cycle time after write(): 6.33826ms
post-vblank TPAUSE wait loop duration: 1.04218ms
total vblank period: 6.96806ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.58504e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4629629
rollingMaxDrawTime after using std::clamp: 4629629
offset: 6279629
sleep_cycle: vblank cycle time before first sleep: 0.016151ms
rollingMaxDrawTime after using std::clamp: 4552911
offset: 6202911
sleep_cycle: vblank cycle time before second wait: 3.8679ms
vblank cycle time before write(): 5.92928ms
vblank cycle time after write(): 5.93528ms
864 vblanks sent in 6 seconds
post-vblank TPAUSE wait loop duration: 1.02804ms
total vblank period: 6.95329ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.58504e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4629629
rollingMaxDrawTime after using std::clamp: 4629629
offset: 6279629
sleep_cycle: vblank cycle time before first sleep: 0.011563ms
rollingMaxDrawTime after using std::clamp: 4552911
offset: 6202911
sleep_cycle: vblank cycle time before second wait: 3.73395ms
vblank cycle time before write(): 5.9197ms
vblank cycle time after write(): 5.92246ms
post-vblank TPAUSE wait loop duration: 1.02767ms
total vblank period: 6.99021ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.58504e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4629629
rollingMaxDrawTime after using std::clamp: 4629629
offset: 6279629
sleep_cycle: vblank cycle time before first sleep: 0.054623ms
rollingMaxDrawTime after using std::clamp: 4552911
offset: 6202911
sleep_cycle: vblank cycle time before second wait: 3.76261ms
vblank cycle time before write(): 5.95865ms
vblank cycle time after write(): 5.97708ms
post-vblank TPAUSE wait loop duration: 0.686419ms
total vblank period: 6.96335ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.58504e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4629629
rollingMaxDrawTime after using std::clamp: 4629629
offset: 6279629
sleep_cycle: vblank cycle time before first sleep: 0.01442ms
rollingMaxDrawTime after using std::clamp: 4552911
offset: 6202911
sleep_cycle: vblank cycle time before second wait: 3.65442ms
vblank cycle time before write(): 6.27737ms
vblank cycle time after write(): 6.28156ms
864 vblanks sent in 6 seconds
post-vblank TPAUSE wait loop duration: 0.996861ms
total vblank period: 6.97343ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.58504e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4629629
rollingMaxDrawTime after using std::clamp: 4629629
offset: 6279629
sleep_cycle: vblank cycle time before first sleep: 0.061603ms
rollingMaxDrawTime after using std::clamp: 4552911
offset: 6202911
sleep_cycle: vblank cycle time before second wait: 3.93933ms
vblank cycle time before write(): 5.97549ms
vblank cycle time after write(): 5.99356ms
post-vblank TPAUSE wait loop duration: 1.0365ms
total vblank period: 6.97172ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.58504e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4629629
rollingMaxDrawTime after using std::clamp: 4629629
offset: 6279629
sleep_cycle: vblank cycle time before first sleep: 0.024603ms
rollingMaxDrawTime after using std::clamp: 4552911
offset: 6202911
sleep_cycle: vblank cycle time before second wait: 3.7064ms
vblank cycle time before write(): 5.93954ms
vblank cycle time after write(): 5.95134ms
post-vblank TPAUSE wait loop duration: 1.02751ms
total vblank period: 6.99131ms
delta= 1
(double) ( ( alpha * rollingMaxDrawTime ) + ( range - alpha ) * drawTime ) / (range))* ratio /( delta): 4.58504e+06
ratio= 1
rollingMaxDrawTime after using fmin: 4629629
rollingMaxDrawTime after using std::clamp: 4629629
offset: 6279629
sleep_cycle: vblank cycle time before first sleep: 0.066148ms
rollingMaxDrawTime after using std::clamp: 4552911
offset: 6202911
sleep_cycle: vblank cycle time before second wait: 3.94621ms
vblank cycle time before write(): 5.95736ms
vblank cycle time after write(): 5.97662ms
vblankmanager: write failed: Resource temporarily unavailable
vblankmanager: write failed: Resource temporarily unavailable
...
(EE) failed to read Wayland events: Broken pipe
```

The last message is me running killall -sKILL gamescope-wl in another terminal.

The three dots are irrelevant/repeating messages (xkbcomp etc).

sharkautarch commented 11 months ago

Try running it with:

I have tried using the --prefer-vk-device flag before, sadly it still doesn't work.

nvidia-fix-and-vblank-debug-extra-experimental-v4.5

The new branch doesn't seem to work either.

Huh, strange… Can you provide the terminal output of gamescope (up until it just repeatedly outputs vblankmanager: write failed: Resource temporarily unavailable)

Are you also using the 535 nvidia drivers? (Those are what I’m using right now, tho I’m currently specifically on the 6.6 Linux kernel w/ the 535 dkms nvidia drivers)

Hmmm I feel like I’ve seen that vblankmanager: write failed: Resource temporarily unavailable message before, but I don’t remember the context for it…

I have tried using the --prefer-vk-device flag before, sadly it still doesn't work.

Did you also try running it with --prefer-vk-device “10de:$(vulkaninfo -o /dev/stdout | grep -i 10de -A1 | tail -n 1 | tr ‘x’ ‘ ‘ | awk ‘{print $4}’)”

Because I think that above may work whereas —prefer-vk-device $(env MESA_VK_DEVICE_SELECT=list vulkaninfo) doesn’t

akay commented 11 months ago

I've updated my last comment, and the rest:

$ pacman -Q nvidia-dkms
nvidia-dkms 545.29.06-1

$ uname -a
Linux artsql 6.6.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Mon, 20 Nov 2023 23:18:21 +0000 x86_64 GNU/Linux

$ cat /proc/cmdline
... nvidia-drm.modeset=1 loglevel=3 sysrq_always_enabled=1 udev.log_level=3 bgrt_disable

sharkautarch commented 11 months ago

Replying to https://github.com/ValveSoftware/gamescope/issues/948#issuecomment-1826968053

hmmm well I’m using nvidia-dkms 535. Maybe the ‘nvidia-fix’ branch doesn’t work with nvidia 545 drivers…? I’ve had to stay on the nvidia-dkms 535 drivers because 545 didn’t seem to work well on my hybrid graphics laptop, so idk

akay commented 11 months ago

Did you also try running it with --prefer-vk-device “10de:$(vulkaninfo -o /dev/stdout | grep -i 10de -A1 | tail -n 1 | tr ‘x’ ‘ ‘ | awk ‘{print $4}’)”

That just outputs 1f07 which is what I tried before.

Small nitpick: Please make sure to convert your backticks and backquotes to ticks and quotes when asking to run commands :)

akay commented 11 months ago

hmmm well I’m using nvidia-dkms 535

Okay, it works using 535:

$ pacman -Q nvidia-dkms
nvidia-dkms 535.113.01-2

$ ENABLE_GAMESCOPE_WSI=0 gamescope -w 2560 -h 1440 -e -- steam

So the problem is with how the 545 series of drivers handles things.

Can confirm that both your branches (nvidia-fix and nvidia-fix-and-vblank-debug-extra-experimental-v4.5) work.

Not perfectly, but I can actually load Half-Life 2 and a few other games, using GE_Proton8_24 where needed.

Guess we'll just have to wait for @cubanismo to hopefully fix it internally :)

sharkautarch commented 11 months ago

good to hear that it works for you on 535

Can confirm that both your branches (nvidia-fix and nvidia-fix-and-vblank-debug-extra-experimental-v4.5) work.

Not perfectly, but I can actually load Half-Life 2 and a few other games, using GE_Proton8_24 where needed.

@akay try my newer nvidia-fix-and-vblank-debug-extra-experimental-v4.6 branch

SeongGino commented 11 months ago

Replying to https://github.com/ValveSoftware/gamescope/issues/948#issuecomment-1832666734

Using 545, it does seem to work with both OGL and Vulkan (Wine/DXVK) games.

sharkautarch commented 11 months ago

Replying to https://github.com/ValveSoftware/gamescope/issues/948#issuecomment-1832749254

Huh, maybe there was a really new update to 545 that unbroke whatever was causing an issue? idk

Can you run nvidia-smi after running gamescope to confirm that gamescope is using the nvidia gpu w/ your 545 driver? (Just checking that it isn't just using a different gpu in the case that your system has another non-nvidia gpu)

SeongGino commented 11 months ago

Can you run nvidia-smi after running gamescope to confirm that gamescope is using the nvidia gpu w/ your 545 driver? (Just checking that it isn't just using a different gpu in the case that your system has another non-nvidia gpu)

If this is a reply to me, it's a desktop with only a 3060ti for graphics (no APU or iGPU at all).

Granted, just to cover my bases but on a slightly unrelated note, I have to run gamescope with __GL_THREADED_OPTIMIZATIONS=0 as per https://github.com/ValveSoftware/gamescope/issues/526#issuecomment-1744609920 - otherwise the error in that thread occurs for me. But this applies to "normal" gamescope (pre-present day issue here) as well.

akay commented 11 months ago

try my newer nvidia-fix-and-vblank-debug-extra-experimental-v4.6 branch

Things are more or less OK with 535 and nvidia-fix, you want me to try this branch with 545?

sharkautarch commented 11 months ago

try my newer nvidia-fix-and-vblank-debug-extra-experimental-v4.6 branch

Things are more or less OK with 535 and nvidia-fix, you want me to try this branch with 545?

No, just try the nvidia-fix-and-vblank-debug-extra-experimental-v4.6 branch w/ nvidia 535. The v4.6 branch might just feel a bit smoother for you compared to the v4.5 branch (tho it might not be noticeable).

akay commented 11 months ago

Sure, it feels OK :) Thank you!

jkozera commented 11 months ago

@sharkautarch I'm just testing nvidia-fix-and-vblank-debug-extra-experimental-v4.7 because I also encounter vblankmanager: write failed: Resource temporarily unavailable errors.

I got an STL exception from std::clamp with !(__hi < __lo), so I've replaced the debugging calls with a custom safe_clamp:

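#include <algorithm>
#include <iostream>

// Like std::clamp, but logs and returns lo instead of asserting when lo > hi.
template<class T>
constexpr const T& safe_clamp(const T& v, const T& lo, const T& hi)
{
    if (lo > hi) {
        std::cout << "lo > hi" << lo << " " << hi << "; returning lo\n";
        return lo;
    }
    return std::clamp(v, lo, hi);
}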

And now I get many of these

lo > hi16666666 7638888; returning lo
lo > hi18316666 8333332; returning lo
lo > hi16414964 7638888; returning lo
lo > hi18064964 7291666; returning lo

which are probably not very helpful because I don't see which call these come from, but I just thought you might be interested in knowing about this STL exception.

sharkautarch commented 11 months ago

Replying to https://github.com/ValveSoftware/gamescope/issues/948#issuecomment-1837564623

Do you know which gamescope cpp file these exceptions are coming from (or which one is calling the STL function/method that then produces them)?
If you don't, you should be able to figure it out by:

jkozera commented 11 months ago

Do you know which gamescope cpp file is generating these exceptions from?

It's definitely vblankmanager.cpp changed in the nvidia-fix-and-vblank-debug-extra-experimental-v4.7 branch, because after I've replaced the std::clamp calls in that file, the exceptions no longer happen.

sharkautarch commented 11 months ago

Do you know which gamescope cpp file is generating these exceptions from?

It's definitely vblankmanager.cpp changed in the nvidia-fix-and-vblank-debug-extra-experimental-v4.7 branch, because after I've replaced the std::clamp calls in that file, the exceptions no longer happen.

Ok, can you still try to get the core dump from the exceptions, and then analyze it with gdb as I showed? You might be able to see the specific line in vblankmanager.cpp that had the std::clamp which caused the exception.

jkozera commented 11 months ago

#4  0x00007f9c5a8dd3b2 in std::__glibcxx_assert_fail(char const*, int, char const*, char const*)
    (file=file@entry=0x55e2052b2008 "/usr/include/c++/13.2.1/bits/stl_algo.h", line=line@entry=3669, function=function@entry=0x55e2052b1fa8 "constexpr const _Tp& std::clamp(const _Tp&, const _Tp&, const _Tp&) [with _Tp = long int]", condition=condition@entry=0x55e2052c1adb "!(__hi < __lo)") at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/debug.cc:61
#5  0x000055e20518f7e2 in std::clamp<long>(long const&, long const&, long const&) (__val=<optimized out>, __lo=<optimized out>, __hi=<optimized out>) at /usr/include/c++/13.2.1/bits/stl_algo.h:3667
#6  std::clamp<long>(long const&, long const&, long const&) (__hi=<optimized out>, __lo=<optimized out>, __val=<optimized out>) at /usr/include/c++/13.2.1/bits/stl_algo.h:3667
#7  vblankThreadRun(bool, bool, long, long double) (neverBusyWait=<optimized out>, alwaysBusyWait=<optimized out>, cpu_pause_time_len=<optimized out>, nsPerTick_long=<optimized out>)
    at ../src/vblankmanager.cpp:773

so it's this one:

rollingMaxDrawTime = (uint64_t)std::clamp(centered_mean/2, (long int) rollingMaxDrawTime, nsecInterval + nsecInterval/10);

sharkautarch commented 11 months ago

Replying to https://github.com/ValveSoftware/gamescope/issues/948#issuecomment-1837594631

I just realized that I ordered the parameters to std::clamp wrong lol

I just fixed all of the std::clamp and pushed out the fix to the nvidia-fix-and-vblank-debug-extra-experimental-v4.7 branch

Thanks for spotting that!

also let me know if there are still any issues with nvidia-fix-and-vblank-debug-extra-experimental-v4.7

jkozera commented 11 months ago

Now it seems better, because it runs for a bit longer, but it still crashes, on the same line. Maybe it's because of larger latency occurring? Because it crashes shortly after VBlankTimeInfo_t receive latency: 2.42ms while previous latencies were on the order of 0.01ms.

sharkautarch commented 11 months ago

Now it seems better, because it runs for a bit longer, but it still crashes, on the same line. Maybe it's because of larger latency occurring? Because it crashes shortly after VBlankTimeInfo_t receive latency: 2.42ms while previous latencies were on the order of 0.01ms.

ok I've just pushed out another fix, hopefully it at least doesn't crash on the same exact line in vblankmanager.cpp again...

jkozera commented 11 months ago

It doesn't crash anymore, but sadly it still hangs with "vblankmanager: write failed: Resource temporarily unavailable". The "receive latency" messages stop appearing, then the game hangs, then after a few (10-30) seconds the "vblankmanager: write failed: Resource temporarily unavailable" messages start appearing repeatedly in the terminal.

sharkautarch commented 11 months ago

It doesn't crash anymore, but sadly it still hangs with "vblankmanager: write failed: Resource temporarily unavailable". The "receive latency" messages stop appearing, then the game hangs, then after a few (10-30) seconds the "vblankmanager: write failed: Resource temporarily unavailable" messages start appearing repeatedly in the terminal.

What nvidia driver are you using? Some people have said that the problem goes away after switching from nvidia 545 to nvidia dkms 535. I myself am using the nvidia dkms 535 driver.

jkozera commented 11 months ago

I use 545. I haven't tried switching back to an older one yet, it's on my TODO list. :)

ageisen2000 commented 11 months ago

Replying to #948 (comment)

I just realized that I ordered the parameters to std::clamp wrong lol

I just fixed all of the std::clamp and pushed out the fix to the nvidia-fix-and-vblank-debug-extra-experimental-v4.7 branch

Thanks for spotting that!

also let me know if there are still any issues with nvidia-fix-and-vblank-debug-extra-experimental-v4.7

Can you tell me how to fix this error when building from your branch, @sharkautarch? I have a 3090 and would like to try this fix with the newest nvidia drivers: src/meson.build:95:18: ERROR: Include dir reshade/source does not exist.

I'm running this command:

meson setup -Dc_args="-Wno-uninitialized -Wno-maybe-uninitialized" -Dcpp_args="-Wno-uninitialized -Wno-maybe-uninitialized" --reconfigure build

sharkautarch commented 11 months ago

Replying to https://github.com/ValveSoftware/gamescope/issues/948#issuecomment-1842093063

I believe you just need to also run git submodule update --init --recursive and then run ninja -C build again

ageisen2000 commented 11 months ago

@sharkautarch, thank you!! The build worked. ~This is probably a dumb question.... but where do I get the binary that was built? How do run the gamescope command with it?~

Figured it out, missed the meson install -C build step

ageisen2000 commented 11 months ago

@sharkautarch, do you know if this patch would cause artifacting issues on Nvidia cards? I can't launch my game without a lot of artifacting / glitchy graphics.

sharkautarch commented 11 months ago

@sharkautarch, do you know if this patch would cause artifacting issues on Nvidia cards? I can't launch my game without a lot of artifacting / glitchy graphics.

For which? nvidia-fix or nvidia-fix-and-vblank-debug-extra-experimental-v4.7?

Well for either case, try my new branch: nvidia-fix-and-vblank-debug-extra-experimental-v4.8

ageisen2000 commented 11 months ago

I was using the 4.7 patch. I'll try 4.8 tonight and let you know how it goes

On Fri, Dec 8, 2023, 5:48 PM sharkautarch @.***> wrote:

@sharkautarch, do you know if this patch would cause artifacting issues on Nvidia cards? I can't launch my game without a lot of artifacting / glitchy graphics.

For which? nvidia-fix or nvidia-fix-and-vblank-debug-extra-experimental-v4.7?

Well for either case, try my new branch: nvidia-fix-and-vblank-debug-extra-experimental-v4.8


brain-anti-freeze commented 11 months ago

The vulkan beta driver released yesterday claims to "Fix Gamescope hang on startup [Linux]", but the current master still hangs for me (or gives no video output anyway, no errors from gamescope) in embedded mode. I didn't try nested.

ageisen2000 commented 11 months ago

@sharkautarch, the 4.8 branch seems to have crashed my computer when I ran it from a TTY. I couldn't switch TTYs and I didn't see any logs. Command I ran: gamescope -W 3440 -H 1440 --immediate-flips -O DP-1 -e -- steam

I'm considering just giving up on Nvidia support for this for now, but if you need help debugging, shoot me a message and maybe we could work together over Discord or something.

detiam commented 11 months ago

I'm on driver 545, and with branch nvidia-fix-and-vblank-debug-extra-experimental-v4.8 I can run gamescope in nested mode (under X) perfectly. In embedded mode (from a TTY) it works fine without the --adaptive-sync option; with that option gamescope freezes while the program keeps running, and my monitor reports a refresh rate of only 48 Hz. log.txt

EDIT: actually, with only --immediate-flips set, everything is fine on my end, but the log above has

drm: Immediate flips are not supported by the KMS driver

so I guess on my end --immediate-flips does nothing.
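One way to narrow down option-dependent behaviour like this would be to run each option combination for a bounded time with its own log. The sketch below is only an illustration: the helper names try_run and classify_status are made up here, it relies on the coreutils timeout command, and it assumes Linux, where a segfault shows up as exit status 139 (128 + SIGSEGV). Note that status 124 only means the command was still running at the deadline, which is true for a healthy session as well as a frozen one, so the screen and the log still need checking.

```shell
# classify_status CODE: map an exit status returned by `timeout` to a label.
classify_status() {
    case $1 in
        0)   echo "exited-ok" ;;
        124) echo "still-running-at-deadline" ;;  # timeout had to stop it
        139) echo "segfault" ;;                   # 128 + SIGSEGV (11) on Linux
        *)   echo "failed($1)" ;;
    esac
}

# try_run SECONDS LOGFILE CMD [ARGS...]
# Runs CMD for at most SECONDS, sends stdout and stderr to LOGFILE,
# and prints a label describing how the run ended.
try_run() {
    secs=$1; log=$2; shift 2
    timeout "$secs" "$@" >"$log" 2>&1
    classify_status $?
}

# Hypothetical usage, one log per option:
#   try_run 20 plain.log     gamescope -- vkcube
#   try_run 20 adaptive.log  gamescope --adaptive-sync -- vkcube
#   try_run 20 immediate.log gamescope --immediate-flips -- vkcube
```

A process that is truly wedged may ignore the SIGTERM that timeout sends; timeout's -k option can follow up with SIGKILL in that case.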

sharkautarch commented 11 months ago

I'm not surprised that gamescope is still crashing with certain options on nvidia. I might try compiling gamescope w/ clang sanitizers and vulkan validation and see if anything pops up when using --immediate-flips or --adaptive-sync

EDIT: does anything change if you add the environment variable ENABLE_GAMESCOPE_WSI=0 gamescope --immediate-flips ... ?

ageisen2000 commented 11 months ago

I'm not surprised that gamescope is still crashing with certain options on nvidia. I might try compiling gamescope w/ clang sanitizers and vulkan validation and see if anything pops up when using --immediate-flips or --adaptive-sync

EDIT: does anything change if you add the environment variable ENABLE_GAMESCOPE_WSI=0 gamescope --immediate-flips ... ?

When I set that environment variable it actually loads up.

It gets stuck on this glitchy screen though: PXL_20231210_011000420 MP

sharkautarch commented 11 months ago

maybe try running VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation ENABLE_GAMESCOPE_WSI=0 gamescope --immediate-flips ... and see if it prints anything else useful (you might need to add > some_file.out at the end so that you'll be able to see what it printed out)
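A small sketch of that idea, in case it helps: the variables are set for the one command only (prefix assignments, no export), and both stdout and stderr go to the log, since a plain > redirect captures stdout alone. The function name run_logged and the log file name are made up for illustration.

```shell
# run_logged LOGFILE CMD [ARGS...]
# Runs CMD with the validation layer requested and the gamescope WSI layer
# disabled for that one process, writing stdout and stderr to LOGFILE.
run_logged() {
    logfile=$1
    shift
    VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation \
    ENABLE_GAMESCOPE_WSI=0 \
        "$@" >"$logfile" 2>&1
}

# Hypothetical usage:
#   run_logged gamescope.log gamescope --immediate-flips -- vkcube
```

Because the assignments are prefixes on the command, they do not leak into the rest of the shell session.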

detiam commented 11 months ago

maybe try running VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation ENABLE_GAMESCOPE_WSI=0 gamescope --immediate-flips ... and see if it prints anything else useful (you might need to add > some_file.out at the end so that you'll be able to see what it printed out)

log.txt with command

VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation ENABLE_GAMESCOPE_WSI=0 gamescope --adaptive-sync -- vkcube > log.txt 2>&1

It first loads up and shows vkcube spinning for one second, then freezes, and the monitor reports a 48 Hz refresh rate. When I press Esc, vkcube exits and gamescope exits with it.

ageisen2000 commented 11 months ago

I ran

sudo setcap 'CAP_SYS_NICE=eip' /usr/local/bin/gamescope
VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation ENABLE_GAMESCOPE_WSI=0 gamescope -W 3440 -H 1440 -r 175 --immediate-flips -O DP-2 -- glxgears > glxgearsout.log

And got a bunch of garbage on the screen. I could see the app, but weirdly enough, even in TTY mode I saw a Steam window. I don't know what's going on there.

glxgearsout.log

sharkautarch commented 11 months ago

Replying to https://github.com/ValveSoftware/gamescope/issues/948#issuecomment-1848875180

I think there's also an issue with running gamescope w/ nvidia after setting 'CAP_SYS_NICE=eip' tho I think I might be able to find a workaround to that specifically

EDIT: I just found a workaround to the issue I've experienced with running gamescope w/ nvidia after setting 'CAP_SYS_NICE=eip' ... Since you haven't mentioned issues with 'CAP_SYS_NICE=eip', maybe it works fine for you since you are using the nvidia 545 driver, whereas I'm currently using the nvidia 535 dkms driver...
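To check whether the capability is a factor on a given setup, one could inspect it and temporarily clear it for an A/B comparison. This is only a sketch: the helper name has_cap_sys_nice is made up, it just looks for the capability name in a line of getcap output, and the path is the one from the setcap command earlier in the thread.

```shell
# has_cap_sys_nice GETCAP_LINE
# Succeeds if a line of `getcap` output mentions cap_sys_nice
# (matched case-insensitively, whatever the exact output layout is).
has_cap_sys_nice() {
    printf '%s\n' "$1" | tr 'A-Z' 'a-z' | grep -q 'cap_sys_nice'
}

# Hypothetical usage:
#   has_cap_sys_nice "$(getcap /usr/local/bin/gamescope)" && echo "CAP_SYS_NICE is set"
#   sudo setcap -r /usr/local/bin/gamescope                  # clear it for a test run
#   sudo setcap 'CAP_SYS_NICE=eip' /usr/local/bin/gamescope  # put it back afterwards
```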

For --immediate-flips, I'm guessing it won't be able to work: even if we could find a way to stop gamescope on nvidia from crashing when using it, it would probably just do nothing. Just gotta hope that it'll be patched in an nvidia driver update... fingers crossed

If you want to try to reduce latency, try running gamescope w/ MESA_VK_WSI_PRESENT_MODE=mailbox ENABLE_GAMESCOPE_WSI=0 gamescope ... or MESA_VK_WSI_PRESENT_MODE=mailbox ENABLE_GAMESCOPE_WSI=0 gamescope ... -- env MESA_VK_WSI_PRESENT_MODE=immediate ...
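The same suggestion as a small wrapper, purely as a sketch: the function name with_present_mode is made up, and the accepted values (fifo, relaxed, mailbox, immediate) are what I understand Mesa to document for MESA_VK_WSI_PRESENT_MODE, so treat that list as an assumption.

```shell
# with_present_mode MODE CMD [ARGS...]
# Runs CMD with MESA_VK_WSI_PRESENT_MODE=MODE and ENABLE_GAMESCOPE_WSI=0
# set for that one process; rejects modes outside the assumed list.
with_present_mode() {
    mode=$1
    shift
    case $mode in
        fifo|relaxed|mailbox|immediate) ;;
        *) echo "unknown present mode: $mode" >&2; return 2 ;;
    esac
    MESA_VK_WSI_PRESENT_MODE=$mode ENABLE_GAMESCOPE_WSI=0 "$@"
}

# Hypothetical usage, matching the two variants above:
#   with_present_mode mailbox gamescope ...
#   with_present_mode mailbox gamescope ... -- env MESA_VK_WSI_PRESENT_MODE=immediate ...
```

Rejecting unknown values up front just catches a typo in the variable before a long launch.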

unhappy-ending commented 11 months ago

Hello, just adding some feedback. I could run nvidia+gamescope with native Wayland apps just fine. The problem was that anything using Xwayland would give me a black screen. I installed the 535.43.20 vulkan beta driver @brain-anti-freeze posted about earlier and it fixed gamescope for me. I'm not using the @sharkautarch nvidia-fix patch, just vanilla gamescope from the Gentoo repo.