getsolus / packages

Solus Package Monorepo & Issue Tracker

Vulkan app segfaults/coredumps with NVIDIA GPU on GNOME Wayland #3687

Open Staudey opened 3 weeks ago

Staudey commented 3 weeks ago


Summary

I've recently started experiencing somewhat random segfaults with Vulkan apps in GNOME Wayland sessions on my NVIDIA GPU. When I updated egl-wayland to 1.1.15, everything worked correctly; then, a week or so later, vkcube (and other Vulkan apps) suddenly no longer launched. Rolling back the updates to the previously working state didn't solve the issue. Then it randomly started working again for a few sessions, after which it started segfaulting again. It happens with every driver package I've tried (I think I have tested them all, but will do so again to make sure). OpenGL apps seem to work fine (e.g. the game CoreKeeper doesn't launch in Vulkan mode, but launches fine with OpenGL). The free Steam game "b" shows the error below on launch, then exits. An X11 GNOME (or Budgie) session doesn't show any of these issues.

vkcube coredump

```
PID: 4573 (vkcube)
UID: 1000 (thomas)
GID: 1000 (thomas)
Signal: 11 (SEGV)
Timestamp: Thu 2024-08-29 11:55:28 CEST (19min ago)
Command Line: vkcube
Executable: /usr/bin/vkcube
Control Group: /user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-219b802c-1b07-4c27-a4b0-29a53d8b3b82.scope
Unit: user@1000.service
User Unit: vte-spawn-219b802c-1b07-4c27-a4b0-29a53d8b3b82.scope
Slice: user-1000.slice
Owner UID: 1000 (thomas)
Boot ID: abc89fef2d8d458da7a3ef655b273c61
Machine ID: 691497a94d724f0689192cbb5206dda8
Hostname: solus-pc
Storage: /var/lib/systemd/coredump/core.vkcube.1000.abc89fef2d8d458da7a3ef655b273c61.4573.1724925328000000.zst (present)
Size on Disk: 2.1M
Message: Process 4573 (vkcube) of user 1000 dumped core.

Stack trace of thread 4573:
#0 0x00007f3993541f03 n/a (libnvidia-glcore.so.560.35.03 + 0xd41f03)
#1 0x000056248efe1ad9 demo_prepare_buffers (vkcube + 0x7ad9)
#2 0x000056248efdd8f3 demo_prepare (vkcube + 0x38f3)
#3 0x00007f399f9eb2cc __libc_start_call_main (libc.so.6 + 0x2a2cc)
#4 0x00007f399f9eb389 __libc_start_main_impl (libc.so.6 + 0x2a389)
#5 0x000056248efdef65 _start (vkcube + 0x4f65)

Stack trace of thread 4576:
#0 0x00007f399fa59c5f __futex_abstimed_wait_common64 (libc.so.6 + 0x98c5f)
#1 0x00007f399fa5cc35 __pthread_cond_wait_common (libc.so.6 + 0x9bc35)
#2 0x00007f39931fc8dc n/a (libnvidia-glcore.so.560.35.03 + 0x9fc8dc)
#3 0x00007f39936308e1 n/a (libnvidia-glcore.so.560.35.03 + 0xe308e1)
#4 0x00007f39931fee04 n/a (libnvidia-glcore.so.560.35.03 + 0x9fee04)
#5 0x00007f399fa5d9aa start_thread (libc.so.6 + 0x9c9aa)
#6 0x00007f399fad940c __clone3 (libc.so.6 + 0x11840c)

Stack trace of thread 4578:
#0 0x00007f399fa59c5f __futex_abstimed_wait_common64 (libc.so.6 + 0x98c5f)
#1 0x00007f399fa5c7ae __pthread_cond_wait_common (libc.so.6 + 0x9b7ae)
#2 0x00007f39931fc87c n/a (libnvidia-glcore.so.560.35.03 + 0x9fc87c)
#3 0x00007f399362af55 n/a (libnvidia-glcore.so.560.35.03 + 0xe2af55)
#4 0x00007f39931fee04 n/a (libnvidia-glcore.so.560.35.03 + 0x9fee04)
#5 0x00007f399fa5d9aa start_thread (libc.so.6 + 0x9c9aa)
#6 0x00007f399fad940c __clone3 (libc.so.6 + 0x11840c)

Stack trace of thread 4582:
#0 0x00007f399fa59c5f __futex_abstimed_wait_common64 (libc.so.6 + 0x98c5f)
#1 0x00007f399fa5cc35 __pthread_cond_wait_common (libc.so.6 + 0x9bc35)
#2 0x00007f39931fc8dc n/a (libnvidia-glcore.so.560.35.03 + 0x9fc8dc)
#3 0x00007f399373188c n/a (libnvidia-glcore.so.560.35.03 + 0xf3188c)
#4 0x00007f399371e9b6 n/a (libnvidia-glcore.so.560.35.03 + 0xf1e9b6)
#5 0x00007f39931fee04 n/a (libnvidia-glcore.so.560.35.03 + 0x9fee04)
#6 0x00007f399fa5d9aa start_thread (libc.so.6 + 0x9c9aa)
#7 0x00007f399fad940c __clone3 (libc.so.6 + 0x11840c)

Stack trace of thread 4580:
#0 0x00007f399fa59c5f __futex_abstimed_wait_common64 (libc.so.6 + 0x98c5f)
#1 0x00007f399fa5cc35 __pthread_cond_wait_common (libc.so.6 + 0x9bc35)
#2 0x00007f39931fc8dc n/a (libnvidia-glcore.so.560.35.03 + 0x9fc8dc)
#3 0x00007f399361656d n/a (libnvidia-glcore.so.560.35.03 + 0xe1656d)
#4 0x00007f39931fee04 n/a (libnvidia-glcore.so.560.35.03 + 0x9fee04)
#5 0x00007f399fa5d9aa start_thread (libc.so.6 + 0x9c9aa)
#6 0x00007f399fad940c __clone3 (libc.so.6 + 0x11840c)

Stack trace of thread 4577:
#0 0x00007f399fa59c5f __futex_abstimed_wait_common64 (libc.so.6 + 0x98c5f)
#1 0x00007f399fa5cc35 __pthread_cond_wait_common (libc.so.6 + 0x9bc35)
#2 0x00007f39931fc8dc n/a (libnvidia-glcore.so.560.35.03 + 0x9fc8dc)
#3 0x00007f39936402c7 n/a (libnvidia-glcore.so.560.35.03 + 0xe402c7)
#4 0x00007f39931fee04 n/a (libnvidia-glcore.so.560.35.03 + 0x9fee04)
#5 0x00007f399fa5d9aa start_thread (libc.so.6 + 0x9c9aa)
#6 0x00007f399fad940c __clone3 (libc.so.6 + 0x11840c)
ELF object binary architecture: AMD x86-64
```

Core(Dump)Keeper Vulkan coredump

```
PID: 10218 (CoreKeeper)
UID: 1000 (thomas)
GID: 1000 (thomas)
Signal: 11 (SEGV)
Timestamp: Thu 2024-08-29 12:03:35 CEST (22min ago)
Command Line: $'/home/thomas/.local/share/Steam/steamapps/common/Core Keeper/CoreKeeper'
Executable: /home/thomas/.local/share/Steam/steamapps/common/Core Keeper/CoreKeeper
Control Group: /user.slice/user-1000.slice/user@1000.service/app.slice/app-gnome-steam-9027.scope
Unit: user@1000.service
User Unit: app-gnome-steam-9027.scope
Slice: user-1000.slice
Owner UID: 1000 (thomas)
Boot ID: abc89fef2d8d458da7a3ef655b273c61
Machine ID: 691497a94d724f0689192cbb5206dda8
Hostname: solus-pc
Storage: /var/lib/systemd/coredump/core.CoreKeeper.1000.abc89fef2d8d458da7a3ef655b273c61.10218.1724925815000000.zst (present)
Size on Disk: 48.2M
Message: Process 10218 (CoreKeeper) of user 1000 dumped core.

Module /home/thomas/.local/share/Steam/steamapps/common/Core Keeper/CoreKeeper_Data/MonoBleedingEdge/x86_64/libmono-native.so without build-id.
Module /home/thomas/.local/share/Steam/steamapps/common/Core Keeper/CoreKeeper_Data/MonoBleedingEdge/x86_64/libmono-native.so
Module /home/thomas/.local/share/Steam/steamapps/common/Core Keeper/CoreKeeper_Data/MonoBleedingEdge/x86_64/libmonobdwgc-2.0.so without build-id.
Module /home/thomas/.local/share/Steam/steamapps/common/Core Keeper/CoreKeeper_Data/MonoBleedingEdge/x86_64/libmonobdwgc-2.0.so
Stack trace of thread 10218:
#0 0x00007fec16941f03 n/a (libnvidia-glcore.so.560.35.03 + 0xd41f03)
#1 0x00007fec59e7de1a n/a (/home/thomas/.local/share/Steam/ubuntu12_64/steamoverlayvulkanlayer.so + 0x27e1a)
ELF object binary architecture: AMD x86-64
```
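
Both dumps show frame #0 of the main thread inside libnvidia-glcore.so.560.35.03 at the same offset (0xd41f03). For comparing several such reports, a small Python sketch like the following could pull out the top frame of every thread. The function name `top_frames` and the regular expressions are illustrative assumptions based only on the frame layout visible in the dumps above, not part of coredumpctl or any other tool.

```python
import re

# Assumed frame layout, taken from the dumps above, e.g.
#   #0 0x00007f3993541f03 n/a (libnvidia-glcore.so.560.35.03 + 0xd41f03)
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+0x(?P<addr>[0-9a-f]+)\s+(?P<symbol>\S+)\s+"
    r"\((?P<module>.+?) \+ 0x(?P<offset>[0-9a-f]+)\)"
)
THREAD_RE = re.compile(r"Stack trace of thread (?P<tid>\d+):")


def top_frames(report: str) -> dict[int, tuple[str, str]]:
    """Return {thread id: (symbol, module)} for frame #0 of each thread."""
    result = {}
    # Splitting on a pattern with one capturing group yields
    # [preamble, tid1, body1, tid2, body2, ...].
    parts = THREAD_RE.split(report)
    for tid, body in zip(parts[1::2], parts[2::2]):
        match = FRAME_RE.search(body)
        if match and match.group("index") == "0":
            result[int(tid)] = (match.group("symbol"), match.group("module"))
    return result


# Shortened excerpt of the vkcube dump above, used as sample input.
SAMPLE = """\
Stack trace of thread 4573:
#0 0x00007f3993541f03 n/a (libnvidia-glcore.so.560.35.03 + 0xd41f03)
#1 0x000056248efe1ad9 demo_prepare_buffers (vkcube + 0x7ad9)
Stack trace of thread 4576:
#0 0x00007f399fa59c5f __futex_abstimed_wait_common64 (libc.so.6 + 0x98c5f)
"""

for tid, (symbol, module) in top_frames(SAMPLE).items():
    print(tid, symbol, module)
```

Threads whose top frame is a futex wait in libc.so.6 are merely parked; the thread whose top frame sits in the driver library is the one worth comparing across reports.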

Steps to reproduce

  1. Have an NVIDIA GPU
  2. Launch a GNOME Wayland session
  3. Run vkcube

Expected result

vkcube launches and shows a spinning cube

Actual result

vkcube immediately exits with a segfault
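
To tell a segfault apart from an ordinary non-zero exit when repeating the steps above, a minimal Python sketch could launch vkcube and decode its exit status. The helper names `describe_exit` and `run_vkcube` and the five-second timeout are assumptions made for illustration; `XDG_SESSION_TYPE` is the standard session variable that typically reads `wayland` or `x11`.

```python
import os
import signal
import subprocess


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable verdict."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal as a negative number.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def run_vkcube(timeout_s: float = 5.0) -> str:
    """Run vkcube briefly and report how it ended (hypothetical helper)."""
    session = os.environ.get("XDG_SESSION_TYPE", "unknown")
    try:
        proc = subprocess.run(["vkcube"], timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Still alive after the timeout: it did not crash on startup.
        return f"[{session}] vkcube still running after {timeout_s}s"
    except FileNotFoundError:
        return f"[{session}] vkcube is not installed or not in PATH"
    return f"[{session}] vkcube {describe_exit(proc.returncode)}"
```

Calling `print(run_vkcube())` from a terminal inside the session under test would be expected to report `killed by SIGSEGV` in the failing case, matching the `Signal: 11 (SEGV)` line in the coredumps.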

Environment

Repo

Unstable

Desktop Environment

GNOME

System details

System:
  Host: solus-pc Kernel: 6.10.6-300.current arch: x86_64 bits: 64
  Desktop: Budgie v: 10.9.2 Distro: Solus 4.5 resilience
Machine:
  Type: Desktop Mobo: MSI model: Z170A PC MATE (MS-7971) v: 2.0
    serial: <superuser required> UEFI-[Legacy]: American Megatrends v: A.60
    date: 12/17/2015
CPU:
  Info: quad core Intel Core i7-6700K [MT MCP] speed (MHz): avg: 800
    min/max: 800/4200
Graphics:
  Device-1: NVIDIA GP106 [GeForce GTX 1060 6GB] driver: nvidia v: 560.35.03
  Display: x11 server: X.Org v: 21.1.13 with: Xwayland v: 24.1.2 driver: X:
    loaded: nvidia gpu: nvidia,nvidia-nvswitch resolution: 1920x1080~60Hz
  API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: nvidia mesa v: 560.35.03
    renderer: NVIDIA GeForce GTX 1060 6GB/PCIe/SSE2
Network:
  Device-1: Realtek RTL8192CE PCIe Wireless Network Adapter driver: rtl8192ce
  Device-2: Realtek RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet
    driver: r8169
Drives:
  Local Storage: total: 1.82 TiB used: 830.3 GiB (44.6%)
Info:
  Memory: total: 32 GiB available: 31.31 GiB used: 3.67 GiB (11.7%)
  Processes: 293 Uptime: 35m Shell: Zsh inxi: 3.3.35

Other comments

No response

joebonrichie commented 3 weeks ago

After rebooting, log in to an Xorg session first, then log out and log in to the Wayland session; then it works...

Yes, I know

Staudey commented 3 weeks ago

I completely forgot you talked about such an issue before until I read the workaround steps 😅 Gonna try this as soon as I get back to my PC, thanks!

This would explain why it sometimes randomly worked.

Staudey commented 3 weeks ago

Yeah, that does the trick.