ValveSoftware / gamescope

SteamOS session compositing window manager

wlroots: SIGSEGV when special input happened in some games in embedded mode #1364

Open chenx-dust opened 3 weeks ago

chenx-dust commented 3 weeks ago

Checked Games with This Bug:

Behavior:

Backtrace:

#0  0x00005abd81ade906 in wlr_render_timeline_wait (flags=4, func=0x5abd81ade7f0 <surface_commit_handle_fence_available(void*)>, timeline=0x0, point=0, loop=0x5abd89961b80, data=0x5abd89ded7e0)
    at ../gamescope/subprojects/wlroots/render/timeline.c:191
#1  surface_handle_client_commit (listener=0x5abd89ded7b8, data=<optimized out>) at ../gamescope/subprojects/wlroots/types/wlr_linux_drm_syncobj_v1.c:254
#2  0x00007683efd6242e in wl_signal_emit_mutable (signal=<optimized out>, data=0x0) at ../wayland-1.23.0/src/wayland-server.c:2314
#3  0x00005abd81ade399 in surface_handle_commit (client=<optimized out>, resource=<optimized out>) at ../gamescope/subprojects/wlroots/types/wlr_compositor.c:581
#4  0x00007683ef752596 in ffi_call_unix64 () at ../src/x86/unix64.S:104
#5  0x00007683ef74f00e in ffi_call_int (cif=cif@entry=0x7ffe3bd3a110, fn=<optimized out>, rvalue=<optimized out>, avalue=<optimized out>, closure=closure@entry=0x0) at ../src/x86/ffi64.c:673
#6  0x00007683ef751bd3 in ffi_call (cif=cif@entry=0x7ffe3bd3a110, fn=<optimized out>, rvalue=rvalue@entry=0x0, avalue=avalue@entry=0x7ffe3bd3a1e0) at ../src/x86/ffi64.c:710
#7  0x00007683efd60e45 in wl_closure_invoke (closure=closure@entry=0x5abd89df0300, target=<optimized out>, target@entry=0x5abd89d77960, opcode=opcode@entry=6, data=<optimized out>,
    data@entry=0x5abd89d66950, flags=2) at ../wayland-1.23.0/src/connection.c:1228
#8  0x00007683efd65c42 in wl_client_connection_data (fd=<optimized out>, mask=<optimized out>, data=0x5abd89d66950) at ../wayland-1.23.0/src/wayland-server.c:444
#9  0x00007683efd640a2 in wl_event_loop_dispatch (loop=0x5abd89961b80, timeout=<optimized out>) at ../wayland-1.23.0/src/event-loop.c:1105
#10 0x00005abd819d911f in wlserver_run () at ../gamescope/src/wlserver.cpp:1939
#11 main (argc=<optimized out>, argv=0x7ffe3bd3a938) at ../gamescope/src/main.cpp:1076

Stack 0:

187 struct wl_event_source *wlr_render_timeline_wait(struct wlr_render_timeline *timeline,
188                 uint64_t point, uint32_t flags, struct wl_event_loop *loop,
189                 wlr_render_timeline_wait_func_t func, void *data) {
190         uint32_t signaled_point;
> 191       int ret = drmSyncobjTimelineWait(timeline->drm_fd, &timeline->handle, &point, 1, 0, flags, &signaled_point);
192         if (ret == 0) {
193                 func(data);
194                 return NULL;
195         } else if (ret != -ETIME) {
196                 wlr_log_errno(WLR_ERROR, "drmSyncobjWait() failed");
197                 return NULL;
198         }

Null pointer timeline:

(gdb) print timeline
$1 = (wlr_render_timeline *) 0x0

Stack 1:

194 static void surface_handle_client_commit(struct wl_listener *listener,
195                 void *data) {
196         struct wlr_linux_drm_syncobj_surface_v1 *surface =
197                 wl_container_of(listener, surface, client_commit);
...
> 254       wlr_render_timeline_wait(surface->pending.acquire_timeline, surface->pending.acquire_point,
255                 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE, loop,
256                 surface_commit_handle_fence_available, commit);
257 }

The surface's pending state is empty:

(gdb) print * surface
$3 = {resource = 0x5abd89d7a300, surface = 0x5abd89def2d0, pending = {acquire_timeline = 0x0, acquire_point = 0, release_timeline = 0x0, release_point = 0}, current = {acquire_timeline = 0x5abd89d2fa40, acquire_point = 1413, release_timeline = 0x5abd89d2d5f0, release_point = 1413}, addon = {impl = 0x5abd81bfbd90 <surface_addon_impl.lto_priv>, owner = 0x5abd89b63400, link = {prev = 0x5abd89def608, next = 0x5abd89d64ba0}}, synced = {surface = 0x5abd89def2d0, impl = 0x5abd81bfbda0 <surface_synced_impl.lto_priv>, link = {prev = 0x5abd89def660, next = 0x5abd89df0690}, index = 1}, client_commit = { link = {prev = 0x5abd89def598, next = 0x7ffe3bd39eb0}, notify = 0x5abd81ade820 <surface_handle_client_commit(wl_listener*, void*)>}}

THMonster commented 2 weeks ago

I have a very similar problem.

Setup: gamescope in embedded mode, AMD RDNA2/3 (680M and 7600), Mesa 24.1 and later (currently 24.1.1).

On a slightly older version of gamescope (about 3.14.17), games start normally, but with most of them gamescope crashes when you click "start game". On the latest git version, gamescope crashes immediately when you move the mouse pointer. If you use only a controller, it doesn't crash and you can play games normally.

Here is a crash log:


                                                           Stack trace of thread 22490:
                                                           #0  0x00006389f2ff3ed6 n/a (gamescope + 0x12ded6)
                                                           #1  0x000076024223c42e wl_signal_emit_mutable (libwayland-server.so.0 + 0x842e)
                                                           #2  0x00006389f2ff3969 n/a (gamescope + 0x12d969)
                                                           #3  0x0000760241d26596 n/a (libffi.so.8 + 0x7596)
                                                           #4  0x0000760241d2300e n/a (libffi.so.8 + 0x400e)
                                                           #5  0x0000760241d25bd3 ffi_call (libffi.so.8 + 0x6bd3)
                                                           #6  0x000076024223ae45 n/a (libwayland-server.so.0 + 0x6e45)
                                                           #7  0x000076024223fc42 n/a (libwayland-server.so.0 + 0xbc42)
                                                           #8  0x000076024223e0a2 wl_event_loop_dispatch (libwayland-server.so.0 + 0xa0a2)
                                                           #9  0x00006389f2eec3c0 n/a (gamescope + 0x263c0)
                                                           #10 0x000076024174ec88 n/a (libc.so.6 + 0x25c88)
                                                           #11 0x000076024174ed4c __libc_start_main (libc.so.6 + 0x25d4c)
                                                           #12 0x00006389f2f016f5 n/a (gamescope + 0x3b6f5)

                                                           Stack trace of thread 22494:
                                                           #0  0x000076024183f4e2 epoll_wait (libc.so.6 + 0x1164e2)
                                                           #1  0x00006389f301a336 n/a (gamescope + 0x154336)
                                                           #2  0x00006389f2f2f25c n/a (gamescope + 0x6925c)
                                                           #3  0x0000760241ae0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
                                                           #4  0x00007602417bbded n/a (libc.so.6 + 0x92ded)
                                                           #5  0x000076024183f0dc n/a (libc.so.6 + 0x1160dc)

                                                           Stack trace of thread 22588:
                                                           #0  0x000076024183a9ed ioctl (libc.so.6 + 0x1119ed)
                                                           #1  0x000076024224f4c1 drmIoctl (libdrm.so.2 + 0x74c1)
                                                           #2  0x0000760242253f7b drmSyncobjTimelineWait (libdrm.so.2 + 0xbf7b)
                                                           #3  0x0000760235de31bb n/a (libvulkan_radeon.so + 0x1e31bb)
                                                           #4  0x0000760235ddada5 n/a (libvulkan_radeon.so + 0x1dada5)
                                                           #5  0x0000760235dda91c n/a (libvulkan_radeon.so + 0x1da91c)
                                                           #6  0x00006389f2f3ca4e n/a (gamescope + 0x76a4e)
                                                           #7  0x00006389f2f624aa n/a (gamescope + 0x9c4aa)
                                                           #8  0x00006389f2f18598 n/a (gamescope + 0x52598)
                                                           #9  0x00006389f2f2c0d8 n/a (gamescope + 0x660d8)
                                                           #10 0x00006389f2f2e360 n/a (gamescope + 0x68360)
                                                           #11 0x0000760241ae0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
                                                           #12 0x00007602417bbded n/a (libc.so.6 + 0x92ded)
                                                           #13 0x000076024183f0dc n/a (libc.so.6 + 0x1160dc)

                                                           Stack trace of thread 22493:
                                                           #0  0x000076024183f4e2 epoll_wait (libc.so.6 + 0x1164e2)
                                                           #1  0x00006389f301a336 n/a (gamescope + 0x154336)
                                                           #2  0x00006389f2f36278 n/a (gamescope + 0x70278)
                                                           #3  0x0000760241ae0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
                                                           #4  0x00007602417bbded n/a (libc.so.6 + 0x92ded)
                                                           #5  0x000076024183f0dc n/a (libc.so.6 + 0x1160dc)

                                                           Stack trace of thread 22589:
                                                           #0  0x00007602418310b5 __open64 (libc.so.6 + 0x1080b5)
                                                           #1  0x00006389f2f0d947 n/a (gamescope + 0x47947)
                                                           #2  0x0000760241ae0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
                                                           #3  0x00007602417bbded n/a (libc.so.6 + 0x92ded)
                                                           #4  0x000076024183f0dc n/a (libc.so.6 + 0x1160dc)

                                                           Stack trace of thread 22498:
                                                           #0  0x0000760241807f43 clock_nanosleep (libc.so.6 + 0xdef43)
                                                           #1  0x0000760241813d77 __nanosleep (libc.so.6 + 0xead77)
                                                           #2  0x0000760235d16b72 n/a (libvulkan_radeon.so + 0x116b72)
                                                           #3  0x0000760235e93b9d n/a (libvulkan_radeon.so + 0x293b9d)
                                                           #4  0x00007602417bbded n/a (libc.so.6 + 0x92ded)
                                                           #5  0x000076024183f0dc n/a (libc.so.6 + 0x1160dc)

                                                           Stack trace of thread 22529:
                                                           #0  0x000076024183139d __poll (libc.so.6 + 0x10839d)
                                                           #1  0x00006389f2f59189 n/a (gamescope + 0x93189)
                                                           #2  0x0000760241ae0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
                                                           #3  0x00007602417bbded n/a (libc.so.6 + 0x92ded)
                                                           #4  0x000076024183f0dc n/a (libc.so.6 + 0x1160dc)

                                                           Stack trace of thread 22585:
                                                           #0  0x000076024183f4e2 epoll_wait (libc.so.6 + 0x1164e2)
                                                           #1  0x000076023703a197 n/a (libspa-support.so + 0x15197)
                                                           #2  0x000076023702ba21 n/a (libspa-support.so + 0x6a21)
                                                           #3  0x0000760241e21103 n/a (libpipewire-0.3.so.0 + 0x1c103)
                                                           #4  0x00007602417bbded n/a (libc.so.6 + 0x92ded)
                                                           #5  0x000076024183f0dc n/a (libc.so.6 + 0x1160dc)

                                                           Stack trace of thread 22497:
                                                           #0  0x00007602417b84e9 n/a (libc.so.6 + 0x8f4e9)
                                                           #1  0x00007602417baed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
                                                           #2  0x0000760235e93c6e n/a (libvulkan_radeon.so + 0x293c6e)
                                                           #3  0x0000760235e6cc1c n/a (libvulkan_radeon.so + 0x26cc1c)
                                                           #4  0x0000760235e93b9d n/a (libvulkan_radeon.so + 0x293b9d)
                                                           #5  0x00007602417bbded n/a (libc.so.6 + 0x92ded)
                                                           #6  0x000076024183f0dc n/a (libc.so.6 + 0x1160dc)

                                                           Stack trace of thread 22587:
                                                           #0  0x000076024183139d __poll (libc.so.6 + 0x10839d)
                                                           #1  0x00006389f2f6bed3 n/a (gamescope + 0xa5ed3)
                                                           #2  0x0000760241ae0c84 execute_native_thread_routine (libstdc++.so.6 + 0xe0c84)
                                                           #3  0x00007602417bbded n/a (libc.so.6 + 0x92ded)
                                                           #4  0x000076024183f0dc n/a (libc.so.6 + 0x1160dc)

                                                           Stack trace of thread 22510:
                                                           #0  0x00007602417b84e9 n/a (libc.so.6 + 0x8f4e9)
                                                           #1  0x00007602417baed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
                                                           #2  0x0000760235e93c6e n/a (libvulkan_radeon.so + 0x293c6e)
                                                           #3  0x0000760235e6cc1c n/a (libvulkan_radeon.so + 0x26cc1c)
                                                           #4  0x0000760235e93b9d n/a (libvulkan_radeon.so + 0x293b9d)
                                                           #5  0x00007602417bbded n/a (libc.so.6 + 0x92ded)
                                                           #6  0x000076024183f0dc n/a (libc.so.6 + 0x1160dc)

                                                           Stack trace of thread 22513:
                                                           #0  0x00007602417b84e9 n/a (libc.so.6 + 0x8f4e9)
                                                           #1  0x00007602417baed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
                                                           #2  0x0000760235e93c6e n/a (libvulkan_radeon.so + 0x293c6e)
                                                           #3  0x0000760235e6cc1c n/a (libvulkan_radeon.so + 0x26cc1c)
                                                           #4  0x0000760235e93b9d n/a (libvulkan_radeon.so + 0x293b9d)
                                                           #5  0x00007602417bbded n/a (libc.so.6 + 0x92ded)
                                                           #6  0x000076024183f0dc n/a (libc.so.6 + 0x1160dc)

                                                           Stack trace of thread 22518:
                                                           #0  0x00007602417b84e9 n/a (libc.so.6 + 0x8f4e9)
                                                           #1  0x00007602417baed9 pthread_cond_wait (libc.so.6 + 0x91ed9)
                                                           #2  0x0000760235e93c6e n/a (libvulkan_radeon.so + 0x293c6e)
                                                           #3  0x0000760235e6cc1c n/a (libvulkan_radeon.so + 0x26cc1c)
                                                           #4  0x0000760235e93b9d n/a (libvulkan_radeon.so + 0x293b9d)
                                                           #5  0x00007602417bbded n/a (libc.so.6 + 0x92ded)
                                                           #6  0x000076024183f0dc n/a (libc.so.6 + 0x1160dc)
                                                           ELF object binary architecture: AMD x86-64

Joshua-Ashton commented 2 weeks ago

@emersion Any idea why this code is being called? We don't want to use it, right? We already handle this ourselves.

emersion commented 2 weeks ago

This is already fixed in the latest iteration of the wlroots MR I believe, but gamescope seems to use an earlier version.

> Any idea why this code is being called? We don't want to use it, right? We already handle this ourselves.

Yeah. It doesn't matter at the moment since clients send already-materialized fences but we should fix it.

gordon-quad commented 2 weeks ago

Any suggestion for a workaround for now?

chenx-dust commented 2 weeks ago

> Any suggestion for a workaround for now?

Roll back to 3.14.2, or to another version that uses the old wlroots.

gordon-quad commented 2 weeks ago

3.14.2 segfaults too, but differently: see #1202.

emersion commented 2 weeks ago

wlroots has merged the linux-drm-syncobj-v1 implementation. It would be nice to upgrade gamescope to use that instead of the old WIP patches.

sharkautarch commented 1 week ago

I encountered a similar but slightly different SIGSEGV w/ latest gamescope when running qemu gtk w/ zink on and gamescope WSI disabled. I'm currently debugging it w/ valgrind:

 Process terminating with default action of signal 11 (SIGSEGV): dumping core
==00:00:14:34.338 476165==  Access not within mapped region at address 0x0
==00:00:14:34.338 476165==    at 0x2340BE: surface_handle_client_commit (in /usr/bin/gamescope)
==00:00:14:34.338 476165==    by 0x4A4E01D: wl_signal_emit_mutable (in /usr/lib/libwayland-server.so.0.22.0)
==00:00:14:34.338 476165==    by 0x2346E1: surface_handle_commit (in /usr/bin/gamescope)
==00:00:14:34.338 476165==    by 0x55BD595: ffi_call_unix64 (unix64.S:104)
==00:00:14:34.338 476165==    by 0x55BA00D: ffi_call_int.lto_priv.0 (ffi64.c:673)
==00:00:14:34.338 476165==    by 0x55BCBD2: ffi_call (ffi64.c:710)
==00:00:14:34.338 476165==    by 0x4A4CAD9: ??? (in /usr/lib/libwayland-server.so.0.22.0)
==00:00:14:34.338 476165==    by 0x4A5117F: ??? (in /usr/lib/libwayland-server.so.0.22.0)
==00:00:14:34.338 476165==    by 0x4A4FAE1: wl_event_loop_dispatch (in /usr/lib/libwayland-server.so.0.22.0)
==00:00:14:34.338 476165==    by 0x13EA17: main (in /usr/bin/gamescope)

EDIT: in another run (same exact arguments and whatnot) I got the exact same backtrace as the one others found in this issue:

==00:00:25:15.180 520291== Invalid read of size 4
==00:00:25:15.180 520291==    at 0x25B89C: wlr_render_timeline_wait (timeline.c:191)
==00:00:25:15.180 520291==    by 0x4A4E01D: wl_signal_emit_mutable (in /usr/lib/libwayland-server.so.0.22.0)
==00:00:25:15.180 520291==    by 0x2761C1: surface_handle_commit (wlr_compositor.c:581)
==00:00:25:15.180 520291==    by 0x54C5595: ffi_call_unix64 (unix64.S:104)
==00:00:25:15.180 520291==    by 0x54C200D: ffi_call_int.lto_priv.0 (ffi64.c:673)
==00:00:25:15.180 520291==    by 0x54C4BD2: ffi_call (ffi64.c:710)
==00:00:25:15.180 520291==    by 0x4A4CAD9: ??? (in /usr/lib/libwayland-server.so.0.22.0)
==00:00:25:15.180 520291==    by 0x4A5117F: ??? (in /usr/lib/libwayland-server.so.0.22.0)
==00:00:25:15.180 520291==    by 0x4A4FAE1: wl_event_loop_dispatch (in /usr/lib/libwayland-server.so.0.22.0)
==00:00:25:15.180 520291==    by 0x1A3CD8: wlserver_run() (wlserver.cpp:1943)
==00:00:25:15.180 520291==    by 0x13F1EE: main (main.cpp:1047)
==00:00:25:15.180 520291==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

will debug this soon

sharkautarch commented 1 week ago

@emersion When you said

> This is already fixed in the latest iteration of the wlroots MR I believe, but gamescope seems to use an earlier version

Was this issue fixed as of commit: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/90/diffs?commit_id=ae9ed7ac14e02021bc9f200aff19ff1601a1ae61

Or was it a commit after that one?

2nd UPDATE: I tried to recompile gamescope w/ the latest upstream gitlab version of wayland-protocols, but still encountered the same issue. Does the wlroots version used by gamescope need to be updated as well?

3rd UPDATE: so I decided to try to update the version of wlroots being used w/ gamescope to the latest version... it broke some stuff, but somehow I've managed to fix the incompatibilities w/ the newer wlroots version... (at least to where I can successfully run glxgears)

I'll have to see whether the issue I encountered when running qemu gtk + zink on top of gamescope still occurs w/ latest wlroots, but later, because it was a bit of a pain to migrate over to the newer wlroots...