WayfireWM / wayfire

A modular and extensible wayland compositor
https://wayfire.org/
MIT License

Power-cycling monitor causes SDL2 programs to crash #2384

Closed vanfanel closed 4 months ago

vanfanel commented 5 months ago

Describe the bug

When an SDL2 program is running and the monitor is turned off, then on again, the program always crashes. This happens with any SDL2 program, windowed or fullscreen. RetroArch doesn't seem to be affected at all.

SDL2 programs survive monitor power-cycling without any problems on other WLRoots-based compositors like Sway.

To Reproduce

Steps to reproduce the behavior:

  1. Run an SDL2 program, be it fullscreen or windowed.
  2. Turn off monitor.
  3. Turn on monitor and see how it has crashed.

Expected behavior

Turning the monitor off, then on again, should not crash SDL2 programs.

Screenshots or stacktrace

The SDL2 programs DO survive monitor being turned off: what crashes them is turning the monitor back on. Here's what appears when I launch an example SDL2 program (SDLPop):

DD 19-06-24 11:23:45.947 - [src/core/idle.cpp:20] creating idle inhibitor 0x5d21feb1d6d0 previous count: 0
II 19-06-24 11:23:45.947 - [src/view/xdg-shell/xdg-toplevel-view.cpp:29] new xdg_shell_stable surface: Prince of Persia (SDLPoP) v1.23 app-id: prince.exe                                                                                                                     
DD 19-06-24 11:23:45.966 - [src/core/idle.cpp:20] creating idle inhibitor 0x5d21fee34a68 previous count: 1
DD 19-06-24 11:23:45.966 - [src/output/promotion-manager.hpp:107] autohide panels

When the monitor is turned off, there are no messages on the console. The program is still running while the monitor is off.

And this is what appears on the console from which Wayfire is run, when monitor is turned on again (after being turned off):


II 19-06-24 11:24:29.366 - [backend/drm/drm.c:1549] Scanning DRM connector 185 on /dev/dri/card1
II 19-06-24 11:24:29.373 - [backend/drm/drm.c:1636] 'HDMI-A-1' disconnected
II 19-06-24 11:24:29.373 - [src/core/output-layout.cpp:1212] remove output: HDMI-A-1
II 19-06-24 11:24:29.373 - [src/core/output-layout.cpp:1125] new output: NOOP-1
II 19-06-24 11:24:29.373 - [src/core/output-layout.cpp:460] loaded mode auto
II 19-06-24 11:24:29.373 - [src/core/output-layout.cpp:634] Couldn't find matching mode 1280x720@0 for output NOOP-1. Trying to use custom mode(might not work)                                                                                                               
DD 19-06-24 11:24:29.374 - [src/core/seat/input-manager.cpp:98] Mapping input Power Button to output null.
EE 19-06-24 11:24:29.374 - [types/wlr_cursor.c:1174] Cannot map device "Power Button" to output (not found in this cursor)
DD 19-06-24 11:24:29.374 - [src/core/seat/input-manager.cpp:98] Mapping input Video Bus to output null.
EE 19-06-24 11:24:29.374 - [types/wlr_cursor.c:1174] Cannot map device "Video Bus" to output (not found in this cursor)
DD 19-06-24 11:24:29.374 - [src/core/seat/input-manager.cpp:98] Mapping input Power Button to output null.
EE 19-06-24 11:24:29.374 - [types/wlr_cursor.c:1174] Cannot map device "Power Button" to output (not found in this cursor)
DD 19-06-24 11:24:29.374 - [src/core/seat/input-manager.cpp:98] Mapping input Sleep Button to output null.
EE 19-06-24 11:24:29.374 - [types/wlr_cursor.c:1174] Cannot map device "Sleep Button" to output (not found in this cursor)
DD 19-06-24 11:24:29.374 - [src/core/seat/input-manager.cpp:98] Mapping input 8BitDo 8BitDo Retro Keyboard Receiver to output null.
DD 19-06-24 11:24:29.374 - [src/core/seat/input-manager.cpp:98] Mapping input 8BitDo 8BitDo Retro Keyboard Receiver Keyboard to output null.
EE 19-06-24 11:24:29.374 - [types/wlr_cursor.c:1174] Cannot map device "8BitDo 8BitDo Retro Keyboard Receiver Keyboard" to output (not found in this cursor)                                                                                                                  
DD 19-06-24 11:24:29.374 - [src/core/seat/input-manager.cpp:98] Mapping input 8BitDo 8BitDo Retro Keyboard Receiver Keyboard to output null.
DD 19-06-24 11:24:29.374 - [src/core/seat/input-manager.cpp:98] Mapping input MOSART Semi. 2.4G Keyboard Mouse to output null.
EE 19-06-24 11:24:29.374 - [types/wlr_cursor.c:1174] Cannot map device "MOSART Semi. 2.4G Keyboard Mouse" to output (not found in this cursor)                                                                                                                                
DD 19-06-24 11:24:29.374 - [src/core/seat/input-manager.cpp:98] Mapping input MOSART Semi. 2.4G Keyboard Mouse to output null.
DD 19-06-24 11:24:29.374 - [src/core/seat/input-manager.cpp:98] Mapping input MOSART Semi. 2.4G Keyboard Mouse Consumer Control to output null.
EE 19-06-24 11:24:29.374 - [types/wlr_cursor.c:1174] Cannot map device "MOSART Semi. 2.4G Keyboard Mouse Consumer Control" to output (not found in this cursor)                                                                                                               
DD 19-06-24 11:24:29.374 - [src/core/seat/input-manager.cpp:98] Mapping input MOSART Semi. 2.4G Keyboard Mouse Consumer Control to output null.
DD 19-06-24 11:24:29.374 - [src/core/seat/input-manager.cpp:98] Mapping input MOSART Semi. 2.4G Keyboard Mouse System Control to output null.
EE 19-06-24 11:24:29.374 - [types/wlr_cursor.c:1174] Cannot map device "MOSART Semi. 2.4G Keyboard Mouse System Control" to output (not found in this cursor)                                                                                                                 
EE 19-06-24 11:24:29.375 - [src/core/output-layout.cpp:533] disabling output: HDMI-A-1
DD 19-06-24 11:24:29.375 - [src/core/idle.cpp:27] destroying idle inhibitor 0x5d21fee34a68 previous count: 2
II 19-06-24 11:24:29.376 - [src/core/output-layout.cpp:163] transfer views from HDMI-A-1 -> NOOP-1
DD 19-06-24 11:24:29.376 - [src/output/promotion-manager.hpp:123] restore panels
DD 19-06-24 11:24:29.376 - [src/core/idle.cpp:20] creating idle inhibitor 0x5d21ff2b5bf8 previous count: 1
DD 19-06-24 11:24:29.376 - [src/output/promotion-manager.hpp:107] autohide panels
DD 19-06-24 11:24:29.379 - [src/core/idle.cpp:27] destroying idle inhibitor 0x5d21ff2b5bf8 previous count: 2
DD 19-06-24 11:24:29.379 - [src/output/promotion-manager.hpp:123] restore panels
II 19-06-24 11:24:29.379 - [backend/drm/drm.c:790] connector HDMI-A-1: Turning off
II 19-06-24 11:24:29.583 - [backend/drm/drm.c:1549] Scanning DRM connector 185 on /dev/dri/card1
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1628] 'HDMI-A-1' connected
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1436] Detected modes:
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1920x1080 @ 60.000 Hz (preferred)
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1920x1080 @ 143.766 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1920x1080 @ 120.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1920x1080 @ 119.880 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1920x1080 @ 100.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1920x1080 @ 60.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1920x1080 @ 59.940 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1920x1080 @ 50.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1920x1080 @ 30.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1920x1080 @ 29.970 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1920x1080 @ 25.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1600x1200 @ 60.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1680x1050 @ 59.883 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1400x1050 @ 59.948 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1280x1024 @ 75.025 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1280x1024 @ 60.020 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1440x900 @ 59.901 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1280x960 @ 60.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1152x864 @ 75.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1280x720 @ 60.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1280x720 @ 60.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1280x720 @ 59.940 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1280x720 @ 50.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1024x768 @ 75.029 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1024x768 @ 70.069 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   1024x768 @ 60.004 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   832x624 @ 74.551 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   800x600 @ 75.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   800x600 @ 72.188 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   800x600 @ 60.317 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   800x600 @ 56.250 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   720x576 @ 50.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   720x480 @ 60.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   720x480 @ 60.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   720x480 @ 59.940 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   720x480 @ 59.940 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   640x480 @ 75.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   640x480 @ 72.809 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   640x480 @ 66.667 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   640x480 @ 60.000 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   640x480 @ 59.940 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   640x480 @ 59.940 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1464]   720x400 @ 70.082 Hz 
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1484] Physical size: 530x300
II 19-06-24 11:24:29.610 - [backend/drm/drm.c:1664] connector HDMI-A-1: Requesting modeset
II 19-06-24 11:24:29.610 - [src/core/output-layout.cpp:1177] new output: HDMI-A-1 ("ViewSonic Corporation XG2401 SERIES UG2184100356")
II 19-06-24 11:24:29.610 - [src/core/output-layout.cpp:460] loaded mode auto
II 19-06-24 11:24:29.610 - [src/core/output-layout.cpp:1125] new output: NOOP-1
II 19-06-24 11:24:29.610 - [src/core/output-layout.cpp:460] loaded mode auto
II 19-06-24 11:24:29.610 - [src/core/output-layout.cpp:634] Couldn't find matching mode 1280x720@0 for output NOOP-1. Trying to use custom mode(might not work)                                                                                                               
II 19-06-24 11:24:29.612 - [backend/drm/drm.c:786] connector HDMI-A-1: Modesetting with 1920x1080 @ 60.000 Hz
DD 19-06-24 11:24:29.665 - [src/core/seat/input-manager.cpp:98] Mapping input Power Button to output null.
EE 19-06-24 11:24:29.665 - [types/wlr_cursor.c:1174] Cannot map device "Power Button" to output (not found in this cursor)
DD 19-06-24 11:24:29.665 - [src/core/seat/input-manager.cpp:98] Mapping input Video Bus to output null.
EE 19-06-24 11:24:29.665 - [types/wlr_cursor.c:1174] Cannot map device "Video Bus" to output (not found in this cursor)
DD 19-06-24 11:24:29.665 - [src/core/seat/input-manager.cpp:98] Mapping input Power Button to output null.
EE 19-06-24 11:24:29.665 - [types/wlr_cursor.c:1174] Cannot map device "Power Button" to output (not found in this cursor)
DD 19-06-24 11:24:29.665 - [src/core/seat/input-manager.cpp:98] Mapping input Sleep Button to output null.
EE 19-06-24 11:24:29.665 - [types/wlr_cursor.c:1174] Cannot map device "Sleep Button" to output (not found in this cursor)
DD 19-06-24 11:24:29.665 - [src/core/seat/input-manager.cpp:98] Mapping input 8BitDo 8BitDo Retro Keyboard Receiver to output null.
DD 19-06-24 11:24:29.665 - [src/core/seat/input-manager.cpp:98] Mapping input 8BitDo 8BitDo Retro Keyboard Receiver Keyboard to output null.
EE 19-06-24 11:24:29.665 - [types/wlr_cursor.c:1174] Cannot map device "8BitDo 8BitDo Retro Keyboard Receiver Keyboard" to output (not found in this cursor)                                                                                                                  
DD 19-06-24 11:24:29.665 - [src/core/seat/input-manager.cpp:98] Mapping input 8BitDo 8BitDo Retro Keyboard Receiver Keyboard to output null.
DD 19-06-24 11:24:29.665 - [src/core/seat/input-manager.cpp:98] Mapping input MOSART Semi. 2.4G Keyboard Mouse to output null.
EE 19-06-24 11:24:29.665 - [types/wlr_cursor.c:1174] Cannot map device "MOSART Semi. 2.4G Keyboard Mouse" to output (not found in this cursor)                                                                                                                                
DD 19-06-24 11:24:29.665 - [src/core/seat/input-manager.cpp:98] Mapping input MOSART Semi. 2.4G Keyboard Mouse to output null.
DD 19-06-24 11:24:29.665 - [src/core/seat/input-manager.cpp:98] Mapping input MOSART Semi. 2.4G Keyboard Mouse Consumer Control to output null.
EE 19-06-24 11:24:29.665 - [types/wlr_cursor.c:1174] Cannot map device "MOSART Semi. 2.4G Keyboard Mouse Consumer Control" to output (not found in this cursor)                                                                                                               
DD 19-06-24 11:24:29.665 - [src/core/seat/input-manager.cpp:98] Mapping input MOSART Semi. 2.4G Keyboard Mouse Consumer Control to output null.
DD 19-06-24 11:24:29.665 - [src/core/seat/input-manager.cpp:98] Mapping input MOSART Semi. 2.4G Keyboard Mouse System Control to output null.
EE 19-06-24 11:24:29.665 - [types/wlr_cursor.c:1174] Cannot map device "MOSART Semi. 2.4G Keyboard Mouse System Control" to output (not found in this cursor)                                                                                                                 
II 19-06-24 11:24:30.666 - [src/core/output-layout.cpp:1165] remove output: NOOP-1
EE 19-06-24 11:24:30.666 - [src/core/output-layout.cpp:533] disabling output: NOOP-1
II 19-06-24 11:24:30.666 - [src/core/output-layout.cpp:163] transfer views from NOOP-1 -> HDMI-A-1
DD 19-06-24 11:24:30.666 - [src/output/promotion-manager.hpp:107] autohide panels
DD 19-06-24 11:24:30.667 - [src/output/promotion-manager.hpp:123] restore panels
DD 19-06-24 11:24:30.667 - [src/core/idle.cpp:20] creating idle inhibitor 0x5d21fee38118 previous count: 1
DD 19-06-24 11:24:30.667 - [src/output/promotion-manager.hpp:107] autohide panels
II 19-06-24 11:24:30.667 - [backend/drm/drm.c:786] connector HDMI-A-1: Modesetting with 1920x1080 @ 60.000 Hz
DD 19-06-24 11:24:30.749 - [src/core/idle.cpp:27] destroying idle inhibitor 0x5d21fee38118 previous count: 2
DD 19-06-24 11:24:30.749 - [src/output/promotion-manager.hpp:123] restore panels
DD 19-06-24 11:24:30.749 - [src/core/idle.cpp:27] destroying idle inhibitor 0x5d21feb1d6d0 previous count: 1

Wayfire version

Latest GIT code.

ammen99 commented 5 months ago

Alright, so why is this a bug in Wayfire and not in those programs you mentioned?

vanfanel commented 5 months ago

> Alright, so why is this a bug in Wayfire and not in those programs you mentioned?

Because it doesn't happen at all on other WLRoots-based compositors, like Sway: monitor can be repeatedly power-cycled without any problems or crashes.

I have added that information to the first post.

ammen99 commented 5 months ago

Wayfire creates a temporary output when all the other outputs are disconnected and this could be the reason why the apps crash. However it could still be their bug :)
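To make the scenario concrete, here is a minimal sketch of the bookkeeping a client needs when outputs come and go, as happens when a temporary output (NOOP-1 in the log above) replaces the real one and is later replaced in turn. It is a plain C model, not SDL or libwayland code: `output_table`, `output_added`, `output_removed` and the registry ids used with them are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OUTPUTS 8

/* Hypothetical client-side table of the output globals currently known. */
typedef struct {
    uint32_t ids[MAX_OUTPUTS];
    size_t count;
} output_table;

/* Record a newly announced output global. Returns 1 on success, 0 if full. */
static int output_added(output_table *t, uint32_t id)
{
    assert(t != NULL);
    if (t->count == MAX_OUTPUTS) {
        return 0;
    }
    t->ids[t->count++] = id;
    return 1;
}

/* Forget an output global that the compositor removed.
 * Returns 1 if the id was tracked, 0 if it was unknown. */
static int output_removed(output_table *t, uint32_t id)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (t->ids[i] == id) {
            /* Swap the last entry into the hole; order is not preserved. */
            t->ids[i] = t->ids[--t->count];
            return 1;
        }
    }
    return 0;
}
```

A client that handles additions and removals in whichever order the compositor sends them should end up tracking only the output that is still present after the monitor comes back.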

I would need more information in order to work on a potential fix, if this is a bug in Wayfire at all. Like, you or someone else would need to figure out what we're doing wrong so that the apps crash. What stacktrace(s) do they crash with? What is the output of WAYLAND_DEBUG=1 <crashing app> when you reproduce the bug?

vanfanel commented 5 months ago

@ammen99 This is a debug session of a crashing SDL2 app as I turn ON monitor (any SDL2 program will do), on a DEBUG Wayfire build:

(gdb) r
Starting program: /root/prince/prince.exe 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7a006c0 (LWP 2596)]
[New Thread 0x7fffea8006c0 (LWP 2597)]
[New Thread 0x7fffe9e006c0 (LWP 2598)]
[New Thread 0x7fffe94006c0 (LWP 2599)]
[New Thread 0x7fffe8a006c0 (LWP 2600)]
[New Thread 0x7fffe3c006c0 (LWP 2601)]
[New Thread 0x7fffe32006c0 (LWP 2602)]

Thread 1 "prince.exe" received signal SIGSEGV, Segmentation fault.
0x00007ffff7b097e4 in prepare_zombie.isra ()
   from /usr/local/lib/x86_64-linux-gnu/libwayland-client.so.0
(gdb) bt
#0  0x00007ffff7b097e4 in prepare_zombie.isra ()
   from /usr/local/lib/x86_64-linux-gnu/libwayland-client.so.0
#1  0x00007ffff7b09ec8 in wl_proxy_destroy ()
   from /usr/local/lib/x86_64-linux-gnu/libwayland-client.so.0
#2  0x00007ffff7eedf15 in display_remove_global () from /usr/local/lib/libSDL2-2.0.so.0
#3  0x00007ffff7afdf7a in ?? () from /lib/x86_64-linux-gnu/libffi.so.8
#4  0x00007ffff7afd40e in ?? () from /lib/x86_64-linux-gnu/libffi.so.8
#5  0x00007ffff7afdb0d in ffi_call () from /lib/x86_64-linux-gnu/libffi.so.8
#6  0x00007ffff7b0e44c in wl_closure_invoke ()
   from /usr/local/lib/x86_64-linux-gnu/libwayland-client.so.0
#7  0x00007ffff7b09c4f in dispatch_event ()
   from /usr/local/lib/x86_64-linux-gnu/libwayland-client.so.0
#8  0x00007ffff7b0adfb in wl_display_dispatch_queue_pending ()
   from /usr/local/lib/x86_64-linux-gnu/libwayland-client.so.0
#9  0x00007ffff7ee9cda in Wayland_PumpEvents () from /usr/local/lib/libSDL2-2.0.so.0
#10 0x00007ffff7e2e133 in SDL_PumpEventsInternal () from /usr/local/lib/libSDL2-2.0.so.0
#11 0x00007ffff7e2e44a in SDL_WaitEventTimeout_REAL () from /usr/local/lib/libSDL2-2.0.so.0
#12 0x0000555555579db0 in process_events ()
#13 0x000055555557be4f in do_simple_wait ()
#14 0x00005555555650c9 in play_level_2 ()
#15 0x00005555555653f7 in play_level ()
#16 0x00005555555655c9 in init_game ()
#17 0x000055555555d80e in start_game ()
#18 0x000055555555f9a3 in pop_main ()
#19 0x000055555555a9d6 in main ()

If you need me to build any lower libs in DEBUG mode, please tell me and I will do so.

And this is what I get with WAYLAND_DEBUG=1 <crashing app>:

prince.log

Hope it helps. If you need me to do further experiments, please tell me.

ammen99 commented 4 months ago

Any chance of getting debug symbols from SDL and libwayland-client.so? Or at least tell me which version of SDL2 you are using so that I can see what display_remove_global in SDL2 does.

ammen99 commented 4 months ago

Looking at the wayland log though I don't see anything which Wayfire could be doing wrong so I suspect it is a client bug.

vanfanel commented 4 months ago

@ammen99 This is the GDB bt with debug builds of both latest stable libwayland (wayland-1.23.0) and latest stable libSDL2 (SDL2-2.30.4):


Thread 1 "prince.exe" received signal SIGSEGV, Segmentation fault.
0x00007ffff7f7cde6 in prepare_zombie (proxy=0x55555567c1e0)
    at ../src/wayland-client.c:443
443     for (i = 0; i < interface->event_count; i++) {
(gdb) bt
#0  0x00007ffff7f7cde6 in prepare_zombie (proxy=0x55555567c1e0) at ../src/wayland-client.c:443
#1  0x00007ffff7f7d0ab in proxy_destroy (proxy=0x55555567c1e0) at ../src/wayland-client.c:570
#2  0x00007ffff7f7d189 in wl_proxy_destroy_caller_locks (proxy=0x55555567c1e0)
    at ../src/wayland-client.c:598
#3  0x00007ffff7f7d1c2 in wl_proxy_destroy (proxy=0x55555567c1e0) at ../src/wayland-client.c:621
#4  0x00007ffff7de995e in wl_output_destroy (wl_output=0x55555567c1e0)
    at gen/wayland-client-protocol.h:5726
#5  0x00007ffff7deb7de in Wayland_free_display (d=0x555555623af0, id=39)
    at /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandvideo.c:758
#6  0x00007ffff7debf11 in display_remove_global (data=0x555555623af0, registry=0x555555624d80, 
    id=39) at /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandvideo.c:887
#7  0x00007ffff7f70f7a in ?? () from /lib/x86_64-linux-gnu/libffi.so.8
#8  0x00007ffff7f7040e in ?? () from /lib/x86_64-linux-gnu/libffi.so.8
#9  0x00007ffff7f70b0d in ffi_call () from /lib/x86_64-linux-gnu/libffi.so.8
#10 0x00007ffff7f81f50 in wl_closure_invoke (closure=0x55555682e0e0, flags=1, 
    target=0x555555624d80, opcode=1, data=0x555555623af0) at ../src/connection.c:1228
#11 0x00007ffff7f7eabc in dispatch_event (display=0x55555561f850, queue=0x55555561f948)
    at ../src/wayland-client.c:1670
#12 0x00007ffff7f7eda2 in dispatch_queue (display=0x55555561f850, queue=0x55555561f948)
    at ../src/wayland-client.c:1816
#13 0x00007ffff7f7f055 in wl_display_dispatch_queue_pending (display=0x55555561f850, 
    queue=0x55555561f948) at ../src/wayland-client.c:2058
#14 0x00007ffff7f7f0bd in wl_display_dispatch_pending (display=0x55555561f850)
    at ../src/wayland-client.c:2121
#15 0x00007ffff7de212d in Wayland_PumpEvents (_this=0x555555623d30)
    at /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandevents.c:380
#16 0x00007ffff7c58616 in SDL_PumpEventsInternal (push_sentinel=SDL_TRUE)
    at /root/src/sdl/SDL2-2.30.4/src/events/SDL_events.c:918
#17 0x00007ffff7c589ef in SDL_WaitEventTimeout_REAL (event=0x7fffffffe0d0, timeout=0)
    at /root/src/sdl/SDL2-2.30.4/src/events/SDL_events.c:1093
#18 0x00007ffff7c586e6 in SDL_PollEvent_REAL (event=0x7fffffffe0d0)
    at /root/src/sdl/SDL2-2.30.4/src/events/SDL_events.c:960
#19 0x00007ffff7c4aa90 in SDL_PollEvent (a=0x7fffffffe0d0)
    at /root/src/sdl/SDL2-2.30.4/src/dynapi/SDL_dynapi_procs.h:156
#20 0x0000555555579db0 in process_events ()
#21 0x000055555557c3ba in do_wait ()
#22 0x000055555555ea48 in show_title ()
#23 0x000055555555d827 in start_game ()
#24 0x000055555555f9a3 in pop_main ()
#25 0x000055555555a9d6 in main ()

I think this is what you asked me for. If you need anything more, ask me. I really want to help fix this.

Also, since I am using SDL2 2.30.4, this exactly is what display_remove_global is doing:

https://github.com/libsdl-org/SDL/blob/53cf48f5057b1c1c901d52ba521f866ddd60d4c3/src/video/wayland/SDL_waylandvideo.c#L883

It's simply calling Wayland_free_display, which is here:

https://github.com/libsdl-org/SDL/blob/53cf48f5057b1c1c901d52ba521f866ddd60d4c3/src/video/wayland/SDL_waylandvideo.c#L730

I truly hope it helps. If you need anything else, as I said, just ask me. I'll be leaving debug versions of libwayland, wayfire and SDL2 for now, in case you need me to look at anything else.

ammen99 commented 4 months ago

I highly recommend updating sdl2, I looked at their git and there were many changes in this part of the code.

vanfanel commented 4 months ago

@ammen99 If you're looking at the main branch, you're looking at SDL3, which is a non-released SDL version with many API changes (not directly compatible with SDL2 programs at source level).

Looking at the SDL2 branch here (which is what current SDL2 games and programs use): https://github.com/libsdl-org/SDL/tree/SDL2

...you can see by looking at https://github.com/libsdl-org/SDL/commits/SDL2/src/video/wayland/SDL_waylandvideo.c that SDL_waylandvideo.c hasn't changed in three months now :(

ammen99 commented 4 months ago

Ah well, I was looking at SDL3. They have the correct code for this, using wl_output.release instead of wl_output.destroy. Can you try locally patching SDL2 to use wl_output_release instead of wl_output_destroy?

ammen99 commented 4 months ago

Oh, they bind the interface at version 2. You'd also have to bind wl_output at version 3 in the following line here:

    output = wl_registry_bind(d->registry, id, &wl_output_interface, 2);

This is line 696.
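The version dependency can be sketched as follows. The wl_output.release request exists only from version 3 of the interface onward (libwayland's generated header exposes this as WL_OUTPUT_RELEASE_SINCE_VERSION), so a client bound at version 2 has to fall back to destroying the proxy locally. The helpers below are a hypothetical, self-contained model of that decision; `bind_version` and `can_release` are illustrative names and not part of SDL or libwayland.

```c
#include <assert.h>
#include <stdint.h>

/* wl_output.release was added in version 3 of the wl_output interface. */
#define OUTPUT_RELEASE_SINCE 3u

/* Version to request when binding: the lower of what the compositor
 * advertises and the highest version the client code supports. */
static uint32_t bind_version(uint32_t advertised, uint32_t client_max)
{
    assert(client_max > 0);
    return advertised < client_max ? advertised : client_max;
}

/* Whether the release request may be sent on a proxy bound at this version.
 * Below version 3 the client can only destroy its proxy locally. */
static int can_release(uint32_t bound_version)
{
    return bound_version >= OUTPUT_RELEASE_SINCE;
}
```

With these helpers, a client that supports version 3 and talks to a compositor advertising version 4 would bind at 3 and may use release, while a compositor advertising only version 2 forces the destroy path.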

ammen99 commented 4 months ago

@vanfanel I am looking over the SDL code and I cannot understand how it is supposed to work, does it not contain a double free bug? Looking at this here:

https://github.com/libsdl-org/SDL/blob/SDL2/src/video/wayland/SDL_waylandvideo.c#L764

This frees the driverdata:

https://github.com/libsdl-org/SDL/blob/SDL2/src/video/SDL_video.c#L681

Which is the same as the data pointer here:

https://github.com/libsdl-org/SDL/blob/SDL2/src/video/wayland/SDL_waylandvideo.c#L741

Which is used after the free call:

https://github.com/libsdl-org/SDL/blob/SDL2/src/video/wayland/SDL_waylandvideo.c#L765-L769

Is this not a use-after-free? You can verify this by running the game or really any SDL demo app with address sanitizer.
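The suspected pattern can be modeled in a few lines of plain C. This is only a sketch under assumptions: `output_data`, `display`, `del_display` and `free_display_safe` are hypothetical stand-ins, not the SDL sources. It shows the ordering that avoids the problem, which is to finish every use of the driver data before calling the helper that frees it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the structures discussed above. */
typedef struct {
    int registry_id;
} output_data;

typedef struct {
    output_data *driverdata;
} display;

/* Counts how often the output resource was released (illustration only). */
static int outputs_released = 0;

/* Models a display-removal helper that also frees the driver data. */
static void del_display(display *disp)
{
    free(disp->driverdata);
    disp->driverdata = NULL;
}

/* The buggy order would be: del_display(disp); then read data->...,
 * which touches memory that has already been freed.
 * Safe order: use the data first, free it last. */
static int free_display_safe(display *disp, int id)
{
    assert(disp != NULL);
    output_data *data = disp->driverdata;
    if (data == NULL || data->registry_id != id) {
        return 0;
    }
    outputs_released++;  /* stands in for destroying the wl_output proxy */
    del_display(disp);   /* frees data; it must not be touched afterwards */
    return 1;
}

/* Build a display that owns freshly allocated driver data. */
static display make_display(int id)
{
    display disp = { NULL };
    disp.driverdata = calloc(1, sizeof(*disp.driverdata));
    if (disp.driverdata != NULL) {
        disp.driverdata->registry_id = id;
    }
    return disp;
}
```

Building a variant with the two steps swapped and running it under AddressSanitizer would be expected to produce a heap-use-after-free report for the read that follows the free.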

vanfanel commented 4 months ago

> Ah well, I was looking at SDL3. They have the correct code for this, using wl_output.release instead of wl_output.destroy. Can you try locally patching SDL2 to use wl_output_release instead of wl_output_destroy?

After doing this change and binding wl_output at version 3 as you said, it's still crashing:

Thread 1 "prince.exe" received signal SIGSEGV, Segmentation fault.
wl_proxy_marshal_flags (proxy=0x55555575a2b0, opcode=0, interface=0x0, version=0, flags=1)
    at ../src/wayland-client.c:853
853     wl_argument_from_va_list(proxy->object.interface->methods[opcode].signature,
(gdb) bt
#0  wl_proxy_marshal_flags (proxy=0x55555575a2b0, opcode=0, interface=0x0, version=0, flags=1)
    at ../src/wayland-client.c:853
#1  0x00007ffff7de99ad in wl_output_release (wl_output=0x55555575a2b0)
    at gen/wayland-client-protocol.h:5738
#2  0x00007ffff7deb831 in Wayland_free_display (d=0x555555623af0, id=41)
    at /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandvideo.c:762
#3  0x00007ffff7debf64 in display_remove_global (data=0x555555623af0, registry=0x555555624d80, 
    id=41) at /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandvideo.c:892
#4  0x00007ffff7f70f7a in ?? () from /lib/x86_64-linux-gnu/libffi.so.8
#5  0x00007ffff7f7040e in ?? () from /lib/x86_64-linux-gnu/libffi.so.8
#6  0x00007ffff7f70b0d in ffi_call () from /lib/x86_64-linux-gnu/libffi.so.8
#7  0x00007ffff7f81f50 in wl_closure_invoke (closure=0x5555568234e0, flags=1, 
    target=0x555555624d80, opcode=1, data=0x555555623af0) at ../src/connection.c:1228
#8  0x00007ffff7f7eabc in dispatch_event (display=0x55555561f850, queue=0x55555561f948)
    at ../src/wayland-client.c:1670
#9  0x00007ffff7f7eda2 in dispatch_queue (display=0x55555561f850, queue=0x55555561f948)
    at ../src/wayland-client.c:1816
#10 0x00007ffff7f7f055 in wl_display_dispatch_queue_pending (display=0x55555561f850, 
    queue=0x55555561f948) at ../src/wayland-client.c:2058
#11 0x00007ffff7f7f0bd in wl_display_dispatch_pending (display=0x55555561f850)
    at ../src/wayland-client.c:2121
#12 0x00007ffff7de212d in Wayland_PumpEvents (_this=0x555555623d30)
    at /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandevents.c:380
#13 0x00007ffff7c58616 in SDL_PumpEventsInternal (push_sentinel=SDL_TRUE)
    at /root/src/sdl/SDL2-2.30.4/src/events/SDL_events.c:918
#14 0x00007ffff7c589ef in SDL_WaitEventTimeout_REAL (event=0x7fffffffe0d0, timeout=0)
    at /root/src/sdl/SDL2-2.30.4/src/events/SDL_events.c:1093
#15 0x00007ffff7c586e6 in SDL_PollEvent_REAL (event=0x7fffffffe0d0)
    at /root/src/sdl/SDL2-2.30.4/src/events/SDL_events.c:960
#16 0x00007ffff7c4aa90 in SDL_PollEvent (a=0x7fffffffe0d0)
    at /root/src/sdl/SDL2-2.30.4/src/dynapi/SDL_dynapi_procs.h:156
#17 0x0000555555579db0 in process_events ()
#18 0x000055555557c3ba in do_wait ()
#19 0x000055555555ea48 in show_title ()
#20 0x000055555555d827 in start_game ()
#21 0x000055555555f9a3 in pop_main ()
#22 0x000055555555a9d6 in main ()

Now let me try that address sanitizer thing (I have to investigate how that works)

vanfanel commented 4 months ago

@ammen99 Here's the AddressSanitizer log of the example game, with AddressSanitizer support built into both SDL2 and the game itself:

~/prince$ LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libasan.so.8 prince.exe 
=================================================================
==13225==ERROR: AddressSanitizer: heap-use-after-free on address 0x611000191e50 at pc 0x74e41c025510 bp 0x7fff094664b0 sp 0x7fff094664a8
READ of size 8 at 0x611000191e50 thread T0
    #0 0x74e41c02550f in Wayland_free_display /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandvideo.c:757
    #1 0x74e41c0261c3 in display_remove_global /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandvideo.c:892
    #2 0x74e41b7e8f79  (/lib/x86_64-linux-gnu/libffi.so.8+0x6f79)
    #3 0x74e41b7e840d  (/lib/x86_64-linux-gnu/libffi.so.8+0x640d)
    #4 0x74e41b7e8b0c in ffi_call (/lib/x86_64-linux-gnu/libffi.so.8+0x6b0c)
    #5 0x74e41b7f9f4f in wl_closure_invoke ../src/connection.c:1228
    #6 0x74e41b7f6abb in dispatch_event ../src/wayland-client.c:1670
    #7 0x74e41b7f6da1 in dispatch_queue ../src/wayland-client.c:1816
    #8 0x74e41b7f7054 in wl_display_dispatch_queue_pending ../src/wayland-client.c:2058
    #9 0x74e41b7f70bc in wl_display_dispatch_pending ../src/wayland-client.c:2121
    #10 0x74e41c00b57f in Wayland_PumpEvents /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandevents.c:380
    #11 0x74e41bc88d1f in SDL_PumpEventsInternal /root/src/sdl/SDL2-2.30.4/src/events/SDL_events.c:918
    #12 0x74e41bc893b0 in SDL_WaitEventTimeout_REAL /root/src/sdl/SDL2-2.30.4/src/events/SDL_events.c:1093
    #13 0x74e41bc88e76 in SDL_PollEvent_REAL /root/src/sdl/SDL2-2.30.4/src/events/SDL_events.c:960
    #14 0x74e41bc76026 in SDL_PollEvent /root/src/sdl/SDL2-2.30.4/src/dynapi/SDL_dynapi_procs.h:156
    #15 0x5b42e42b70e7 in process_events /root/src/SDLPoP-1.23/src/seg009.c:3353
    #16 0x5b42e42baa38 in idle /root/src/SDLPoP-1.23/src/seg009.c:3698
    #17 0x5b42e425ee3c in show_title /root/src/SDLPoP-1.23/src/seg000.c:1908
    #18 0x5b42e425a954 in start_game /root/src/SDLPoP-1.23/src/seg000.c:235
    #19 0x5b42e426163e in pop_main /root/src/SDLPoP-1.23/src/seg000.c:149
    #20 0x5b42e4252e62 in main /root/src/SDLPoP-1.23/src/main.c:27
    #21 0x74e41b846249  (/lib/x86_64-linux-gnu/libc.so.6+0x27249)
    #22 0x74e41b846304 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x27304)
    #23 0x5b42e4253270 in _start (/root/prince/prince.exe+0x2c270)

0x611000191e50 is located 16 bytes inside of 224-byte region [0x611000191e40,0x611000191f20)
freed by thread T0 here:
    #0 0x74e41c6b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
    #1 0x74e41bdbaa1a in real_free /root/src/sdl/SDL2-2.30.4/src/stdlib/SDL_malloc.c:5199
    #2 0x74e41bdbae48 in SDL_free_REAL /root/src/sdl/SDL2-2.30.4/src/stdlib/SDL_malloc.c:5339
    #3 0x74e41bf3407d in SDL_DelVideoDisplay /root/src/sdl/SDL2-2.30.4/src/video/SDL_video.c:676
    #4 0x74e41c0254ea in Wayland_free_display /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandvideo.c:756
    #5 0x74e41c0261c3 in display_remove_global /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandvideo.c:892
    #6 0x74e41b7e8f79  (/lib/x86_64-linux-gnu/libffi.so.8+0x6f79)

previously allocated by thread T0 here:
    #0 0x74e41c6b89cf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x74e41bdba9b6 in real_malloc /root/src/sdl/SDL2-2.30.4/src/stdlib/SDL_malloc.c:5196
    #2 0x74e41bdbad2d in SDL_malloc_REAL /root/src/sdl/SDL2-2.30.4/src/stdlib/SDL_malloc.c:5295
    #3 0x74e41c024f61 in Wayland_add_display /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandvideo.c:702
    #4 0x74e41c0258e5 in display_handle_global /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandvideo.c:837
    #5 0x74e41b7e8f79  (/lib/x86_64-linux-gnu/libffi.so.8+0x6f79)

SUMMARY: AddressSanitizer: heap-use-after-free /root/src/sdl/SDL2-2.30.4/src/video/wayland/SDL_waylandvideo.c:757 in Wayland_free_display
Shadow bytes around the buggy address:
  0x0c228002a370: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c228002a380: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c228002a390: fd fd fd fd fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c228002a3a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c228002a3b0: fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa fa
=>0x0c228002a3c0: fa fa fa fa fa fa fa fa fd fd[fd]fd fd fd fd fd
  0x0c228002a3d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c228002a3e0: fd fd fd fd fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c228002a3f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c228002a400: fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa fa
  0x0c228002a410: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==13225==ABORTING

Is this what you wanted to see?

ammen99 commented 4 months ago

Yes exactly, the bug is as I described in my previous comment. Feel free to let the sdl devs know about it, just give them the asan report and my explanation and they'll fix it :)

vanfanel commented 4 months ago

@ammen99 Ah, neat! I will let them know!