nwg-piotr / nwg-panel

GTK3-based panel for sway and Hyprland Wayland compositors
MIT License

[BUG] Playerctl freezes with ytmdesktop and another audio source #233

Open ZetaArashi opened 1 year ago

ZetaArashi commented 1 year ago

Describe the bug
Multiple audio sources on one workspace in Hyprland appear to cause nwg-panel to hang when the Playerctl module is present, with 100% CPU usage on a single thread. Closing the WM does not clear the hang, and killing the process hangs as well. I've tested this, and it seems to be specific to the application I was using. I don't really expect anybody to fix this -- and I'm okay with just not using the playerctl module -- but I thought you should know the interaction is present at least.

Applications: ytmdesktop (from Flathub), VLC. (I'm unable to test YTM from the AUR, it doesn't compile for me at the moment. Much sadness.)

To Reproduce
Steps to reproduce the behavior:

  1. Open ytmdesktop. Playing audio is not required.
  2. Open a VLC video on the same workspace
  3. Play audio from VLC.
  4. Note high CPU usage, and that nwg-panel has now frozen.

Expected behavior
nwg-panel will continue to function as normal.

Additional context
Like I said, I completely understand this is specific to one application, and probably something weird is going on with that app. I just thought you should be aware of it in case someone else runs into it.

nwg-piotr commented 1 year ago

Well, I couldn't reproduce it on my side (panel behaved normally), but actually ytmdesktop refused to fully start on my machine. I didn't dig into it much, as I would never use a program that requires electron13 and python2. Honestly, I also don't use a desktop app if I can run something in the browser.

uninsane commented 3 months ago

i'm seeing similar symptoms: one CPU core pegged to 100% and the panel being unresponsive (though notably it still appears to be running the executors and such as scheduled, the panel just never redraws nor does it trigger any of the expected actions when i click on panel items).

this triggers for me any time i close media. this includes mpv, but also playing a video in Firefox and then closing the tab (this one does it).

building with debug symbols and running it under gdb shows:

Thread 1 ".nwg-panel-wrap" received signal SIGSEGV, Segmentation fault.
0x00007ffff6da2529 in g_type_check_instance () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libgobject-2.0.so.0
(gdb) bt
#0  0x00007ffff6da2529 in g_type_check_instance () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libgobject-2.0.so.0
#1  0x00007ffff6d9168b in signal_emit_valist_unlocked () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libgobject-2.0.so.0
#2  0x00007ffff6d982d2 in g_signal_emit_valist () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libgobject-2.0.so.0
#3  0x00007ffff6d9837f in g_signal_emit () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libgobject-2.0.so.0
#4  0x00007fffed8ef436 in dbus_name_owner_changed_callback (proxy=<optimized out>, sender_name=<optimized out>, signal_name=<optimized out>, parameters=<optimized out>, data=0x10ae3a0) at ../playerctl/playerctl-player-manager.c:295
#5  0x00007ffff6d7c668 in g_closure_invoke () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libgobject-2.0.so.0
#6  0x00007ffff6d90bdc in signal_emit_unlocked_R.isra.0 () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libgobject-2.0.so.0
#7  0x00007ffff6d92571 in signal_emit_valist_unlocked () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libgobject-2.0.so.0
#8  0x00007ffff6d982d2 in g_signal_emit_valist () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libgobject-2.0.so.0
#9  0x00007ffff6d9837f in g_signal_emit () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libgobject-2.0.so.0
#10 0x00007ffff5f3da25 in on_signal_received () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libgio-2.0.so.0
#11 0x00007ffff5f2a02b in emit_signal_instance_in_idle_cb () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libgio-2.0.so.0
#12 0x00007ffff6e5de59 in g_main_dispatch () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libglib-2.0.so.0
#13 0x00007ffff6e60ff7 in g_main_context_iterate_unlocked.isra () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libglib-2.0.so.0
#14 0x00007ffff6e618af in g_main_loop_run () from /nix/store/nm9608b5y801fq2p73nl7k80z8kcbmh2-glib-2.80.2/lib/libglib-2.0.so.0
#15 0x00007fffedc068c5 in gtk_main () from /nix/store/6ivp7s6qwf02d3siggfjrh3ayf2y056v-gtk+3-3.24.42/lib/libgtk-3.so.0
#16 0x00007ffff7e3c052 in ffi_call_unix64 () from /nix/store/nj9g42fdsm8l2z43kfcahch3px2q209a-libffi-3.4.6/lib/libffi.so.8
#17 0x00007ffff7e39ee5 in ffi_call_int () from /nix/store/nj9g42fdsm8l2z43kfcahch3px2q209a-libffi-3.4.6/lib/libffi.so.8
#18 0x00007ffff7e3aad8 in ffi_call () from /nix/store/nj9g42fdsm8l2z43kfcahch3px2q209a-libffi-3.4.6/lib/libffi.so.8
#19 0x00007ffff6f7dbc3 in pygi_invoke_c_callable () from /nix/store/ak80bykk8halwf1klflhvpncq9lw2f8s-python3.11-pygobject-3.48.2/lib/python3.11/site-packages/gi/_gi.cpython-311-x86_64-linux-gnu.so
#20 0x00007ffff6f7fa58 in pygi_function_cache_invoke () from /nix/store/ak80bykk8halwf1klflhvpncq9lw2f8s-python3.11-pygobject-3.48.2/lib/python3.11/site-packages/gi/_gi.cpython-311-x86_64-linux-gnu.so
#21 0x00007ffff7a4ba2e in PyObject_Call () from /nix/store/6b1fqdwb3g56j5pazv8zkx9qd0mv3wiz-python3-3.11.9/lib/libpython3.11.so.1.0
#22 0x00007ffff78fa99c in _PyEval_EvalFrameDefault () from /nix/store/6b1fqdwb3g56j5pazv8zkx9qd0mv3wiz-python3-3.11.9/lib/libpython3.11.so.1.0
#23 0x00007ffff7b2bd3c in _PyEval_Vector.constprop.0 () from /nix/store/6b1fqdwb3g56j5pazv8zkx9qd0mv3wiz-python3-3.11.9/lib/libpython3.11.so.1.0
#24 0x00007ffff7b2beda in PyEval_EvalCode () from /nix/store/6b1fqdwb3g56j5pazv8zkx9qd0mv3wiz-python3-3.11.9/lib/libpython3.11.so.1.0
#25 0x00007ffff7b56ab0 in run_mod () from /nix/store/6b1fqdwb3g56j5pazv8zkx9qd0mv3wiz-python3-3.11.9/lib/libpython3.11.so.1.0
#26 0x00007ffff7b767f2 in _PyRun_SimpleFileObject () from /nix/store/6b1fqdwb3g56j5pazv8zkx9qd0mv3wiz-python3-3.11.9/lib/libpython3.11.so.1.0
#27 0x00007ffff7b77091 in _PyRun_AnyFileObject () from /nix/store/6b1fqdwb3g56j5pazv8zkx9qd0mv3wiz-python3-3.11.9/lib/libpython3.11.so.1.0
#28 0x00007ffff7b799af in Py_RunMain () from /nix/store/6b1fqdwb3g56j5pazv8zkx9qd0mv3wiz-python3-3.11.9/lib/libpython3.11.so.1.0
#29 0x00007ffff763c10e in __libc_start_call_main (main=main@entry=0x401040 <main>, argc=argc@entry=2, argv=argv@entry=0x7fffffffc208) at ../sysdeps/nptl/libc_start_call_main.h:58
#30 0x00007ffff763c1c9 in __libc_start_main_impl (main=0x401040 <main>, argc=2, argv=0x7fffffffc208, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffc1f8) at ../csu/libc-start.c:360
#31 0x0000000000401075 in _start ()

that's pretty noisy (i tried to get a python backtrace with py-bt, but that seems to not work across the FFI boundary), but what stands out is dbus_name_owner_changed_callback in playerctl/playerctl-player-manager.c:295. if you're hitting this, you should see the log message player name vanished: ... immediately before it hangs.

g_signal_emit(manager, connection_signals[NAME_VANISHED], 0, player_name);

i guess something is misconfiguring the signals. not sure if that's playerctl or nwg-panel or something in-between, though i don't see any open bugs related to this in the playerctl repos. other bars like waybar probably don't hit this simply because they run playerctl in a separate process (i think?). playerctl looks mostly dead, so it's possible some operating systems patch it or build a fork of it. @nwg-piotr if you share your OS name i can check if maybe that's the case, and this should be reported to EndeavourOS instead?

my own OS is NixOS, with sway.

nwg-piotr commented 3 months ago

I'm on Arch.

Does it happen on your side on the development version? The Playerctl module has been altered recently, including the on_player_vanished method. NixOS seems to still use v0.9.32.

uninsane commented 3 months ago

Arch looks to be running stock playerctl 2.4.1, same as NixOS. so that's probably not it then. (gentoo and alpine aren't applying any patches either, FWIW)

NixOS is actually on nwg-panel 0.9.34 right now, which should include your playerctl changes.

nwg-piotr commented 3 months ago

I'll take another look at the issue when I'm done with what I'm working on right now.

uninsane commented 3 months ago

a bit more info:

$ gdb nwg-panel
(gdb) break dbus_name_owner_changed_callback
(gdb) run

# load a video in firefox
# hit continue each time it hits the breakpoint, until it stops breaking
# close the firefox tab

Thread 1 ".nwg-panel-wrap" hit Breakpoint 1, dbus_name_owner_changed_callback (proxy=0x10a6420, sender_name=0x10cd1d0 "org.freedesktop.DBus", signal_name=0x1018620 "NameOwnerChanged", parameters=0x7fffdc00e9c0, data=0x1036ba0) at ../playerctl/playerctl-player-manager.c:244
(gdb) break manager_remove_managed_player_by_name
(gdb) continue
Thread 1 ".nwg-panel-wrap" hit Breakpoint 1, dbus_name_owner_changed_callback (proxy=0x10a6420, sender_name=0x10cd1d0 "org.freedesktop.DBus", signal_name=0x1018620 "NameOwnerChanged", parameters=0x7fffdc00e9c0, data=0x1036ba0) at ../playerctl/playerctl-player-manager.c:244
(gdb) step
228 in ../playerctl/playerctl-player-manager.c
(gdb) next
229 in ../playerctl/playerctl-player-manager.c
(gdb) next
230 in ../playerctl/playerctl-player-manager.c
(gdb) next
232 in ../playerctl/playerctl-player-manager.c
(gdb) next
233 in ../playerctl/playerctl-player-manager.c
(gdb) next
234 in ../playerctl/playerctl-player-manager.c
(gdb) next
235 in ../playerctl/playerctl-player-manager.c
(gdb) print *manager
$15 = {parent_instance = {g_type_instance = {g_class = 0xbf1c80}, ref_count = 1, qdata = 0xfafe60}, priv = 0x1036b50}
(gdb) next
236 in ../playerctl/playerctl-player-manager.c
(gdb) print *manager
$16 = {parent_instance = {g_type_instance = {g_class = 0xaaaaaaaaaaaaaaaa}, ref_count = 2863311530, qdata = 0xaaaaaaaaaaaaaaaa}, priv = 0xaaaaaaaaaaaaaaaa}

something corrupts the PlayerctlPlayerManager instance when it calls g_signal_emit(manager, connection_signals[PLAYER_VANISHED], 0, player); (line 235). perhaps there's some incorrect refcounting somewhere and it gets free'd too early.

uninsane commented 3 months ago

github won't let me attach a .patch file, but this bug does go away for me if i patch playerctl to acquire a ref on the manager inside dbus_name_owner_changed_callback before it uses the manager.

https://git.uninsane.org/colin/playerctl/commit/bbcbbe4e03da93523b431ffee5b64e10b17b4f9f.patch
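
For readers unfamiliar with the pattern, here is a minimal, GLib-free sketch in C of the failure mode and of what holding a reference changes. Every name in it (`Manager`, `Owner`, `manager_ref`, `manager_unref`, `callback_survives_emit`) is a hypothetical stand-in, not playerctl or GLib API; in the real code the equivalent calls would presumably be `g_object_ref` and `g_object_unref` around the emit. The handler below drops the owner's only reference while the signal is still being emitted, which is one possible way (an assumption, not a confirmed diagnosis) for the manager to be freed mid-callback.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as the player manager. */
typedef struct Manager Manager;
typedef void (*VanishedHandler)(Manager *manager, void *user_data);

struct Manager {
    int ref_count;
    VanishedHandler handler; /* single "player vanished" handler */
    void *handler_data;
    int *finalized_flag;     /* lives outside the object; set to 1 when it is freed */
};

static Manager *manager_new(int *finalized_flag) {
    Manager *m = calloc(1, sizeof *m);
    assert(m != NULL);
    m->ref_count = 1;
    m->finalized_flag = finalized_flag;
    return m;
}

static Manager *manager_ref(Manager *m) {
    assert(m->ref_count > 0);
    m->ref_count++;
    return m;
}

static void manager_unref(Manager *m) {
    assert(m->ref_count > 0);
    if (--m->ref_count == 0) {
        *m->finalized_flag = 1;
        free(m);
    }
}

/* Invokes the handler; does not touch m after the handler returns. */
static void manager_emit_vanished(Manager *m) {
    if (m->handler)
        m->handler(m, m->handler_data);
}

/* Hypothetical owner (think: the panel) holding the only long-lived reference. */
typedef struct { Manager *manager; } Owner;

/* Handler that drops the owner's reference during the emit. */
static void drop_manager_on_vanished(Manager *m, void *user_data) {
    Owner *owner = user_data;
    (void)m;
    if (owner->manager) {
        manager_unref(owner->manager);
        owner->manager = NULL;
    }
}

/* Models the D-Bus callback: returns 1 if the manager is still alive after
 * the emit, 0 if it was finalized while the callback was still running. */
int callback_survives_emit(int hold_ref) {
    int finalized = 0;
    Owner owner = { manager_new(&finalized) };
    Manager *m = owner.manager; /* borrowed pointer, like the callback's user data */
    m->handler = drop_manager_on_vanished;
    m->handler_data = &owner;

    if (hold_ref)
        manager_ref(m);         /* the fix: keep the object alive across the emit */
    manager_emit_vanished(m);
    int alive = !finalized;     /* read the external flag only, never freed memory */
    if (hold_ref)
        manager_unref(m);       /* release the temporary reference */
    assert(finalized);          /* the object is released in both variants */
    return alive;
}
```

With these definitions, `callback_survives_emit(0)` returns 0: the manager is finalized during the emit, so any later access through the borrowed pointer would be a use-after-free. `callback_survives_emit(1)` returns 1, and the manager is released only when the callback drops its temporary reference.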

i don't have high hopes for me being able to upstream this fix, given the state of playerctl. if it really is a playerctl bug (idk enough about glib to know who's supposed to be responsible for refcounting when casting userdata like this), we might have to live with it and just grab extra refs inside nwg-panel in the signal handlers.

nwg-piotr commented 3 months ago

If you have an idea on how to handle it on the panel side, feel free to submit.