mate-desktop / mate-notification-daemon

Daemon to display passive pop-up notifications
https://mate-desktop.org
GNU General Public License v2.0

daemon sometimes crashes since we autostart it with the session #218

Open raveit65 opened 11 months ago

raveit65 commented 11 months ago

Expected behaviour

No crashes

Actual behaviour

Sometimes the daemon crashes and I get an alert from the Fedora bug reporting tool. The notification was triggered by the DeadBeef music player. I think I had just switched to another workspace while the daemon was showing the notification.

Program terminated with signal SIGSEGV, Segmentation fault.
warning: Section `.reg-xstate/3941' in core file too small.
#0  g_type_check_instance_is_a (type_instance=type_instance@entry=0x55ee4f419230, iface_type=0x55ee4f43de40 [None]) at ../gobject/gtype.c:4154
Downloading source file /usr/src/debug/glib2-2.76.5-1.fc38.x86_64/redhat-linux-build/../gobject/gtype.c...
4154      check = node && node->is_instantiatable && iface && type_node_conforms_to_U (node, iface, TRUE, FALSE);
[Current thread is 1 (Thread 0x7ff4400dda40 (LWP 3941))]

Thread 1 (Thread 0x7ff4400dda40 (LWP 3941)):
#0  g_type_check_instance_is_a (type_instance=type_instance@entry=0x55ee4f419230, iface_type=0x55ee4f43de40 [None]) at ../gobject/gtype.c:4154
        node = 0xe5894855fa1e0ff0
        iface = 0x55ee4f43de40
#1  0x00007ff44195bbd3 in gdk_monitor_get_geometry (monitor=0x55ee4f419230, geometry=geometry@entry=0x7ffd6d0fc150) at ../gdk/gdkmonitor.c:283
        __inst = 0x55ee4f419230
        __t = <optimized out>
        __r = <optimized out>
        _g_boolean_var_11 = <optimized out>
        __func__ = "gdk_monitor_get_geometry"
#2  0x000055ee4e63014b in notify_stack_shift_notifications (stack=stack@entry=0x55ee4f4d5d80, nw=nw@entry=0x55ee4f4da750, nw_l=nw_l@entry=0x0, init_width=338, init_height=84, nw_x=nw_x@entry=0x7ffd6d0fc1dc, nw_y=0x7ffd6d0fc1d8) at /usr/src/debug/mate-notification-daemon-1.27.1-1.fc38.x86_64/src/daemon/stack.c:295
        workarea = {x = 1330489168, y = 21998, width = 1829749216, height = 32765}
        monitor = {x = 0, y = 0, width = 1103811581, height = 32756}
        positions = <optimized out>
        l = <optimized out>
        x = <optimized out>
        y = <optimized out>
        shiftx = 0
        shifty = 0
        i = <optimized out>
        n_wins = <optimized out>
#3  0x000055ee4e63075b in notify_stack_add_window (stack=0x55ee4f4d5d80, nw=nw@entry=0x55ee4f4da750, new_notification=new_notification@entry=1) at /usr/src/debug/mate-notification-daemon-1.27.1-1.fc38.x86_64/src/daemon/stack.c:395
        req = {width = 338, height = 82}
        x = 32756
        y = 1109725974
#4  0x000055ee4e631282 in notify_daemon_notify_handler (object=<optimized out>, invocation=0x7ff41c002ab0, app_name=<optimized out>, id=<optimized out>, icon=0x55ee4f810aa0 "deadbeef", summary=<optimized out>, body=0x55ee4f4d5f70 "Alden Tyrell - Fabric 73 - Ben Sims", actions=0x55ee4f810ff0, hints=0x7ff41c003ec0, timeout=-1, user_data=0x55ee4f495dd0) at /usr/src/debug/mate-notification-daemon-1.27.1-1.fc38.x86_64/src/daemon/daemon.c:1588
        monitor_id = 0x55ee4f77a270
        pointer = <optimized out>
        screen = 0x55ee4f444be0
        display = <optimized out>
        seat = <optimized out>
        daemon = 0x55ee4f495dd0
        nt = <optimized out>
        nw = 0x55ee4f4da750
        data = 0x7ff4413611ce <__GI___libc_free+126>
        use_pos_data = 0
        new_notification = 1
        x = 0
        y = 0
        return_id = <optimized out>
        sound_file = 0x0
        sound_enabled = 0
        do_not_disturb = <optimized out>
        i = <optimized out>
        pixbuf = <optimized out>
        gsettings = <optimized out>
        fullscreen_window = <optimized out>
        window_xid = 0
#5  0x00007ff4409f1be6 in ffi_call_unix64 () at ../src/x86/unix64.S:104
#6  0x00007ff4409ee4bf in ffi_call_int (cif=cif@entry=0x7ffd6d0fc5e0, fn=<optimized out>, rvalue=<optimized out>, avalue=<optimized out>, closure=closure@entry=0x0) at ../src/x86/ffi64.c:673
        classes = {X86_64_INTEGER_CLASS, X86_64_NO_CLASS, 1084175008, 32756}
        stack = <optimized out>
        argp = 0x7ffd6d0fc318 "\006"
        arg_types = <optimized out>
        gprcount = 6
        ssecount = <optimized out>
        ngpr = 1
        nsse = 0
        i = <optimized out>
        avn = <optimized out>
        flags = <optimized out>
        reg_args = <optimized out>
#7  0x00007ff4409f118e in ffi_call (cif=cif@entry=0x7ffd6d0fc5e0, fn=fn@entry=0x55ee4e630b80 <notify_daemon_notify_handler>, rvalue=rvalue@entry=0x7ffd6d0fc560, avalue=avalue@entry=0x7ffd6d0fc4a0) at ../src/x86/ffi64.c:710
        arg_types = 0x7ffd6d0fc500
        i = <optimized out>
        nargs = 11
        max_reg_struct_size = <optimized out>
#8  0x00007ff442230ad2 in g_cclosure_marshal_generic (closure=<optimized out>, return_gvalue=<optimized out>, n_param_values=<optimized out>, param_values=<optimized out>, invocation_hint=<optimized out>, marshal_data=<optimized out>) at ../gobject/gclosure.c:1536
        rtype = <optimized out>
        rvalue = 0x7ffd6d0fc560
        n_args = <optimized out>
        atypes = <optimized out>
        args = <optimized out>
        i = <optimized out>
        cif = {abi = FFI_UNIX64, nargs = 11, arg_types = 0x7ffd6d0fc500, rtype = 0x7ff4409f3330 <ffi_type_sint32>, bytes = 40, flags = 6}
        cc = <optimized out>
        enum_tmpval = <optimized out>
        tmpval_used = 0
#9  0x00007ff44222a4ea in g_closure_invoke (closure=0x55ee4f48e000, return_value=0x7ffd6d0fc790, n_param_values=10, param_values=0x55ee4f4d63e0, invocation_hint=0x7ffd6d0fc770) at ../gobject/gclosure.c:832
        marshal = 0x7ff442230640 <g_cclosure_marshal_generic>
        marshal_data = 0x0
        in_marshal = 0
        real_closure = 0x55ee4f48dfe0
        __func__ = "g_closure_invoke"
#10 0x00007ff442258e16 in signal_emit_unlocked_R.isra.0 (node=<optimized out>, detail=0, detail@entry=1, instance=0x55ee4f4d1f30, emission_return=0x7ffd6d0fc830, instance_and_params=0x55ee4f4d63e0) at ../gobject/gsignal.c:3812
        tmp = <optimized out>
        handler = 0x55ee4f48dd60
        accumulator = 0x55ee4f496780
        emission = {next = 0x0, instance = 0x55ee4f4d1f30, ihint = {signal_id = 196, detail = 0, run_type = (G_SIGNAL_RUN_FIRST | G_SIGNAL_ACCUMULATOR_FIRST_RUN)}, state = EMISSION_RUN, chain_type = 0x4 [None]}
        handler_list = 0x55ee4f48dd60
        return_accu = 0x7ffd6d0fc790
        accu = {g_type = 0x14 [None], data = {{v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}, {v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}}}
        signal_id = 196
        max_sequential_handler_number = 329
        return_value_altered = <optimized out>
#11 0x00007ff442246e6a in g_signal_emitv (instance_and_params=instance_and_params@entry=0x55ee4f4d63e0, signal_id=signal_id@entry=196, detail=1, detail@entry=0, return_value=return_value@entry=0x7ffd6d0fc830) at ../gobject/gsignal.c:3284
        instance = <optimized out>
        node = <optimized out>
        __func__ = "g_signal_emitv"
#12 0x000055ee4e62e47b in _notify_daemon_notifications_skeleton_handle_method_call (connection=<optimized out>, sender=<optimized out>, object_path=<optimized out>, interface_name=0x7ff41c009900 "org.freedesktop.Notifications", method_name=0x7ff41c002030 "Notify", parameters=<optimized out>, invocation=0x7ff41c002ab0, user_data=0x55ee4f4d1f30) at /usr/src/debug/mate-notification-daemon-1.27.1-1.fc38.x86_64/src/daemon/mnd-dbus-generated.c:1737
        skeleton = 0x55ee4f4d1f30
        info = 0x55ee4e636400 <_notify_daemon_notifications_method_info_notify>
        iter = {x = {140686418517984, 8, 8, 0, 94482021031728, 0, 1, 3579507750, 140726433204512, 140687058417764, 0, 140687058417837, 140687046242494, 140726433204576, 140687046242494, 0}}
        child = 0x0
        paramv = 0x55ee4f4d63e0
        num_params = <optimized out>
        n = <optimized out>
        signal_id = 196
        return_value = {g_type = 0x14 [None], data = {{v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}, {v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}}}
        __func__ = "_notify_daemon_notifications_skeleton_handle_method_call"
#13 0x00007ff44185b7e3 in g_dbus_interface_method_dispatch_helper (interface=<optimized out>, method_call_func=0x55ee4e62e290 <_notify_daemon_notifications_skeleton_handle_method_call>, invocation=0x7ff41c002ab0) at ../gio/gdbusinterfaceskeleton.c:618
        has_handlers = <optimized out>
        has_default_class_handler = <optimized out>
        emit_authorized_signal = <optimized out>
        run_in_thread = <optimized out>
        flags = <optimized out>
        object = 0x0
        __func__ = "g_dbus_interface_method_dispatch_helper"
#14 0x00007ff44183d908 in call_in_idle_cb (user_data=user_data@entry=0x7ff41c002ab0) at ../gio/gdbusconnection.c:5012
        invocation = 0x7ff41c002ab0
        vtable = <optimized out>
        registration_id = <optimized out>
        subtree_registration_id = <optimized out>
        ei = 0x55ee4f4d7650
        es = 0x0
        __func__ = "call_in_idle_cb"
#15 0x00007ff4416414fd in g_idle_dispatch (source=0x7ff41c007d70, callback=0x7ff44183d7f0 <call_in_idle_cb>, user_data=0x7ff41c002ab0) at ../glib/gmain.c:6163
        idle_source = 0x7ff41c007d70
        again = <optimized out>
#16 0x00007ff4416454fc in g_main_dispatch (context=0x55ee4f46e3e0) at ../glib/gmain.c:3460
        dispatch = 0x7ff4416414d0 <g_idle_dispatch>
        prev_source = 0x0
        begin_time_nsec = 2509041208679
        was_in_call = 0
        user_data = 0x7ff41c002ab0
        callback = 0x7ff44183d7f0 <call_in_idle_cb>
        cb_funcs = 0x7ff441730380 <g_source_callback_funcs>
        cb_data = 0x7ff41c008fe0
        need_destroy = <optimized out>
        source = 0x7ff41c007d70
        current = 0x55ee4f48a070
        i = 1
#17 g_main_context_dispatch (context=0x55ee4f46e3e0) at ../glib/gmain.c:4200
#18 0x00007ff4416a36b8 in g_main_context_iterate.isra.0 (context=0x55ee4f46e3e0, block=1, dispatch=1, self=<optimized out>) at ../glib/gmain.c:4276
        max_priority = 0
        timeout = 0
        some_ready = 1
        nfds = 3
        allocated_nfds = <optimized out>
        fds = <optimized out>
        begin_time_nsec = 2509041196837
#19 0x00007ff441644aff in g_main_loop_run (loop=0x55ee4f4d6360) at ../glib/gmain.c:4479
        __func__ = "g_main_loop_run"
#20 0x00007ff441c06975 in gtk_main () at ../gtk/gtkmain.c:1329
        loop = 0x55ee4f4d6360
#21 0x000055ee4e62be19 in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/mate-notification-daemon-1.27.1-1.fc38.x86_64/src/daemon/mnd-daemon.c:107
        daemon = 0x55ee4f495dd0

I can upload the full stack trace if requested. I think the issue was already present in the code before, but we never saw it with dbus-activation. The good thing is that dbus-activation starts the daemon again after a crash.
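In frame #0 of the trace above, `node` is 0xe5894855fa1e0ff0, which is not a plausible heap address. That pattern is consistent with gdk_monitor_get_geometry() being handed a GdkMonitor pointer whose object was already freed, although the trace alone does not prove it. The following is a minimal plain-C sketch of that hazard and one way to guard against it. It does not use GDK, and every name in it (MonitorSlot, MonitorHandle, monitor_add, monitor_remove, monitor_get_geometry) is hypothetical and not taken from the daemon's source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MONITORS 8

typedef struct { int x, y, width, height; } Rect;

/* One registry slot; `generation` changes every time the slot is reused. */
typedef struct {
    bool     live;
    unsigned generation;
    Rect     geometry;
} MonitorSlot;

/* Hypothetical: what a stack could store instead of a raw pointer. */
typedef struct {
    size_t   slot;
    unsigned generation;
} MonitorHandle;

static MonitorSlot registry[MAX_MONITORS];

/* Register a monitor in `slot` and return a handle to it. */
MonitorHandle monitor_add(size_t slot, Rect geometry)
{
    assert(slot < MAX_MONITORS);
    registry[slot].live = true;
    registry[slot].generation++;
    registry[slot].geometry = geometry;
    return (MonitorHandle){ slot, registry[slot].generation };
}

/* Mark the monitor in `slot` as gone; existing handles become stale. */
void monitor_remove(size_t slot)
{
    assert(slot < MAX_MONITORS);
    registry[slot].live = false;
}

/* Copy the geometry out only if the handle still names a live monitor. */
bool monitor_get_geometry(MonitorHandle h, Rect *out)
{
    if (h.slot >= MAX_MONITORS)
        return false;
    const MonitorSlot *s = &registry[h.slot];
    if (!s->live || s->generation != h.generation)
        return false;
    *out = s->geometry;
    return true;
}
```

With this model, a lookup through a handle that was taken before the monitor was removed fails cleanly instead of reading freed memory. In real GLib code the comparable tools would be holding a reference on the object or registering a weak pointer with g_object_add_weak_pointer(); whether either fits the daemon's design is not something this thread establishes.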

Steps to reproduce the behaviour

  1. Use a music player that sends notifications.
  2. Switch workspaces while a notification is being sent.

MATE general version

1.27.x

Package version

1.27.x

Linux Distribution

fedora 38

Link to bugreport of your Distribution (requirement)

1.27.x isn't released

lukefromdc commented 11 months ago

Confirmed here: with compiz I can just keep the cube spinning while Audacious changes songs and I will get the segfault. The cause may be an invalid value for which workspace to send the notification to.

lukefromdc commented 11 months ago

Also note this happens on wayland too, and in that case the notification daemon stays down (no session manager to relaunch it.)

raveit65 commented 11 months ago

....with compiz I can just keep the cube spinning while Audacious changes songs and I will get the segfault.

very cool reproducer :)

raveit65 commented 10 months ago

Sometimes the stacktrace is shorter but it looks like the same issue in code.

#0  g_type_check_instance_is_a (type_instance=type_instance@entry=0x55a0d46c9230, iface_type=0x55a0d46ede40 [None]) at ../gobject/gtype.c:4154
4154      check = node && node->is_instantiatable && iface && type_node_conforms_to_U (node, iface, TRUE, FALSE);
[Current thread is 1 (Thread 0x7f574ebcaa40 (LWP 3927))]

Thread 1 (Thread 0x7f574ebcaa40 (LWP 3927)):
#0  g_type_check_instance_is_a (type_instance=type_instance@entry=0x55a0d46c9230, iface_type=0x55a0d46ede40 [None]) at ../gobject/gtype.c:4154
        node = 0x300002643260c
        iface = 0x55a0d46ede40
#1  0x00007f5750cedbd3 in gdk_monitor_get_geometry (monitor=0x55a0d46c9230, geometry=geometry@entry=0x7ffe0d8202e0) at ../gdk/gdkmonitor.c:283
        __inst = 0x55a0d46c9230
        __t = <optimized out>
        __r = <optimized out>
        _g_boolean_var_11 = <optimized out>
        __func__ = "gdk_monitor_get_geometry"
#2  0x000055a0d3fd214b in notify_stack_shift_notifications (stack=stack@entry=0x55a0d4785730, nw=nw@entry=0x0, nw_l=nw_l@entry=0x0, init_width=init_width@entry=0, init_height=init_height@entry=0, nw_x=nw_x@entry=0x0, nw_y=0x0) at /usr/src/debug/mate-notification-daemon-1.27.1-3.fc38.x86_64/src/daemon/stack.c:295
        workarea = {x = 1355982848, y = 32599, width = 1344323037, height = 32599}
        monitor = {x = 226624352, y = 32766, width = 1325521063, height = 32599}
        positions = <optimized out>
        l = <optimized out>
        x = <optimized out>
        y = <optimized out>
        shiftx = 0
        shifty = 0
        i = <optimized out>
        n_wins = <optimized out>
#3  0x000055a0d3fd2633 in update_position (stack=0x55a0d4785730) at /usr/src/debug/mate-notification-daemon-1.27.1-3.fc38.x86_64/src/daemon/stack.c:368
#4  update_position_idle (stack=stack@entry=0x55a0d4785730) at /usr/src/debug/mate-notification-daemon-1.27.1-3.fc38.x86_64/src/daemon/stack.c:373
#5  0x00007f575013c4fd in g_idle_dispatch (source=0x55a0d4a1fec0, callback=0x55a0d3fd2610 <update_position_idle>, user_data=0x55a0d4785730) at ../glib/gmain.c:6163
        idle_source = 0x55a0d4a1fec0
        again = <optimized out>
#6  0x00007f57501404fc in g_main_dispatch (context=0x55a0d4713e20) at ../glib/gmain.c:3460
        dispatch = 0x7f575013c4d0 <g_idle_dispatch>
        prev_source = 0x0
        begin_time_nsec = 15485636763440
        was_in_call = 0
        user_data = 0x55a0d4785730
        callback = 0x55a0d3fd2610 <update_position_idle>
        cb_funcs = 0x7f575022b380 <g_source_callback_funcs>
        cb_data = 0x55a0d48aba60
        need_destroy = <optimized out>
        source = 0x55a0d4a1fec0
        current = 0x55a0d4739760
        i = 0
#7  g_main_context_dispatch (context=0x55a0d4713e20) at ../glib/gmain.c:4200
#8  0x00007f575019e6b8 in g_main_context_iterate.isra.0 (context=0x55a0d4713e20, block=1, dispatch=1, self=<optimized out>) at ../glib/gmain.c:4276
        max_priority = 200
        timeout = 0
        some_ready = 1
        nfds = 3
        allocated_nfds = <optimized out>
        fds = <optimized out>
        begin_time_nsec = 15485636759963
#9  0x00007f575013faff in g_main_loop_run (loop=0x55a0d4785d40) at ../glib/gmain.c:4479
        __func__ = "g_main_loop_run"
#10 0x00007f5750606975 in gtk_main () at ../gtk/gtkmain.c:1329
        loop = 0x55a0d4785d40
#11 0x000055a0d3fcde19 in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/mate-notification-daemon-1.27.1-3.fc38.x86_64/src/daemon/mnd-daemon.c:107
        daemon = 0x55a0d47454f0

zhuyaliang commented 10 months ago

@raveit65 I could not reproduce this problem. I switched workspaces while sending notifications in a loop:

#!/bin/bash
for ((i=1; i<=100; i++))
do
    notify-send "ssssssssssssssssssss"
    sleep 1
done

raveit65 commented 10 months ago

On Fedora with abrt (automatic bug-watch and bug reporting system) enabled I get at least 5 crashes per week. Other users also report it at Red Hat Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2192236 https://bugzilla.redhat.com/show_bug.cgi?id=2216910 https://bugzilla.redhat.com/show_bug.cgi?id=2238182 Those backtraces look a bit different, but I am sure they all have the same cause. Well, workspace switching was a reproducer for me.

Ignore this in rhbz reports.

Description of problem:
Just logged into the desktop.

This is because abrt reports all crashes from the last session at the beginning of a new session... not a helpful setting for developers.

joakim-tjernlund commented 1 week ago

I am still seeing these mate-notification-daemon crashes in the latest MATE 1.28. The daemon does NOT restart after the crash and my keyboard is DEAD. Is that expected?

Setup: a laptop with a USB-C dock and an external monitor. Make the external monitor Primary and position it above the laptop screen. Pull the USB-C cable and the keyboard is dead.

joakim-tjernlund commented 1 week ago

(gdb) bt
#0  g_type_check_instance_is_a (type_instance=type_instance@entry=0x556d3318e1f0, iface_type=0x556d331c5ad0 [GdkMonitor]) at /var/tmp/portage/dev-libs/glib-2.78.6/glib-2.78.6/gobject/gtype.c:4172
#1  0x00007f1658327912 in gdk_monitor_get_geometry (monitor=0x556d3318e1f0, geometry=geometry@entry=0x7ffed5eed030) at /var/tmp/portage/x11-libs/gtk+-3.24.42-r1/gtk+-3.24.42/gdk/gdkmonitor.c:283
#2  0x0000556d32c10ebf in notify_stack_shift_notifications (stack=stack@entry=0x556d332f3650, nw=nw@entry=0x556d33286080 [GtkWindow], nw_l=nw_l@entry=0x0, init_width=438, init_height=85, nw_x=nw_x@entry=0x7ffed5eed0a8, nw_y=0x7ffed5eed0ac)
    at stack.c:295
#3  0x0000556d32c116fc in notify_stack_add_window (stack=0x556d332f3650, nw=nw@entry=0x556d33286080 [GtkWindow], new_notification=new_notification@entry=1) at stack.c:395
#4  0x0000556d32c0faa9 in notify_daemon_notify_handler
    (object=0x556d332ce340, invocation=0x7f163c0053c0 [GDBusMethodInvocation], app_name=<optimized out>, id=<optimized out>, icon=0x556d3320b310 "gpm-battery-100", summary=<optimized out>, body=0x556d332debc0 "Laptop battery discharging (95%)", actions=0x556d332f3290, hints=0x7f163c0031d0, timeout=30000, user_data=0x556d33309b80) at daemon.c:1588
#5  0x0000556d32c0ca11 in _g_dbus_codegen_marshal_BOOLEAN__OBJECT_STRING_UINT_STRING_STRING_STRING_BOXED_VARIANT_INT
    (invocation_hint=<optimized out>, marshal_data=0x0, param_values=0x556d331c9d10, n_param_values=<optimized out>, return_value=0x7ffed5eed350, closure=0x556d332ca460) at mnd-dbus-generated.c:324
#10 Python Exception <class 'gdb.error'>: There is no member named v_pointer.
#11 0x0000556d32c0d1ab in _notify_daemon_notifications_skeleton_handle_method_call
    (connection=<optimized out>, sender=<optimized out>, object_path=<optimized out>, interface_name=0x7f163c0024a0 "org.freedesktop.Notifications", method_name=0x7f163c007600 "Notify", parameters=<optimized out>, invocation=0x7f163c0053c0 [GDBusMethodInvocation], user_data=0x556d332ce340) at mnd-dbus-generated.c:2098
#12 0x00007f165793f682 in g_dbus_interface_method_dispatch_helper (interface=<optimized out>, method_call_func=0x556d32c0cff0 <_notify_daemon_notifications_skeleton_handle_method_call>, invocation=0x7f163c0053c0 [GDBusMethodInvocation])
    at /var/tmp/portage/dev-libs/glib-2.78.6/glib-2.78.6/gio/gdbusinterfaceskeleton.c:620
#13 0x00007f1657924a58 in call_in_idle_cb (user_data=0x7f163c0053c0) at /var/tmp/portage/dev-libs/glib-2.78.6/glib-2.78.6/gio/gdbusconnection.c:5453
#14 0x00007f16577293b9 in g_main_dispatch (context=context@entry=0x556d331e64a0) at /var/tmp/portage/dev-libs/glib-2.78.6/glib-2.78.6/glib/gmain.c:3476
#15 0x00007f165772c527 in g_main_context_dispatch_unlocked (context=0x556d331e64a0) at /var/tmp/portage/dev-libs/glib-2.78.6/glib-2.78.6/glib/gmain.c:4284
#16 g_main_context_iterate_unlocked (context=0x556d331e64a0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at /var/tmp/portage/dev-libs/glib-2.78.6/glib-2.78.6/glib/gmain.c:4349
#17 0x00007f165772cddf in g_main_loop_run (loop=0x556d332919b0) at /var/tmp/portage/dev-libs/glib-2.78.6/glib-2.78.6/glib/gmain.c:4551
#18 0x00007f1657bf042d in gtk_main () at /var/tmp/portage/x11-libs/gtk+-3.24.42-r1/gtk+-3.24.42/gtk/gtkmain.c:1329
#19 0x0000556d32c0bd08 in main (argc=<optimized out>, argv=<optimized out>) at mnd-daemon.c:107

joakim-tjernlund commented 1 week ago

        /* If the "use-active-monitor" gsettings key is set to TRUE, then
         * get the monitor the pointer is at. Otherwise, get the monitor
         * number the user has set in gsettings. */
        if (g_settings_get_boolean(daemon->gsettings, GSETTINGS_KEY_USE_ACTIVE_MONITOR))
        {
            gint coordinate_x, coordinate_y;

            display = gdk_display_get_default ();
            seat = gdk_display_get_default_seat (display);
            pointer = gdk_seat_get_pointer (seat);

            gdk_device_get_position (pointer,
                                     &screen,
                                     &coordinate_x,
                                     &coordinate_y);
            monitor_id = gdk_display_get_monitor_at_point (gdk_screen_get_display (screen),
                                                           coordinate_x,
                                                           coordinate_y);
        }
        else
        {
            screen = gdk_display_get_default_screen(gdk_display_get_default());
            monitor_id = gdk_display_get_monitor (gdk_display_get_default(),
                                  g_settings_get_int(daemon->gsettings, GSETTINGS_KEY_MONITOR_NUMBER));
        }

        if (_gtk_get_monitor_num (monitor_id) >= daemon->screen->n_stacks)
        {
            /* screw it - dump it on the last one we'll get
             a monitors-changed signal soon enough*/
            monitor_id = gdk_display_get_monitor (gdk_display_get_default(), (int) daemon->screen->n_stacks - 1);
        }

        notify_stack_add_window (daemon->screen->stacks[_gtk_get_monitor_num (monitor_id)], nw, new_notification);

Here I think it gets confused:

(gdb) print daemon->screen->n_stacks
$11 = 1
(gdb) print *monitor_id
$18 = {parent = {g_type_instance = {g_class = 0x556d331cad40 [g_type: GdkX11Monitor/GdkMonitor]}, ref_count = 1, qdata = 0x0}, display = 0x556d331c6290 [GdkX11Display], manufacturer = 0x556d3318e640 "CMN", model = 0x556d3318e2f0 "eDP-1", 
  connector = 0x556d3318e620 "eDP-1", geometry = {x = 0, y = 0, width = 1920, height = 1200}, width_mm = 301, height_mm = 188, scale_factor = 1, refresh_rate = 60002, subpixel_layout = GDK_SUBPIXEL_LAYOUT_UNKNOWN}

From here I am lost
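The index check in the quoted daemon.c snippet can be isolated as a small pure function. This is a minimal sketch with a hypothetical helper name (clamp_stack_index), not code from the daemon. It only covers choosing which stack to use. It says nothing about whether the monitor pointer stored inside the chosen stack is still valid, which is where the backtrace actually faults. The handling of a negative monitor number is an assumption added for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: map a monitor number onto a valid stack index.
 * Out-of-range numbers fall back to the last stack, as the quoted
 * snippet does; negative numbers fall back to the first stack
 * (an assumption made for this sketch). */
size_t clamp_stack_index(int monitor_num, size_t n_stacks)
{
    assert(n_stacks > 0);               /* there must be a stack to pick */
    if (monitor_num < 0)
        return 0;
    if ((size_t) monitor_num >= n_stacks)
        return n_stacks - 1;
    return (size_t) monitor_num;
}
```

With n_stacks = 1, as in the gdb session above, every input maps to index 0, so the index selection itself looks fine in that session.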

joakim-tjernlund commented 1 week ago

Here is an idea, seems like on_screen_monitors_changed() is busted when deleting a monitor:

    else if (n_monitors < (int) nscreen->n_stacks)
    {
        NotifyStack* last_stack;

        last_stack = nscreen->stacks[n_monitors - 1];

        /* transfer items before removing stacks */
        for (i = n_monitors; i < (int) nscreen->n_stacks; i++)
        {
            NotifyStack* stack = nscreen->stacks[i];
            GList* windows = g_list_copy(notify_stack_get_windows(stack));
            GList* l;

            for (l = windows; l != NULL; l = l->next)
            {
                /* skip removing the window from the old stack since it will try
                 * to unrealize the window.
                 * And the stack is going away anyhow. */
                notify_stack_add_window(last_stack, l->data, TRUE);
            }

            g_list_free(windows);
            notify_stack_destroy(stack);
            nscreen->stacks[i] = NULL;
        }

        /* remove the extra stacks */
        nscreen->stacks = g_renew(NotifyStack*, nscreen->stacks, (gsize) n_monitors);
        nscreen->n_stacks = (gsize) n_monitors;
    }

This just deletes the last monitor in the stacks array, not the one that actually got disconnected. I don't know much about gtk+ etc. to fix it, anyone else?
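The concern about trailing-index removal can be illustrated with a tiny plain-C model. It assumes each stack remembers the monitor it was created for, represented here by an integer id. The Stack type and both functions are hypothetical stand-ins, not the daemon's NotifyStack or its API.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for NotifyStack: only the monitor it was built for. */
typedef struct { int monitor_id; } Stack;

/* Same shape as the quoted branch: keep the first n_monitors entries
 * and drop the rest, regardless of which monitor went away. */
size_t truncate_stacks(size_t n_stacks, size_t n_monitors)
{
    assert(n_monitors <= n_stacks);
    return n_monitors;
}

/* Count kept stacks whose monitor is no longer in the live list. */
size_t count_stale(const Stack *stacks, size_t n_stacks,
                   const int *live, size_t n_live)
{
    size_t stale = 0;
    for (size_t i = 0; i < n_stacks; i++) {
        int found = 0;
        for (size_t j = 0; j < n_live; j++)
            if (stacks[i].monitor_id == live[j])
                found = 1;
        if (!found)
            stale++;
    }
    return stale;
}
```

In this model, with stacks for monitors 1 and 2 and monitor 1 unplugged, truncating to one entry keeps the stack for monitor 1, which no longer exists. If monitor 2 is unplugged instead, nothing stale remains. Whether the real daemon behaves this way depends on how its stacks map to monitors, which this thread does not settle.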

lukefromdc commented 1 week ago

mate-notification-daemon has nothing to do with the keyboard problem you are getting, it never interacts with it. Sounds like you may have some kind of USB issue possibly in the kernel. Likely cause would be something specific to your make and model. Kernel devs try to keep up with all the errata in laptop components but it seems there is always more. At some point a newer kernel might fix the keyboard issue.

The notification daemon crashing is a separate issue. Since it's crashing when you remove the second monitor, try as a test setting the notification popup position to stay on the laptop's screen. Also try "use active monitor." It's entirely possible the notification daemon is still trying to find the monitor that is disconnected when you unplug the dock. If the crash is unconditional when one of two monitors is disconnected, then indeed the wrong monitor is being deleted from the data m-s-d is expecting to find.

joakim-tjernlund commented 1 week ago

Also the mouse, I can move the cursor but clicking on anything does nothing. This problem has been around for a long time and KDE does not have this problem.

I did try changing "use active monitor" and "monitor number" but it didn't help. What do you think about on_screen_monitors_changed() and the deleting part?

lukefromdc commented 1 week ago

Your gdb printout from (gdb) print *monitor_id does in fact find a valid monitor

$18 = {parent = {g_type_instance = {g_class = 0x556d331cad40 [g_type: GdkX11Monitor/GdkMonitor]}, ref_count = 1, qdata = 0x0}, display = 0x556d331c6290 [GdkX11Display], manufacturer = 0x556d3318e640 "CMN", model = 0x556d3318e2f0 "eDP-1", 
  connector = 0x556d3318e620 "eDP-1", geometry = {x = 0, y = 0, width = 1920, height = 1200}, width_mm = 301, height_mm = 188, scale_factor = 1, refresh_rate = 60002, subpixel_layout = GDK_SUBPIXEL_LAYOUT_UNKNOWN}

Only the subpixel layout is unknown, which is probably not an issue given that the monitor works at all.

We are finding ONE monitor so presumably this is after the "extra monitors" are removed from the stack. You are getting the crash unconditionally, so that could well be something in on_screen_monitors_changed() and the code you posted looks like it would do the wrong thing under some conditions, but presumably the last monitor added (the external one) will be the last one on stack and thus first deleted.

This could potentially break down if the laptop started with both attached, both were active through the boot process and both on by default when the xserver (or wayland compositor in that case) started, never passing through a point where only the internal monitor was running. In that case monitors could be added in either order. Hot-switching a desktop between two monitors by going A->A+B->B would be likely to trigger this every time if the code is in fact removing the last monitor.
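One way to make the removal independent of the order in which monitors were added is to drop stacks by monitor identity instead of by position. This is a minimal plain-C sketch assuming each stack records an integer id for its monitor. StackEntry, is_live and compact_live_stacks are hypothetical names, and the real code would also have to migrate windows and destroy the dropped stacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a notification stack (hypothetical). */
typedef struct { int monitor_id; } StackEntry;

/* True if monitor_id appears in the list of currently attached monitors. */
static bool is_live(int monitor_id, const int *live, size_t n_live)
{
    for (size_t j = 0; j < n_live; j++)
        if (live[j] == monitor_id)
            return true;
    return false;
}

/* Keep only stacks whose monitor is still attached, preserving order.
 * Returns the new number of stacks. */
size_t compact_live_stacks(StackEntry *stacks, size_t n_stacks,
                           const int *live, size_t n_live)
{
    assert(stacks != NULL || n_stacks == 0);
    size_t kept = 0;
    for (size_t i = 0; i < n_stacks; i++)
        if (is_live(stacks[i].monitor_id, live, n_live))
            stacks[kept++] = stacks[i];
    return kept;
}
```

In the A->A+B->B case, the surviving stack is the one for B whether the array was built as {A, B} or {B, A}.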

The issue you are getting with the mouse if it is only in MATE and with the same kernel when testing KDE suggests marco is getting bad data that other window managers and/or hardware compositors do not get or is able to ignore. I can freely connect and disconnect monitors without the mouse ever becoming unresponsive so cannot duplicate that. Since I build my own kernels it is almost impossible that we are using the same kernel though.

Neither the mouse nor the keyboard issues could possibly have anything to do with the notification daemon, but this does suggest some kernel issues. If you are testing KDE from a different install, you could be using a different kernel build or version, raising the possibility that marco and software compositing are not causing that either.

joakim-tjernlund commented 1 week ago

The problem has existed for many kernel revisions (6.6.x), possibly always, and we have both KDE and MATE installed on the same machine. The monitors are connected from boot and the external one is made Primary. W.r.t. the keyboard, I can switch to a new virtual console (Ctrl+Alt+Fx) and the keyboard works there.

To me this is a problem in MATE (as KDE works), and we tested on two different laptops (both Lenovo, though). I suspect it is connected to USB-C somehow; we had mechanical docks before (no USB-C) and those worked.

It is a pity though that the mate-notification-daemon crash probably isn't connected as that is the only clue I got. I guess it could be marco deadlocking somehow.

BTW, you tested with the external monitor as Primary as well?

lukefromdc commented 1 week ago

I was on a desktop with two monitors. My current laptop won't run an external monitor at all, it seems. Also, all my work is bare metal on live systems, so I don't test any config change that could force a reinstall from the backup partition or remaking all user config files, due to the risk. I have never worked with VMs.

I do not own any hardware that uses a dock. Someone who can duplicate this will have to work on the keyboard and mouse problems; I have no further ideas on that from my end.

joakim-tjernlund commented 1 week ago

If I just pull the DP cable to my external (Primary monitor) I get the same problem. Also, strace on marco:

strace -p 2878
strace: Process 2878 attached
restart_syscall(<... resuming interrupted read ...>

marco just hangs in a system call

joakim-tjernlund commented 1 week ago

Seems like monitor placement (mate-display-properties) is important. Above or to the Left gets me trouble.

lukefromdc commented 1 week ago

Above or to the left triggers the crash? This suggests that the removed external monitor is being configured with its top left corner above and to the left of the internal monitor, and that these coordinates are not being updated when the monitor is changed. What is the configuration position of the external monitor?

joakim-tjernlund commented 1 week ago

Above or to the left triggers the crash? This suggests that the removed external monitor is being configured with its top left corner above and to the left of the internal monitor, and that these coordinates are not being updated when the monitor is changed. What is the configuration position of the external monitor?

Yes, directly above or due left, both are 1920x1200 so they align. Not sure what you mean with "configuration position" other than what I have stated, can you be more specific?

lukefromdc commented 1 week ago

If you had dissimilar monitors or different starting positions (such as one above/below or left/right of the other), positions valid in one monitor would be invalid in the other. With two identical outputs starting at 0,0 this is not the case
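The point about positions being valid on one monitor and invalid on another comes down to a rectangle containment test. This is a minimal sketch in plain C. The Geometry struct mirrors the x/y/width/height fields shown in the gdb output earlier in the thread, but geometry_contains is a hypothetical helper, not a GDK or daemon function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Monitor rectangle in desktop coordinates. */
typedef struct { int x, y, width, height; } Geometry;

/* True if point (px, py) lies inside the monitor rectangle.
 * The right and bottom edges are exclusive. */
bool geometry_contains(const Geometry *g, int px, int py)
{
    assert(g != NULL);
    return px >= g->x && px < g->x + g->width &&
           py >= g->y && py < g->y + g->height;
}
```

For example, the point (1500, 900) lies inside a 1920x1200 monitor at 0,0 but outside a 1366x768 monitor at 0,0. For two identical rectangles both starting at 0,0 the test gives the same answer for every point, which matches the observation above.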

joakim-tjernlund commented 6 days ago

If I move the external monitor Above and a tiny bit to the Right so it does not align any more, it works fine!

joakim-tjernlund commented 6 days ago

Feels like x axis <= 0 on primary monitor triggers this bug. Where in MATE would this be handled?

lukefromdc commented 6 days ago

This looks like we need to catch the zero/negative case in mate-settings-daemon itself. If the popup positioning code were to, say, set all negative values to zero and then attempt to divide by that value, we would get a crash, but probably with a different backtrace. Forcing a 1,1 minimum positive value for popup position would block that particular crash.
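The "1,1 minimum" suggestion can be written as a small clamp. This is a minimal sketch only: clamp_popup_position is a hypothetical helper, and nothing in the backtraces confirms that a division by a zero coordinate occurs anywhere or that this is where the crash originates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: force both popup coordinates to be at least 1,
 * so that later arithmetic never sees a zero or negative value. */
void clamp_popup_position(int *x, int *y)
{
    assert(x != NULL && y != NULL);
    if (*x < 1)
        *x = 1;
    if (*y < 1)
        *y = 1;
}
```

A position of (0, -5) becomes (1, 1), while (10, 20) is left unchanged.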

joakim-tjernlund commented 6 days ago

This looks like we need to catch the zero/negative case in mate-settings-daemon itself. If the popup positioning code were to, say, set all negative values to zero and then attempt to divide by that value, we would get a crash, but probably with a different backtrace. Forcing a 1,1 minimum positive value for popup position would block that particular crash.

Do you want to give it a try? I have no clue how mate-settings-daemon works.

lukefromdc commented 5 days ago

At the moment I am rather saturated with other things, so I cannot guarantee time for it. Note that I do not own two identical monitors for any machine, so I might not be able to duplicate this.