Betterbird / thunderbird-patches

Betterbird is a fork of Mozilla Thunderbird. Here are the patches that provide all the goodness.

Segmentation fault/crash when using system tray notification under Gnome/Wayland #266

Closed DutchFlander closed 6 months ago

DutchFlander commented 6 months ago

Hi,

I've copied my Thunderbird profile to a new place and am using Betterbird with it. It is much faster and has no hangups like in TB, but from time to time I'm experiencing sudden quits. I tried running it from a console, and so far this is what I can see:

~$ ~/Applications/betterbird/betterbird
Opening in existing browser session.
Gdk-Message: 12:00:39.386: Unable to load split_h from the cursor theme
[Parent 208998, Main Thread] WARNING: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed: 'glib warning', file /home/betterbird/build115/mozilla-esr115/toolkit/xre/nsSigHandlers.cpp:167

(betterbird:208998): Gtk-CRITICAL **: 12:06:41.200: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed
[Parent 208998, Main Thread] WARNING: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed: 'glib warning', file /home/betterbird/build115/mozilla-esr115/toolkit/xre/nsSigHandlers.cpp:167

(betterbird:208998): Gtk-CRITICAL **: 12:15:32.743: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed
[Parent 208998, Main Thread] WARNING: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed: 'glib warning', file /home/betterbird/build115/mozilla-esr115/toolkit/xre/nsSigHandlers.cpp:167

(betterbird:208998): Gtk-CRITICAL **: 12:15:32.767: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed
[Parent 208998, Main Thread] WARNING: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed: 'glib warning', file /home/betterbird/build115/mozilla-esr115/toolkit/xre/nsSigHandlers.cpp:167

(betterbird:208998): Gtk-CRITICAL **: 12:15:32.795: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed
[Parent 208998, Main Thread] WARNING: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed: 'glib warning', file /home/betterbird/build115/mozilla-esr115/toolkit/xre/nsSigHandlers.cpp:167

(betterbird:208998): Gtk-CRITICAL **: 12:15:38.759: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed
[Parent 208998, Main Thread] WARNING: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed: 'glib warning', file /home/betterbird/build115/mozilla-esr115/toolkit/xre/nsSigHandlers.cpp:167

(betterbird:208998): Gtk-CRITICAL **: 12:15:39.216: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed
[Parent 208998, Main Thread] WARNING: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed: 'glib warning', file /home/betterbird/build115/mozilla-esr115/toolkit/xre/nsSigHandlers.cpp:167

(betterbird:208998): Gtk-CRITICAL **: 12:15:39.325: gtk_widget_get_scale_factor: assertion 'GTK_IS_WIDGET (widget)' failed
Opening in existing browser session.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Segmentation fault

Is there any other method to have more detailed debug output or log to find out what should be the issue here?

Version used: Betterbird 115.7.0-bb23
I use it with Owl for Exchange extension
OS: Debian 12.4
Linux *** 6.1.0-17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30) x86_64 GNU/Linux
The package I'm using is not built, but downloaded and extracted from linux archive: https://www.betterbird.eu/downloads/get.php?os=linux&lang=en-US&version=release

The issue usually occurs randomly after a few hours: without any warning, Betterbird just quits.

Thanks for the help in advance, Dutch

Betterbird commented 6 months ago

Likely related to the system tray and Gnome, see:

https://github.com/Betterbird/thunderbird-patches/issues/235#issuecomment-1862482806

DutchFlander commented 6 months ago

Thanks, I disabled the tray icon and restarted to see what happens. BTW, when I put my laptop to sleep, Betterbird isn't running; I always quit it beforehand.

Let's see if this solves the issue or not, I'll report back soon.

Betterbird commented 6 months ago

Are you using Gnome and/or Wayland?

DutchFlander commented 6 months ago

Both: image

Betterbird commented 6 months ago

Then you're in trouble.

DutchFlander commented 6 months ago

No more sudden quits when no tray icon is enabled, so the tray icon is the cause of this issue. Is there any plan to fix it, or is it a Gnome/Wayland bug? If not, I don't mind and this ticket can be closed.

Thanks

Betterbird commented 6 months ago

Yes, it's a Gnome/Wayland bug, or at least triggered by this combination. I'm not even sure whether Wayland is officially supported by the Mozilla platform.

We'd like to fix it, but someone would have to debug it to see where it crashes.

Betterbird commented 6 months ago

I've closed issue #235 and we continue here. Is there anyone who can run a debug version and get a stack of the crash? @mfschumann, have you seen this issue?

For debugging you can install the eu.betterbird.Betterbird.Debug flatpak and then follow this guide to run Betterbird under gdb:

https://blogs.gnome.org/mclasen/2017/01/20/debugging-a-flatpak-application/

mfschumann commented 6 months ago

Yes, I had random crashes too. It was not too frequent, so I never gave much thought to it. I'll try to get a stack trace of such a crash.

mfschumann commented 6 months ago

When resuming from suspend, the previously running BB was gone. This is what gdb outputs for the crash:

Thread 9 "Socket Thread" received signal SIGPIPE, Broken pipe.
[Switching to Thread 0x7fffea5fe6c0 (LWP 35)]
0x00007ffff7b2bcba in send () from /usr/lib/x86_64-linux-gnu/libc.so.6
(gdb) Exiting due to channel error.
Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: CompositorBridgeChild receives IPC close with reason=AbnormalShutdown (t=5841.99) Exiting due to channel error.

Is this enough for starting to debug or do you have any suggestions to get more verbose info?

mfschumann commented 6 months ago

This is what where and list return after the crash:

(gdb) where
#0  0x00007ffff7b2bcba in send () at /usr/lib/x86_64-linux-gnu/libc.so.6
(gdb) list
277     // argc & argv will be updated with the values passing from the
278     // chrome process.  With the new values, this function
279     // continues the reset of the code acting as a content process.
280     if (gBootstrap->XRE_ForkServer(&argc, &argv)) {
281       // Return from the fork server in the fork server process.
282       // Stop the fork server.
283       gBootstrap->NS_LogTerm();
284       return 0;
285     }
286     // In a content process forked from the fork server.

I also tried to save a stack trace as described here but the crash_bt.log file was not created after running the commands.

Betterbird commented 6 months ago

Thanks for trying, but this doesn't look like it is anywhere near BB's code.

You can see that BB code includes libayatana-appindicator in this patch: https://github.com/Betterbird/thunderbird-patches/blob/main/115/features/12-feature-linux-systray.patch

The library does the "magic" via DBUS calls, for example, search for g_dbus_connection_emit_signal() calls.

It doesn't do any pipes or forks. If I see this correctly, it crashes in libc, so it's nothing in our control. To get the call stack in gdb, type bt or backtrace. DumpJSStack() is a Mozilla function that will dump out the JS call stack, since Mozilla code alternates between JS execution and C++ execution via XPCOM. However, that's not relevant here since all the system tray stuff is programmed in C++ and perhaps C, so just getting the stack trace is the way to go.

If it crashes again, please type bt so we can see what the caller of that libc code is.

mfschumann commented 6 months ago

I got another crash with a more helpful trace from gdb. This time it seems related to the systray icon:

#6  0x00007ffff0dc440c in app_indicator_set_tooltip_full (self=0x7fffc8f55920, icon_name=0x0, title=0x7fffb12a0408 "1 ungelesene Nachricht\nPosteingang: 1", body=0x0) at /run/build/betterbird/comm/third_party/appindicator/app-indicator.c:2406
#7  0x00007ffff0b62bdb in nsMessengerUnixIntegration::UpdateUnreadCount(unsigned int, nsTSubstring<char16_t> const&) (this=<optimized out>, unreadCount=1, unreadTooltip=<optimized out>) at /run/build/betterbird/comm/mailnews/base/src/nsMessengerUnixIntegration.cpp:258

Full output of bt: trace.txt

Output of thread apply all bt full (incomplete, as gdb crashed during execution): trace_full.txt

Betterbird commented 6 months ago

Thanks @mfschumann! The full trace is not required. The normal trace has the relevant information:

#0  0x00007ffff6fd56e6 in gtk_status_icon_set_tooltip_markup () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#1  0x00007ffff0dc6ac6 in tooltip_changes (self=<optimized out>, data=0x7fffc08b91d0) at /run/build/betterbird/comm/third_party/appindicator/app-indicator.c:1979
#2  0x00007ffff6a6a4ea in g_closure_invoke () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#3  0x00007ffff6a99b86 in signal_emit_unlocked_R.isra.0 () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#4  0x00007ffff6a8a92e in g_signal_emit_valist () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#5  0x00007ffff6a8ac03 in g_signal_emit () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#6  0x00007ffff0dc440c in app_indicator_set_tooltip_full (self=0x7fffc8f55920, icon_name=0x0, title=0x7fffb12a0408 "1 ungelesene Nachricht\nPosteingang: 1", body=0x0) at /run/build/betterbird/comm/third_party/appindicator/app-indicator.c:2406
#7  0x00007ffff0b62bdb in nsMessengerUnixIntegration::UpdateUnreadCount(unsigned int, nsTSubstring<char16_t> const&) (this=<optimized out>, unreadCount=1, unreadTooltip=<optimized out>) at /run/build/betterbird/comm/mailnews/base/src/nsMessengerUnixIntegration.cpp:258

So we come to update the tooltip since new mail has arrived:

1 ungelesene Nachricht
Posteingang: 1

That calls app_indicator_set_tooltip_full(), which emits some signals. Eventually we come back into our code in tooltip_changes() and call gtk_status_icon_set_tooltip_markup(), which crashes in libgtk-3.so.0. Hmm, unless we pass wrong parameters, which is unlikely since it works without Wayland, it's beyond our control.

If you have more crashes, can you please provide more backtraces (only the last few lines) to confirm that it always crashes the same way?

mfschumann commented 6 months ago

I got some more traces from recent crashes: trace.txt

They all seem to take the same path, which is however slightly different from the one posted above. Here, after calling gtk_status_icon_set_tooltip_markup(), WasmTrapHandler is being executed to handle the signal generated in libgtk-3.so.0?

#0  0x00007f00c26a3e14 in __pthread_kill_implementation () at /usr/lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f00c2651dce in raise () at /usr/lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f00bb702d11 in nsProfileLock::FatalSignalHandler(int, siginfo_t*, void*) (signo=11, info=0x7fff2eb91a30, context=<optimized out>) at /run/build/betterbird/toolkit/profile/nsProfileLock.cpp:174
#3  0x00007f00bc296c5a in WasmTrapHandler(int, siginfo_t*, void*) (signum=11, info=0x7fff2eb91a30, context=0x7fff2eb91900) at /run/build/betterbird/js/src/wasm/WasmSignalHandlers.cpp:794
#4  0x00007f00c2651e80 in <signal handler called> () at /usr/lib/x86_64-linux-gnu/libc.so.6
#5  0x00007f00c1bd56e6 in gtk_status_icon_set_tooltip_markup () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#6  0x00007f00bb9c6ac6 in tooltip_changes (self=<optimized out>, data=0x7f00886c59f0) at /run/build/betterbird/comm/third_party/appindicator/app-indicator.c:1979
#7  0x00007f00c166a4ea in g_closure_invoke () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#8  0x00007f00c1699b86 in signal_emit_unlocked_R.isra.0 () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#9  0x00007f00c168a92e in g_signal_emit_valist () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#10 0x00007f00c168ac03 in g_signal_emit () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#11 0x00007f00bb9c440c in app_indicator_set_tooltip_full (self=0x7f0093654b60, icon_name=0x0, title=0x7f007cd97f88 "1 ungelesene Nachricht\nPosteingang: 1", body=0x0) at /run/build/betterbird/comm/third_party/appindicator/app-indicator.c:2406

Betterbird commented 6 months ago

Thanks, that's very helpful, we'll check what's happening in the code. Weird that it only crashes sometimes and only on Wayland. You do normally get working tooltips when hovering over the icon in the system tray, without a crash, right?

mfschumann commented 6 months ago

No, tooltips don't work in Gnome because libappindicator does not support them.

Betterbird commented 6 months ago

It's the other way around. libappindicator doesn't support tooltips, but in BB we merged a PR so it does. How else would they be working in KDE and Xfce? The "AppIndicator and KStatusNotifierItem Support" extension doesn't support them, as per the link you quoted: "GNOME designers decided not to have tooltips in the shell and I'd like to honor that decision."

So in effect, we're crashing the system for something that doesn't work in the first place. We can fix that: If on Gnome, don't even try to set a tooltip. Case closed.

Betterbird commented 6 months ago

Commit https://github.com/Betterbird/thunderbird-patches/commit/cf7c513584195e2371b65c2486c4eb472069bfdc should fix this issue. @mfschumann, if you have time, you can build a Flatpak for your own use to see whether the crashes are gone. The fix is trivial: we simply don't set tooltips under Gnome.
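
For readers who want a feel for what "no tooltips under Gnome" could look like, here is a minimal sketch. It is not the code from the commit above. It assumes the desktop is detected through the XDG_CURRENT_DESKTOP environment variable (a colon-separated list such as "ubuntu:GNOME"), and the names DesktopListContainsGnome() and ShouldSetTrayTooltip() are hypothetical helpers made up for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical helper, NOT taken from the Betterbird patch.
// Returns true if a colon-separated desktop list (the format used by
// XDG_CURRENT_DESKTOP, e.g. "ubuntu:GNOME") contains an entry that is
// exactly "GNOME", compared case-insensitively.
bool DesktopListContainsGnome(const std::string& xdgCurrentDesktop) {
  std::istringstream entries(xdgCurrentDesktop);
  std::string entry;
  while (std::getline(entries, entry, ':')) {
    std::transform(entry.begin(), entry.end(), entry.begin(),
                   [](unsigned char c) {
                     return static_cast<char>(std::tolower(c));
                   });
    if (entry == "gnome") {
      return true;
    }
  }
  return false;
}

// Hypothetical wrapper (assumption): decide once whether the tray code
// should attempt to set a tooltip at all. If the variable is unset, we
// assume a non-Gnome desktop and keep tooltips enabled.
bool ShouldSetTrayTooltip() {
  const char* desktop = std::getenv("XDG_CURRENT_DESKTOP");
  return !(desktop != nullptr && DesktopListContainsGnome(desktop));
}
```

A caller such as the unread-count update would then check ShouldSetTrayTooltip() before calling app_indicator_set_tooltip_full(), so the crashing path in gtk_status_icon_set_tooltip_markup() is never reached on Gnome.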

mfschumann commented 6 months ago

Thanks. I built a flatpak based on the commit and have tested for a day, including a couple of suspend/resume cycles. I have not had any crashes so far, so I think this issue can be closed.

Betterbird commented 6 months ago

Thanks for building/testing. We'll close it when we ship the fix so other people may find it in the meantime.

Betterbird commented 6 months ago

Should be fixed in 115.8.0-bb24.

DutchFlander commented 6 months ago

No issue since yesterday, not even after sleep on Gnome/Wayland :)

Thanks!

Betterbird commented 6 months ago

Sorry it took so long to fix. We had issue #235 on file since December 2023, but without someone debugging it, we couldn't act on it. Special thanks to @mfschumann for getting us the required crash stack dumps.