wmww / gtk4-layer-shell

A library to create panels and other desktop components for Wayland using the Layer Shell protocol and GTK4
MIT License

Window.set_visible segfault with gtk4_layer_shell #23

Closed fr33zing closed 8 months ago

fr33zing commented 1 year ago

Bug description

The segfault only occurs after using gtk4_layer_shell::init_for_window to set up a window. Calling gtk::Window.set_visible then has a small chance of causing a segfault. I've included a minimal repro that creates a window, initializes it with layer shell, and then toggles its visibility rapidly so that the segfault occurs sooner.

Further details

For further details including repro code (rust) and logs, please see here:

https://github.com/gtk-rs/gtk4-rs/issues/1402

fr33zing commented 1 year ago

I've determined that this is not an issue with the Rust bindings for gtk4 or layer shell (my original thought), but rather an issue with either the gtk4-layer-shell or the gtk4 library.

This repro in C also causes the segfault:

#include <gtk/gtk.h>
#include <gtk4-layer-shell/gtk4-layer-shell.h>
#include <stdio.h>

#define PROJECT_NAME "gtk4-layer-shell-segfault-repro-c"
#define TOGGLE_VISIBILITY_INTERVAL_MS 100

GtkWidget *window;
int count = 0;

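static gboolean toggle_visible(gpointer user_data) {
  /* Flip the window's current visibility on every timer tick. */
  gboolean visible = !gtk_widget_get_visible(window);
  gtk_widget_set_visible(window, visible);
  count++;
  printf("<%d, %s>\n", count, visible ? "true" : "false");

  /* Keep the timeout source installed so the toggle repeats. */
  return G_SOURCE_CONTINUE;
}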

static void activate(GtkApplication *app, gpointer user_data) {
  window = gtk_application_window_new(app);

  printf("calling gtk_layer_init_for_window\n");
  gtk_layer_init_for_window(GTK_WINDOW(window));

  printf("toggling visibility rapidly\n");
  g_timeout_add(TOGGLE_VISIBILITY_INTERVAL_MS, toggle_visible, user_data);
}

int main(int argc, char **argv) {
  GtkApplication *app =
      gtk_application_new("lets.cause.a.segfault", G_APPLICATION_DEFAULT_FLAGS);
  g_signal_connect(app, "activate", G_CALLBACK(activate), NULL);

  int status = g_application_run(G_APPLICATION(app), argc, argv);
  g_object_unref(app);

  return status;
}

fr33zing commented 1 year ago

I've also filed this issue for GTK: https://gitlab.gnome.org/GNOME/gtk/-/issues/5881

wmww commented 1 year ago

Sorry for the delay. As I said in the other issue, it's probably best to close the issue on GTK, as they can't help with gtk-layer-shell problems. This library does some weird stuff, so any crash involving it should be assumed to be its fault until proven otherwise.

I ran the example C code on Ubuntu 22.04 with various versions of GTK and libwayland built from source without being able to repro. Will try again in a couple weeks on Arch once I get home from my current trip.

In the meantime, if someone who can repro can get it to crash in GDB and post a backtrace, that would be helpful, especially if you have debugging symbols for GTK and/or libwayland (either installed or from a copy of those libraries built with them).

fr33zing commented 1 year ago

"Sorry for the delay."

Don't worry about it. I really appreciate your hard work on gtk4-layer-shell and I'm happy that you're looking into this issue.

I've closed the issue on the GNOME GitLab instance. Hopefully I didn't waste too much of their time. I will try to be more considerate next time I have a bug to file. As for the repro, I'll see if I can test it in some more environments. I did not test it on GNOME, so it could be that it works fine on GNOME and crashes on other Wayland compositors. If you didn't see it, I posted a GDB backtrace here. I'm not sure how to tell whether it includes the symbols you mentioned. If this backtrace is insufficient, please let me know and I'll attempt to get another.

This isn't urgent for me, so please enjoy your trip.

ErikReider commented 1 year ago

Getting something similar:

free(): invalid pointer

Thread 1 "swaync" received signal SIGABRT, Aborted.
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0)
    at pthread_kill.c:44
44       return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
--Type <RET> for more, q to quit, c to continue without paging--Quit
(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0)
    at pthread_kill.c:44
#1  0x00007ffff6c688b3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2  0x00007ffff6c17abe in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007ffff6c0087f in __GI_abort () at abort.c:79
#4  0x00007ffff6c0160f in __libc_message (fmt=fmt@entry=0x7ffff6d7b52f "%s\n")
    at ../sysdeps/posix/libc_fatal.c:150
#5  0x00007ffff6c72775 in malloc_printerr (str=str@entry=0x7ffff6d78fcc "free(): invalid pointer")
    at malloc.c:5651
#6  0x00007ffff6c74654 in _int_free (av=<optimized out>, p=p@entry=0xb50ec0, have_lock=have_lock@entry=0)
    at malloc.c:4425
#7  0x00007ffff6c771ce in __GI___libc_free (mem=mem@entry=0xb50ed0) at malloc.c:3367
#8  0x00007ffff7c87895 in g_free (mem=0xb50ed0) at ../glib/gmem.c:232
#9  0x00007ffff7c816a9 in g_source_unref_internal (source=0xc7ca60, context=0x50a550, have_lock=<optimized out>)
    at ../glib/gmain.c:2400
#10 0x00007ffff7c834d8 in g_main_dispatch (context=0x50a550) at ../glib/gmain.c:3494
#11 g_main_context_dispatch (context=0x50a550) at ../glib/gmain.c:4200
#12 0x00007ffff7ce1438 in g_main_context_iterate.isra.0
    (context=0x50a550, block=1, dispatch=1, self=<optimized out>) at ../glib/gmain.c:4276
#13 0x00007ffff7c80a23 in g_main_context_iteration (context=context@entry=0x50a550, may_block=may_block@entry=1)
    at ../glib/gmain.c:4343
#14 0x00007ffff7eb7ffd in g_application_run (application=0x66c910, argc=<optimized out>, argv=0x0)
    at ../gio/gapplication.c:2573
#15 0x00000000004116eb in sway_notification_center_main (args=0x7fffffffd158, args_length1=7)
    at ../src/main.vala:82
#16 0x0000000000411725 in main (argc=7, argv=0x7fffffffd158) at ../src/main.vala:13

wmww commented 10 months ago

Finally back to this. Unable to reproduce with the C example code.

Distro: Arch
gtk-layer-shell: current main (same commit as 1.0.1)
gtk: 4.12.1
compositors:

I've let it run for at least a thousand iterations on both Sway and Hyprland with no crash. Can you test again with up-to-date versions of everything to see if you can still repro?

MalpenZibo commented 9 months ago

I have the same issue. In the next few days I'll try to create a simple example to replicate the issue.

(Arch Linux + Hyprland)

MalpenZibo commented 9 months ago

OK, I'm able to replicate the issue with the following Rust code:

use gdk::prelude::ApplicationExt;
use gdk::prelude::ApplicationExtManual;
use gtk::traits::ButtonExt;
use gtk::{
    traits::{GtkWindowExt, WidgetExt},
    ApplicationWindow,
};

pub fn main() {
    let app = gtk::Application::builder().application_id("test").build();

    app.connect_startup(move |app| {
        let window = build_surface(app);

        let window2 = build_surface2(app);
        let label = gtk::Label::new(Some("Child panel"));
        window2.set_child(Some(&label));

        let button = gtk::Button::with_label("Open child panel");

        button.connect_clicked(move |_| {
            if window2.is_visible() {
                window2.hide();
            } else {
                window2.show();
            }
        });
        window.set_child(Some(&button));

        app.connect_activate(move |_| window.show());
    });

    app.run();
}

fn build_surface(app: &gtk::Application) -> ApplicationWindow {
    let window = gtk::ApplicationWindow::new(app);
    window.set_default_size(-1, 50);

    gtk4_layer_shell::init_for_window(&window);
    gtk4_layer_shell::set_layer(&window, gtk4_layer_shell::Layer::Overlay);

    gtk4_layer_shell::auto_exclusive_zone_enable(&window);

    gtk4_layer_shell::set_anchor(&window, gtk4_layer_shell::Edge::Left, true);
    gtk4_layer_shell::set_anchor(&window, gtk4_layer_shell::Edge::Top, true);
    gtk4_layer_shell::set_anchor(&window, gtk4_layer_shell::Edge::Right, true);

    window
}

fn build_surface2(app: &gtk::Application) -> ApplicationWindow {
    let window = gtk::ApplicationWindow::new(app);
    window.set_default_size(250, 250);

    gtk4_layer_shell::init_for_window(&window);
    gtk4_layer_shell::set_layer(&window, gtk4_layer_shell::Layer::Overlay);

    gtk4_layer_shell::set_anchor(&window, gtk4_layer_shell::Edge::Left, true);
    gtk4_layer_shell::set_anchor(&window, gtk4_layer_shell::Edge::Top, true);
    gtk4_layer_shell::set_anchor(&window, gtk4_layer_shell::Edge::Right, true);
    gtk4_layer_shell::set_anchor(&window, gtk4_layer_shell::Edge::Bottom, true);

    window
}

After opening and closing the child panel a few times (the number varies every time: sometimes it crashes on the first click, sometimes after more than 10 opens and closes), I get:

Gsk-CRITICAL **: 14:07:05.369: gsk_render_node_unref: assertion 'GSK_IS_RENDER_NODE (node)' failed
[1]    261357 segmentation fault (core dumped)

I'm using Arch Linux and Hyprland 0.29.

MalpenZibo commented 9 months ago

Update: one piece of very good news and one piece of bad news.

use gtk4::prelude::ApplicationExtManual;
use gtk4::ApplicationWindow;
use gtk4::prelude::WidgetExt;
use gtk4::prelude::GtkWindowExt;
use gtk4::Application;
use gtk4_layer_shell::LayerShell;
use gtk4::prelude::ApplicationExt;
use gtk4::prelude::ButtonExt;
use gtk4::Button;
use gtk4::Label;

pub fn main() {
    let app = Application::builder().application_id("test").build();

    app.connect_startup(move |app| {
        let window = build_surface(app);

        let window2 = build_surface2(app);
        let label = Label::new(Some("Child panel"));
        window2.set_child(Some(&label));

        let button = Button::with_label("Open child panel");

        button.connect_clicked(move |_| {
            if window2.is_visible() {
                window2.hide();
            } else {
                window2.show();
            }
        });
        window.set_child(Some(&button));

        app.connect_activate(move |_| window.show());
    });

    app.run();
}

fn build_surface(app: &Application) -> ApplicationWindow {
    let window = ApplicationWindow::new(app);
    window.set_default_size(-1, 50);

    window.init_layer_shell();
    window.auto_exclusive_zone_enable();

    window.set_anchor(gtk4_layer_shell::Edge::Left, true);
    window.set_anchor(gtk4_layer_shell::Edge::Top, true);
    window.set_anchor(gtk4_layer_shell::Edge::Right, true);

    window
}

fn build_surface2(app: &Application) -> ApplicationWindow {
    let window = ApplicationWindow::new(app);
    window.set_default_size(250, 250);

    window.init_layer_shell();
    window.auto_exclusive_zone_enable();

    window.set_anchor(gtk4_layer_shell::Edge::Left, true);
    window.set_anchor(gtk4_layer_shell::Edge::Top, true);
    window.set_anchor(gtk4_layer_shell::Edge::Right, true);
    window.set_anchor(gtk4_layer_shell::Edge::Bottom, true);

    window
}

Cargo.toml

gtk4-layer-shell = "0.2.0"
gtk4 = "0.7.3"
gdk4 = "0.7.3"

I changed the example code: I updated the gtk4-layer-shell crate to the latest version, updated gtk4 and gdk4 to their latest versions (removing the features and package references), and compiled gtk4-layer-shell from source.

Now the issue seems resolved but I don't know why (that's the bad news :smile:).

wmww commented 8 months ago

I've now released v1.0.2, which I suspect contains changes that fix this. If there's still a way to reproduce the crash with 1.0.2, feel free to re-open the issue. Please provide sample code, Makefiles/Cargo.toml files where applicable, and the versions of all relevant libraries (GTK, libwayland, and any language-specific packages). I will look into it if I can reproduce.
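For anyone re-testing, it may help to first confirm that the library actually loaded at runtime is at least 1.0.2. The following is only a minimal sketch: `has_suspected_fix` is a hypothetical helper (not part of gtk4-layer-shell) that compares a version triple against 1.0.2.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of gtk4-layer-shell: returns true when the
 * version (major, minor, micro) is at least 1.0.2, the release suspected to
 * contain the fix. */
static bool has_suspected_fix(unsigned major, unsigned minor, unsigned micro) {
  const unsigned fix_major = 1, fix_minor = 0, fix_micro = 2;

  /* Compare the most significant component first. */
  if (major != fix_major) return major > fix_major;
  if (minor != fix_minor) return minor > fix_minor;
  return micro >= fix_micro;
}
```

In a real program the three arguments would presumably come from gtk_layer_get_major_version(), gtk_layer_get_minor_version() and gtk_layer_get_micro_version(), assuming the installed gtk4-layer-shell header provides them. Reporting the values themselves alongside the GTK and libwayland versions is likely more useful than the boolean alone.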

ldelossa commented 6 months ago

I was running into this very issue. @wmww, I can confirm v1.0.2 fixes it. Thanks a lot.