scorpion-26 / gBar

Blazingly fast status bar written with GTK
MIT License

[BUG/FEATURE] Better multi-monitor support is needed #93

Open fdev31 opened 3 months ago

fdev31 commented 3 months ago

Description

gBar isn't 100% stable when a monitor is added or removed. It seems to implement some interesting behavior when the set of available monitors changes, but it doesn't look very practical.

Ideally, two things are needed:

  1. find the bug(s) that sometimes occur when a monitor is plugged or unplugged. They are not easy to reproduce, but these are almost the only moments when gBar crashes on my system
  2. instead of picking a "random" monitor when one is added or removed, let the user pass a list of monitors (from "most likely" to "least likely"); gBar then uses the highest-priority monitor that is available

Reproduction

Unplug or Plug a monitor repeatedly

Expected behavior

gBar never crashes and picks the "best" available monitor for the current setup

System information

Commit 6dd1ee6783a5dc195172c078bcd53ca258224854

Archlinux with Hyprland

Note

I implemented a workaround here https://github.com/hyprland-community/pyprland/wiki/gbar but I don't think the code will be useful at all. It would be extra nice to be able to specify either the name or a (partial) description of the monitors, e.g.: --monitors "SuperScreeXX,HDMI-A-1,DP-1,WELL-314X"
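To make the request concrete, here is a minimal sketch of the selection rule described above: walk the preference list in order and take the first entry that matches an available monitor, either by exact connector name or by a partial description. The `Monitor` type, the `pick_monitor` function, and the sample description string are hypothetical illustrations; gBar has no such option at the time of this issue, and the fallback to the first available monitor is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Monitor:
    name: str         # connector name, e.g. "DP-1"
    description: str  # human-readable description reported by the compositor


def pick_monitor(preferences: Sequence[str],
                 available: Sequence[Monitor]) -> Optional[Monitor]:
    """Return the available monitor matching the highest-priority preference.

    A preference matches on an exact (case-insensitive) connector name or
    when it is a substring of the monitor description.
    """
    for wanted in preferences:
        needle = wanted.strip().lower()
        if not needle:
            continue
        for mon in available:
            if mon.name.lower() == needle or needle in mon.description.lower():
                return mon
    # Assumption: with no match, fall back to the first monitor (or None).
    return available[0] if available else None
```

With the list from the example, `pick_monitor("SuperScreeXX,HDMI-A-1,DP-1,WELL-314X".split(","), monitors)` would skip the unknown "SuperScreeXX" entry and return HDMI-A-1 if it is plugged in.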

scorpion-26 commented 3 months ago

Yeah, the monitor reloading is not the most stable thing in the world and probably will never be as it is a giant hack. (Mostly because Gdk is not exposing any uniquely identifying information about a monitor)

  1. find the bug(s) that sometimes occur when a monitor is plugged or unplugged. They are not easy to reproduce, but these are almost the only moments when gBar crashes on my system

Can you compile gBar with debug on (meson setup build --buildtype=debug) and send me the gBar log + crash backtrace? (It should be somewhere in dmesg; alternatively, run gBar through gdb.) That would be super helpful, as I'm unable to reproduce this.

  2. instead of picking a "random" monitor when one is added or removed, let the user pass a list of monitors (from "most likely" to "least likely"); gBar then uses the highest-priority monitor that is available

Should be doable

fdev31 commented 3 months ago

Of course now it looks like I can't reproduce the crash anymore... I have many core dumps but they are not usable, I'll let you know if I can collect some in the future using the debug build.

fdev31 commented 3 months ago

I got a crash, not sure it relates to monitor operations. The file on disk is small but if I --export it, it's huge, not sure which one I need to share in such case:

coredumpctl dump gBar --output /tmp/export
           PID: 142832 (gBar)
           UID: 1000 (fab)
           GID: 1000 (fab)
        Signal: 11 (SEGV)
     Timestamp: Sat 2024-05-04 14:12:01 CEST (23min ago)
  Command Line: gBar bar 0
    Executable: /usr/bin/gBar
 Control Group: /user.slice/user-1000.slice/session-1.scope
          Unit: session-1.scope
         Slice: user-1000.slice
       Session: 1
     Owner UID: 1000 (fab)
       Boot ID: 0bd97a8c737d474d9606d25d8c4a6864
    Machine ID: d9d42dc42d35439b91c4716a248fed81
      Hostname: gamix
       Storage: /var/lib/systemd/coredump/core.gBar.1000.0bd97a8c737d474d9606d25d8c4a6864.142832.1714824721000000.zst (present)
  Size on Disk: 1.8M
       Message: Process 142832 (gBar) of user 1000 dumped core.

                Stack trace of thread 142832:
                #0  0x0000791b8ccbb025 __libc_free (libc.so.6 + 0x9f025)
                #1  0x00005cce2c78b670 n/a (gBar + 0x1b670)
                #2  0x00005cce2c7ef79d n/a (gBar + 0x7f79d)
                #3  0x0000791b8d75c204 n/a (libgio-2.0.so.0 + 0xa1204)
                #4  0x0000791b8d76004d n/a (libgio-2.0.so.0 + 0xa504d)
                #5  0x0000791b8d7b7323 n/a (libgio-2.0.so.0 + 0xfc323)
                #6  0x0000791b8d75c204 n/a (libgio-2.0.so.0 + 0xa1204)
                #7  0x0000791b8d75c23d n/a (libgio-2.0.so.0 + 0xa123d)
                #8  0x0000791b8d5c8199 n/a (libglib-2.0.so.0 + 0x5a199)
                #9  0x0000791b8d6273bf n/a (libglib-2.0.so.0 + 0xb93bf)
                #10 0x0000791b8d5c7712 g_main_context_iteration (libglib-2.0.so.0 + 0x59712)
                #11 0x0000791b8dbed35b gtk_main_iteration (libgtk-3.so.0 + 0x1ed35b)
                #12 0x00005cce2c7891c6 n/a (gBar + 0x191c6)
                #13 0x00005cce2c78488a main (gBar + 0x1488a)
                #14 0x0000791b8cc41d4a n/a (libc.so.6 + 0x25d4a)
                #15 0x0000791b8cc41e0c __libc_start_main (libc.so.6 + 0x25e0c)
                #16 0x00005cce2c785275 n/a (gBar + 0x15275)

                Stack trace of thread 142841:
                #0  0x0000791b8cd2948d syscall (libc.so.6 + 0x10d48d)
                #1  0x0000791b8d622487 g_cond_wait (libglib-2.0.so.0 + 0xb4487)
                #2  0x0000791b8d592454 n/a (libglib-2.0.so.0 + 0x24454)
                #3  0x0000791b8d5f729e n/a (libglib-2.0.so.0 + 0x8929e)
                #4  0x0000791b8d5f6065 n/a (libglib-2.0.so.0 + 0x88065)
                #5  0x0000791b8ccaa1cf n/a (libc.so.6 + 0x8e1cf)
                #6  0x0000791b8cd2b6ec n/a (libc.so.6 + 0x10f6ec)

                Stack trace of thread 142843:
                #0  0x0000791b8cd1d9ed __poll (libc.so.6 + 0x1019ed)
                #1  0x0000791b8d627306 n/a (libglib-2.0.so.0 + 0xb9306)
                #2  0x0000791b8d5c8dc7 g_main_loop_run (libglib-2.0.so.0 + 0x5adc7)
                #3  0x0000791b8d7c483c n/a (libgio-2.0.so.0 + 0x10983c)
                #4  0x0000791b8d5f6065 n/a (libglib-2.0.so.0 + 0x88065)
                #5  0x0000791b8ccaa1cf n/a (libc.so.6 + 0x8e1cf)
                #6  0x0000791b8cd2b6ec n/a (libc.so.6 + 0x10f6ec)

                Stack trace of thread 142842:
                #0  0x0000791b8cd1d9ed __poll (libc.so.6 + 0x1019ed)
                #1  0x0000791b8d627306 n/a (libglib-2.0.so.0 + 0xb9306)
                #2  0x0000791b8d5c7712 g_main_context_iteration (libglib-2.0.so.0 + 0x59712)
                #3  0x0000791b8d5c7762 n/a (libglib-2.0.so.0 + 0x59762)
                #4  0x0000791b8d5f6065 n/a (libglib-2.0.so.0 + 0x88065)
                #5  0x0000791b8ccaa1cf n/a (libc.so.6 + 0x8e1cf)
                #6  0x0000791b8cd2b6ec n/a (libc.so.6 + 0x10f6ec)

                Stack trace of thread 142846:
                #0  0x0000791b8cd1d9ed __poll (libc.so.6 + 0x1019ed)
                #1  0x0000791b8d627306 n/a (libglib-2.0.so.0 + 0xb9306)
                #2  0x0000791b8d5c7712 g_main_context_iteration (libglib-2.0.so.0 + 0x59712)
                #3  0x0000791b857affde n/a (libdconfsettings.so + 0x5fde)
                #4  0x0000791b8d5f6065 n/a (libglib-2.0.so.0 + 0x88065)
                #5  0x0000791b8ccaa1cf n/a (libc.so.6 + 0x8e1cf)
                #6  0x0000791b8cd2b6ec n/a (libc.so.6 + 0x10f6ec)

                Stack trace of thread 142850:
                #0  0x0000791b8cd2948d syscall (libc.so.6 + 0x10d48d)
                #1  0x0000791b8d622487 g_cond_wait (libglib-2.0.so.0 + 0xb4487)
                #2  0x0000791b8d592454 n/a (libglib-2.0.so.0 + 0x24454)
                #3  0x0000791b8d5924bc g_async_queue_pop (libglib-2.0.so.0 + 0x244bc)
                #4  0x0000791b8d09fc48 n/a (libpangoft2-1.0.so.0 + 0x9c48)
                #5  0x0000791b8d5f6065 n/a (libglib-2.0.so.0 + 0x88065)
                #6  0x0000791b8ccaa1cf n/a (libc.so.6 + 0x8e1cf)
                #7  0x0000791b8cd2b6ec n/a (libc.so.6 + 0x10f6ec)
                ELF object binary architecture: AMD x86-64

Also, it has much less debug info than I expected; I'll check my build again.

EDIT: found the issue, there was a "strip" phase during the package build
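Even when frames show up as `n/a`, the trace still carries module-relative offsets (for example `gBar + 0x1b670`), and those can be resolved against an unstripped binary from the identical build. A minimal sketch for pulling the offsets out of a saved trace follows; `module_offsets` is a hypothetical helper, not part of gBar or systemd, and it assumes the frame layout shown in the dump above.

```python
import re
from typing import List

# Matches frames such as "#1  0x00005cce2c78b670 n/a (gBar + 0x1b670)".
FRAME = re.compile(r"#\d+\s+0x[0-9a-f]+\s+\S+\s+\((\S+) \+ (0x[0-9a-f]+)\)")


def module_offsets(trace: str, module: str) -> List[str]:
    """Collect the module-relative offsets of every frame inside `module`."""
    return [m.group(2) for m in FRAME.finditer(trace) if m.group(1) == module]
```

The resulting offsets could then be passed to something like `addr2line -f -C -e <unstripped gBar binary> 0x1b670 0x7f79d`. This only gives meaningful results if that binary comes from exactly the same build as the one that crashed.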

fdev31 commented 3 months ago

I got this. It happened shortly after I turned a monitor on, this time with the needed debug info:

image

What else must I send you and how?

scorpion-26 commented 3 months ago

Looks like a memory corruption error, which is bad. Can you run it with valgrind again? Unfortunately there isn't any way to get the underlying issue just from (crash) logs.

fdev31 commented 3 months ago

I will when I can, unless there is a way to run it non-interactively and still collect what is needed. I haven't used it much and am not sure it can just run from a script without a tty.
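For what it's worth, Valgrind does not need an interactive terminal, and its `--log-file` option writes the report to a file instead of the console. A minimal sketch of a wrapper that could be called from a session autostart script follows. The helper names and the log directory are assumptions, and `gBar bar 0` is simply the command line seen in the core dump above.

```python
import subprocess
from pathlib import Path
from typing import List


def valgrind_command(log_dir: Path, gbar_args: List[str]) -> List[str]:
    """Build the argv for running gBar under Valgrind with a per-PID log file."""
    return [
        "valgrind",
        # Valgrind expands %p to the PID of the traced process.
        f"--log-file={log_dir / 'gbar-valgrind.%p.log'}",
        "gBar",
        *gbar_args,
    ]


def run_under_valgrind(log_dir: Path, gbar_args: List[str]) -> int:
    """Run gBar under Valgrind without a tty and return its exit status."""
    log_dir.mkdir(parents=True, exist_ok=True)
    proc = subprocess.run(
        valgrind_command(log_dir, gbar_args),
        stdin=subprocess.DEVNULL,  # no interactive input required
    )
    return proc.returncode
```

Calling `run_under_valgrind(Path("/tmp/gbar-valgrind"), ["bar", "0"])` in place of the usual gBar launch would leave one log file per run that can be inspected later.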

fdev31 commented 3 months ago

I will when I can, unless there is a way to run it non-interactively and still collect what is needed. I haven't used it much and am not sure it can just run from a script without a tty.

It's never crashing when I run in valgrind...

scorpion-26 commented 3 months ago

It's never crashing when I run in valgrind...

That's not unexpected. Is it at least logging some memory access errors?

fdev31 commented 3 months ago

It's never crashing when I run in valgrind...

That's not unexpected. Is it at least logging some memory access errors?

I didn't check, just hitting ^C and checking the output?

fdev31 commented 3 months ago

It's so fast to start, with an auto restart I can ignore it... :D

I noticed sometimes the crash is helpful: when I start gBar with the session, very often the bluetooth module isn't loaded... but on restart I can see it.

This is probably a new bug/improvement, do you mind if I open a new ticket about that? I didn't check the code at all but I guess it's dbus-based and there could be some signal to wait / some retry system that may fix this.

Anyway, gBar is my bar everywhere now, thank you for this!

scorpion-26 commented 3 months ago

In case there are any memory errors, it should just print something like this (Valgrind writes its report to stderr by default):

==<pid>== Invalid (write|read) of size x
... 

e.g.:

==400175== Invalid write of size 4
==400175==    at 0x1182FC: main (gBar.cpp:123)
==400175==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
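If the Valgrind output was captured to a file, checking it for these lines can be automated. A minimal sketch, assuming the message format shown in the example above; `find_invalid_accesses` is a hypothetical helper name and it only looks for invalid reads and writes, not other Valgrind error kinds:

```python
import re
from typing import List, Tuple

# Matches report lines such as "==400175== Invalid write of size 4".
INVALID_ACCESS = re.compile(r"^==(\d+)==\s+Invalid (read|write) of size (\d+)")


def find_invalid_accesses(log_text: str) -> List[Tuple[int, str, int]]:
    """Return (pid, kind, size) for every invalid read/write in the log."""
    hits = []
    for line in log_text.splitlines():
        match = INVALID_ACCESS.match(line)
        if match:
            hits.append((int(match.group(1)), match.group(2), int(match.group(3))))
    return hits
```

An empty result means no invalid reads or writes were logged during that run. It does not prove the bug is absent, since the faulty code path may not have been triggered.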

This is probably a new bug/improvement, do you mind if I open a new ticket about that? I didn't check the code at all but I guess it's dbus-based and there could be some signal to wait / some retry system that may fix this.

Yeah, bluetooth is queried via dbus, feel free to open an issue (though tbh, that sounds like an issue that is very hard to reproduce). But why don't you start gBar with the compositor? Then the dbus service should be initialized correctly.

fdev31 commented 3 months ago

OK, I'll look at it. gBar is already started by my compositor, but on the fastest machine I have, bluetooth is "often" (I didn't collect stats) missing initially, while this doesn't happen on a more "average" computer I'm also using a lot.
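The symptom (missing on a fast machine, present after a restart) is consistent with a startup race where the bar queries the bluetooth service before it is ready. One generic way to handle that is a bounded retry. The sketch below is illustrative only: `retry_until_available` and its `probe` callback are hypothetical, it does not reflect how gBar queries dbus, and the attempt count and delay are arbitrary assumptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_until_available(probe: Callable[[], Optional[T]],
                          attempts: int = 5,
                          delay_s: float = 0.5,
                          sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Call `probe` until it returns a non-None value or attempts run out.

    `probe` stands in for "ask dbus whether the bluetooth adapter exists".
    """
    for attempt in range(attempts):
        result = probe()
        if result is not None:
            return result
        if attempt < attempts - 1:
            sleep(delay_s)  # wait before the next attempt, not after the last
    return None
```

Waiting for a dbus signal that announces the service would avoid polling altogether, as suggested earlier in the thread; the retry loop is just the simpler of the two ideas to sketch.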