Ximi1970 / systray-x

SysTray-X: A system tray extension for Thunderbird. Needs both the addon AND the companion app installed to work. Will not work with TB flatpaks or snaps.
Mozilla Public License 2.0
214 stars 14 forks source link

Fix hang when getting Window Frame Extensions #94

Closed tatokis closed 2 years ago

tatokis commented 3 years ago

The whole application (including the debug window) would hang randomly with no apparent cause, making the X button in Thunderbird unresponsive.

The following was observed after attaching gdb:

[...]
#2  0x00007ffb577ec90a in xcb_wait_for_event () at /lib/x86_64-linux-gnu/libxcb.so.1
#3  0x00007ffb58ecec08 in _XReadEvents () at /lib/x86_64-linux-gnu/libX11.so.6
#4  0x00007ffb58ebd758 in XNextEvent () at /lib/x86_64-linux-gnu/libX11.so.6
#5  0x000055ad74d47d2b in GetWindowFrameExtensions(void*, unsigned long long, long*, long*, long*, long*) ()
#6  0x000055ad74d3c351 in WindowCtrlUnix::updatePositions() ()
#7  0x000055ad74d38718 in WindowCtrl::slotWindowState(Preferences::WindowState) ()
[...]

It appears that sometimes the call to XGetWindowProperty() in GetWindowFrameExtensions() succeeds, but len == 0, which results in XNextEvent() being called, waiting for an event that is never received. This presumably happens because the WM does not have the frame extensions ready when the request is made.

It is not clear why XNextEvent() is being called in the first place, as we are not handling any events from X in that function.

Instead, a loop was added that tries up to five times to get the window frame extensions, waiting for 50ms between calls.

This shouldn't cause any significant slowdown, as most calls will succeed within one or two iterations, and it will prevent hangs if the calls fail all five times.

Ximi1970 commented 3 years ago

Thank you for the patch. Will keep it parked for now untill I finshed the javascript refactor.
Can you please tell me which setup you are using? I never experienced any hanging when I checked 0.7 on 20 (most "slow" virtual) systems. Things will change of course especially if you are using a rolling release.

tatokis commented 3 years ago

Can you please tell me which setup you are using? I never experienced any hanging when I checked 0.7 on 20 (most "slow" virtual) systems.

I wouldn't call my system slow by any means. The CPU is a Ryzen 9 3950X with an RX580 and plenty of RAM, running Ubuntu 20.04. I think the issue is with how Unity/compiz manages windows, as that's what I'm using for my desktop.

From my testing, once that XNextEvent() was called and blocked the execution, I could never get it to receive any events no matter what I did with the TB window or the tray icon. The only way to "fix" it was to kill the process and restart TB. That's I am confused as to why it is being called there in the first place.

Ximi1970 commented 2 years ago

Can you give me the details mentioned in #4 ? Thanks.

tatokis commented 2 years ago

Ubuntu 20.04 amd64 Unity XDG_SESSION_TYPE=x11 TB 78.13.0 (64-bit) Built manually from the develop branch at commit dedba95d02573120a8fc68ae5621c7d065643f29 System Qt Using Qt version 5.12.8 in /usr/lib/x86_64-linux-gnu

That aside, could you please elaborate on why XNextEvent is called at that location?

Ximi1970 commented 2 years ago

Got the snippet of code from stackoverlow: https://stackoverflow.com/questions/36188154/get-x11-window-caption-height

Ximi1970 commented 2 years ago

Your patch did some weird things under KDE so I tried another solution by using XPending() in the while loop for checking if there is an event. This works nicely in KDE (no strange behavour yet) but I can not replicate your problem with the hangups. My solution is now in the develop branch.
Can you have a look and test if this solves your problem?

Ximi1970 commented 2 years ago

Oke, looks like the Xpending is not the solution. Hangup when the title bar is disabled. But hey, got a decent reproducable problem now. :smile:

tatokis commented 2 years ago

Apologies for the late reply, I completely forgot about this.

Looking at the original code on stack overflow, it works because of

    XSelectInput(d, w, ExposureMask|ButtonPressMask|KeyPressMask|PropertyChangeMask);

This enables PropertyChangeMask which sends notifications/events whenever a property is changed. This way, XNextEvent just blocks until the WM has finished drawing the Extents and adding the property. Once the property is added, the event is sent and XNextEvent returns, the code runs again, and the XGetWindowProperty call succeeds.

If we enable this type of event and just blindly call XNextEvent should be okay, as we aren't expecting any other type of events, so in theory we can't get another kind of event and discard it without processing it, and we don't care about any other kind of property. However I'd advise against it as it can cause confusion and unwanted behaviour in the future if more features are added.

From what I can see, PropertyChangeMask is never set by systray-x on the thunderbird window, thus never expecting any events and blocking forever when XNextEvent is called.

The proper solution would be to have an X event loop of sorts.

What "weird behaviour" did my code cause?

Oke, looks like the Xpending is not the solution. Hangup when the title bar is disabled. But hey, got a decent reproducable problem now. smile

As for XPending, it is probably not too useful on its own. It doesn't solve the problem of being able to peek into the queue to see if there's an event of the type we want, and it also doesn't wait until the extents are available. If I am not mistaken, we still need to blindly dequeue events, and if it's not what we want, we still need to discard it.

The hang when title bars are disabled is a completely different problem. My current solution with the timer should handle it, but this one is not ideal either.

The real issue here is that the api was designed with an event loop in mind. Typically you ask for the events you want to receive, and then just keep dequeueing events with a switch to handle the different types. When you get a property change event, you handle it accordingly. If it's the extent property, then you do what you're supposed to do with it. If you never get a property change event then you simply don't check for the extent property, which would happen if there were no title bars at all.

Even if we were to use XPending, we still need to wait before calling it, as if an event is coming after our XPending call, due to the WM not having finished drawing yet, it'll sit in the queue until next time, and the queue will be behind. Usually XPending is used inside a third party event loop, so that you don't have to have multiple ones running.

Avoiding the event loop is probably a good idea as it will keep the code simple, however, the only way I can currently think of doing it is with, once again, a timer loop, and XPending/XNextEvent. But this would be a more complex version of what my current code in this PR is doing...

Ximi1970 commented 2 years ago

I changed the call. Should not be hanging anymore.