Open tsujan opened 1 year ago
Thanks for reporting. Have not seen that before!! What's the stuff in the top left corner? Could you send config for that - to help me reproduce (in case it's related)
dragging its title-bar and dropping it to the bottom edge
Are you sure? IIRC we only support dropping at top (for Maximize
) and sides (for SnapToEdge
)
Are you sure? IIRC we only support dropping at top (for
Maximize
) and sides (forSnapToEdge
)
DND to bottom is similar to sides, half screen here.
Are you sure? IIRC we only support dropping at top (for
Maximize
) and sides (forSnapToEdge
)DND to bottom is similar to sides, half screen here.
Yes, you're right. I've just read the manual to remind myself. Ignore my comment.
Have not seen that before
Because it happens very randomly and rarely. However, I might be able to make it happen for the bottom edge if I try hard. So, if you make a fixing patch that can be applied or adapted to 0.6.5, I could test it.
What's the stuff in the top left corner?
Nothing. I sent the bottom part of the screenshot. Waybar is at the top edge of the screen, without any window rule. And PCManFM-Qt draws the desktop.
It seems that the non-resized title-bar always fills the left side (even when the DND is made slightly to the the right of the bottom edge), and that its size is the width of the dragged window.
EDIT: The workspace has no role in it.
EDIT1: I encountered it with two apps with very different codes, one in Qt6 and the other in Qt5, namely, FeatherPad and QTerminal. The reason may be that I only tile the windows of those apps.
This may happen when the client doesn't respond fast enough, do you have anything like
client (some_client_app_id) did not respond to configure request in 100 ms
in your logs?
Edit:
You need to start labwc with the -V
(or -d
) flag for the message to show up.
Edit 2:
If you feel like testing you could also manually increase the timeout in src/xdg.c
(on top, CONFIGURE_TIMEOUT_MS
).
Edit 3: Cross posted from #1195
The system log is clean.
~/.local/share/sddm/wayland-session.log
only shows lines like this about LabWC:00:25:27.484 [ERROR] [../labwc-0.6.5/src/desktop.c:422] Unknown node detected
See above, for that message to appear labwc need to at least run with --verbose
(or --debug
for even more log spam).
Yes, this appeared after I added -V
and reproduced the issue with QTerminal:
00:01:59.474 [INFO] [../labwc-0.6.5/src/xdg.c:117] client (qterminal) did not respond to configure request in 100 ms
EDIT: I'm going to test with #define CONFIGURE_TIMEOUT_MS 200
; will report back.
I tested with 200 and 250 for CONFIGURE_TIMEOUT_MS
, but the issue happened as before.
Also, I found out that it isn't related to how fast I snap by DND; a very slow DND can have the same effect.
The irony is that, as the reporter, I don't tile any window to the bottom region when actually working — only the bottom right or left.
Hm.. the existence of that entry in the log itself shouldn't really happen. To figure out if that is just a coincidence or the actual cause of the issue, could you try with CONFIGURE_TIMEOUT_MS 10000
(e.g. 10 seconds) or something ridiculous like that?
Also, I found out that it isn't related to how fast I snap by DND; a very slow DND can have the same effect.
The speed at which the DND happens by the user shouldn't really matter, the timeout refers to the time an application needs to respond to a configure request from labwc.
As I'm getting "skillful" in reproducing it, it was easy to make it happen after setting CONFIGURE_TIMEOUT_MS
to 10000
— I just waited 10 seconds after seeing that nothing happened immediately after DND to the bottom.
@tsujan Which application(s) does it happen with? All Qt5/Qt6, featherpad/qterminal only?
Does it happen if you start labwc without config (labwc -C /dev/null
)
I've been trying with featherpad 1.4.1
but still can't reproduce
Which application(s) does it happen with?
Any Qt5 or Qt6 app I tried: FeatherPad (Qt6), QTerminal (Qt5), PCManFM-Qt (Qt6), SpeedCrunch (Qt5)...
I've been trying with featherpad 1.4.1 but still can't reproduce
You don't have my "skill" :P
Since you're both so responsive and I don't want to waste your time, I just wanted to say that I'll be away from the computer for a while. Will come back later....
You don't have my "skill" :P
After trying maybe 30 times I'm up to your skills. Dragging to the bottom left and re-dragging to center screen again several times triggers it here too. It's hard to trigger. It wasn' easy to make a screenshot using the shortcuts as it turned immediately well when pressing them, had to switch desktop and use sleep
.
After trying maybe 30 times I'm up to your skills.
Good :)
Coming back with four more "discoveries" (the first two may be useless):
@johanmalm, @Consolatis, unless you know that QtWayland can't have a role in this, it can be closed as far as I'm concerned.
I tried with thunderbird too, no success.
Some observations from trying it with calibre on arch:
For me it only happened the first time I dragged it against the bottom edge of the screen. After closing and opening calibre again the bug always happened if it was the first thing I tried. It only happened if I didn't snap it somewhere else or resized it first. When starting calibre with "QT_QPA_PLATFORM=xcb" (xwayland) dragging it against the bottom edge wasn't working at all, against the sides did though. Moving the calibre window so part of it is off screen, moving it back and then dragging it against the screen edge prevented the bug from occuring. I didn't release it after I started moving it.
I wasn't able to reproduce it on gentoo.
Some observations from trying it with calibre on arch
A Qt app, again.
It would be interesting to know if each time this issue occurs there is also the did not respond to configure request
log entry.
And similarly, if it doesn't occur if there is no log entry. I am still not sure if that is actually related or just a coincidence and we should look elsewhere for a cause.
Another thing that could help to debug this is a WAYLAND_DEBUG=1
log from the application at the point when it happens.
That it only seems to happen with Qt applications doesn't really matter that much for assigning a priority to the issue. My usual stance is "we don't like working around client bugs" (and it isn't even verified that its a client bug at all) but for whole GUI frameworks that a lot of applications depend on we should IMHO try to make it still work. As that is what a user sees when using labwc.
It would be interesting to know if each time this issue occurs there is also the did not respond to configure request log entry. And similarly, if it doesn't occur if there is no log entry.
Both confirmed with my tests.
Hm.. I have a slight feeling that us resetting the pending serial to 0
in handle_configure_timeout()
sometimes prevents handle_commit()
to set update_required = true
as the pending serial doesn't match the received one. But I didn't really think this through. Thoughts @jlindgren90 ?
I think that the fact that the timeout occurs most likely indicates a client (probably Qt) issue. The mismatch between title bar and client surface is an (expected) side effect of that. Normally we try to resize the title bar on the same frame as the client surface so it looks "smooth" but if the client is lagging, we will go ahead and resize the title bar anyway (after the timeout).
I don't think resetting the serial in the timeout handler makes a difference - the issue is caused by the client surface not resizing and has already occurred at that point.
Probably it would be necessary to sniff the Wayland connection to determine for sure where the issue lies, i.e. why the resize is not happening in a timely fashion.
P.S. the issue title says "title-bar may not be resized" but it actually sounds (from later descriptions) like it is the opposite - the title bar does resize but the window contents do not.
Although from the scrots it looks like with window has resized (snapped to bottom) but the titlebar has not resized (or have I misunderstood the bug)
but it actually sounds (from later descriptions) like it is the opposite
In all my tests that resulted in the issue, the Qt windows were resized as expected, while their title-bars didn't. Also, see the screenshot(s).
but it actually sounds (from later descriptions) like it is the opposite
In all my tests that resulted in the issue, the Qt windows were resized as expected, while their title-bars didn't. Also, see the screenshot(s).
You’ve just answered my question :)
I really don't understand how this can happen, client bug or not. The client surface is resized (stretches the bottom screen) but the titlebar is not.
Even if we run into the timeout handler, a configure ACK that comes later and acknowledges a modified size (and ends up in handle_commit()
) gets checked if the newly commited surface size differs from the current
one. If yes we set update_required
to true
. That in turn calls view_impl_apply_geometry()
which in turn sets view->current.width
to the new surface size and then calls view_moved()
. That one then unconditionally calls ssd_update_geometry()
which in turn uses view->current.width
to update everything.
So even if we overwrite the pending serial in the timeout handler, we should still have a synced titlebar width and client surface width. I must be missing something somewhere.
I think we need some more debugging output to get a complete picture of what is going on. Working on that, will post a diff shortly.
@tsujan @stefonarch Can you kindly apply this logging patch and capture new logs when the issue occurs? https://gist.github.com/jlindgren90/dc6c37bd7c05635092e1d8f93744713b
What happens after the frame captured in the screenshots? Does the titlebar take on the right size in the next frame? Does the titlebar get stuck on the wrong size? If not, could you describe what happens.
Does the titlebar get stuck on the wrong size?
The title-bars didn't change at all, until I dragged them and un-tiled the windows. Imagine them as stable title-bars with wrong sizes, as long as their corresponding windows remained tiled (correctly).
Also, the bottom edge may have a role in this. Only once I saw this with the top edge, and I'm not sure whether I dragged the window or not. I can't reproduce it with the top edge.
Sorry, I can't do more tests for now, but I could answer questions about my previous tests.
What happens with the border that goes around the window? Does that have the same width as the titlebar or client?
Oh, this should be interesting to you -- just now, I realized something, to which I hadn't paid attention before:
As soon as the window becomes inactive, the title-bar is corrected.
The reason that I hadn't seen it before was that, in my previous tests, the title-bar didn't change when I changed the workspace. However, there was no other window on the other workspace to make the first window become inactive.
What happens with the border that goes around the window?
The borders of wrong sized title-bars fit them, as in the case of correct title-bars.
It almost sounds like the commit event gets stuck in a buffer somewhere and comes through later. What I don't understand is how the surface is visually the right size if the commit didn't come through.
As soon as the window becomes inactive, the title-bar is corrected.
That is exactly why I couldn't take a screenshot in a simple way.
To be as clear as possible, I complete my above sentence:
As soon as the window becomes inactive, the title-bar is corrected and remains corrected afterwards.
That is exactly why I couldn't take a screenshot in a simple way.
My method of taking simple screenshots doesn't deactivate windows. It's just a shortcut that creates the screenshots inside a folder.
With the logging patch from @jlindgren90 opened calibre from terminal, dragged it to the bottom edge, released, bug happened dragged it back to middle of screen, released, dragged it to the bottom edge, released, bug didn't happen labwc_bug.log
WAYLAND_DEBUG=1 from calibre dragged calibre to bottom of screen, released, bug happened, quit labwc from root menu calibre.log
Not sure if any of that is helpful.
It almost sounds like the commit event gets stuck in a buffer somewhere and comes through later. What I don't understand is how the surface is visually the right size if the commit didn't come through.
Yeah, basically this. And I asked myself the same.
As soon as the window becomes inactive, the title-bar is corrected and remains corrected afterwards.
Hm.. interesting. I thought that would just hit another code path that calls ssd_update_geometry()
but it actually doesn't seem to. Its just calling view_set_activated()
which in turn calls xdg_toplevel_view_set_activated()
(so it sends something to the client). I wonder if that somehow makes the commit event "unstuck".
Edit:
I think I understand what's going on here (or at least have a plausible theory). It looks like a Qt bug. And thanks @Flrian, the logs are extremely helpful (especially having both).
xdg-shell has a set_window_geometry
call that clients may use to set their logical "window geometry" to a sub-region of their xdg_surface. This is (I believe) intended for CSD windows to let the compositor know what portion of the surface is the "window" and what portion is decorations, drop shadows, etc.
It appears (from the calibre log) that Qt is sending a set_window_geometry
with its initial configure response, and at that time its requested window geometry matches the size of the surface (680x498).
[1096428.066] xdg_toplevel@37.configure(0, 0, array[4])
[1096428.070] xdg_surface@33.configure(125)
...
[1096428.273] -> wp_viewport@39.set_destination(680, 498)
[1096428.277] -> xdg_surface@33.set_window_geometry(0, 0, 680, 498)
[1096428.282] -> xdg_surface@33.ack_configure(125)
[1096428.647] -> wl_surface@36.attach(wl_buffer@40, 0, 0)
[1096428.656] -> wl_surface@36.damage_buffer(0, 0, 680, 498)
[1096428.661] -> wl_surface@36.commit()
When labwc later send a configure with a larger size (the tiled size, 1898x498) Qt resizes the actual surface but doesn't send a new set_window_geometry
call. So wlroots continues to use the old geometry (680x498) and that's what wlr_xdg_surface_get_geometry()
returns (this can be confirmed from the labwc log).
The actual surface is now larger (which you can see visually) but labwc continues to use the "window geometry" (which is now incorrect) to size the title bar and window borders.
[1100013.074] xdg_toplevel@37.configure(1898, 498, array[4])
[1100013.098] xdg_surface@33.configure(129)
[1100013.126] -> wp_viewport@39.set_destination(1898, 498)
[1100013.136] -> wl_compositor@6.create_region(new id wl_region@34)
[1100013.144] -> wl_region@34.add(0, 0, 1898, 498)
[1100013.152] -> wl_surface@36.set_opaque_region(wl_region@34)
[1100013.158] -> wl_region@34.destroy()
[1100013.165] -> xdg_surface@33.ack_configure(129)
...
[1100121.572] -> wl_surface@36.attach(wl_buffer@34, 0, 0)
[1100121.577] -> wl_surface@36.damage_buffer(0, 0, 1898, 498)
[1100121.581] -> wl_surface@36.commit()
I couldn't find any existing Qt bug report for this. If anyone has the time/motivation to create a simple Qt test case that can reproduce the bug, then capture logs and open a Qt bug report, that's probably the best way to get this really solved.
There are workarounds we could look into on the labwc side (e.g. detect when the surface size matches the configure request but the window geometry doesn't), not sure if we want to go that route or not.
Nice analysis!
When labwc later send a configure with a larger size (the tiled size, 1898x498) Qt resizes the actual surface but doesn't send a new
set_window_geometry
call. So wlroots continues to use the old geometry (680x498) and that's whatwlr_xdg_surface_get_geometry()
returns (this can be confirmed from the labwc log).
Hm. So either
wlr_xdg_surface_get_geometry()
is out of date (should have updated when wlroots received the new buffer with its different size or the new viewporter information)The scene graph implementation for xdg windows seems to handle this case, maybe we could look into that to get some inspiration.
Edit:
There is also the question of why we run into the timeout. I am not sure if xdg_surface@xx.ack_configure()
is causing a handle_commit()
call or if we have to wait for the wl_surface@xx.commit()
one instead.
Edit 2: From the xdg-shell protocol:
Once the window geometry of the surface is set, it is not possible to unset it, and it will remain the same until set_window_geometry is called again, even if a new subsurface or buffer is attached.
I am not sure if we are using the right way to get the current (xdg / wl _)surface size (although it seems like the only way when we also want to ignore CSD drop shadows and other CSD margins around the window).
And there is also the question of why sometimes it works correctly and sometimes it doesn't. A WAYLAND_DEBUG=1 log of a correctly sized titlebar after snapping might help here. And a second one for switching the focus somewhere else fixing it.
I think our code is correct (using the window geometry is the right thing to do, in principle). I had something like this in mind for a workaround: https://github.com/jlindgren90/labwc/commit/7a69cd2ed8a7a5b5787c7061193672e5658d1a47
In @Flrian's log, the commit did happen but was just a hair too late (116ms):
00:00:05.482 [ERROR] [../src/xdg.c:301] view 0x561ca4b66730 (calibre — || Calibre Library ||) current=680x498 pending=680x498 configure=1898x498
00:00:05.482 [ERROR] [../src/xdg.c:308] configure serial=26
00:00:05.482 [ERROR] [../src/xdg.c:149] set pending serial=26
00:00:05.482 [DEBUG] [types/output/cursor.c:432] Falling back to software cursor on output 'HDMI-A-1'
00:00:05.582 [INFO] [../src/xdg.c:127] client (calibre-gui) did not respond to configure request in 100 ms
00:00:05.582 [ERROR] [../src/xdg.c:129] view 0x561ca4b66730 (calibre — || Calibre Library ||) current=680x498 pending=1898x498 at timeout
00:00:05.582 [ERROR] [../src/xdg.c:133] pending serial=26
00:00:05.598 [ERROR] [../src/xdg.c:92] view 0x561ca4b66730 (calibre — || Calibre Library ||) current=680x498 pending=1898x498 commit=680x498
00:00:05.598 [ERROR] [../src/xdg.c:97] pending serial=0 commit=26
Your commit could be a solution for now, it looks sensible.
The remaining questions for me are:
xdg_toplevel_view_set_activated()
as result)I would love answers to those questions but suspect it would require some deep-dives into Qt source code. :(
I think we may get away with just WAYLAND_DEBUG=1 logs of
If we see that Qt then sends a new (correct) set_window_geometry()
then it is for sure a Qt bug and the reason why the titlebar is not resized is clear to me. If the logs do not show that, there is still something else going on that can't be explained by the stale window geometry alone.
I don't want to complicate your analysis by saying this (before I go to sleep), but I can't reproduce the bug when I use Kvantum's window dragging, instead of dragging the window from its title-bar. Kvantum calls QWindow::startSystemMove() by using an event filter installed on QWindow
, inside mouseMoveEvent
and after appropriate conditions are satisfied in mousePressEvent
.
I think it has something to do with the vertical position of the title-bar when the drop happens.
With Kvantum's window dragging, the cursor position is inside the window, and so, below the title-bar. Therefore, the title-bar is fully above the bottom edge of the screen and can even have a considerable distance from it on dropping.
With dragging from title-bar, the title-bar is partly below the bottom edge of the screen on dropping.
@tsujan @stefonarch @Flrian does the workaround help?
It happened here very randomly and rarely when I snapped a window to the bottom of the screen by dragging its title-bar and dropping it to the bottom edge. Fortunately, I took a screenshot:
I encountered it only once with the top edge and haven't seen it without drag-and-drop. However, I can't be sure that it's really related to drag-and-drop.
Sorry if it's already known or fixed in git; I use 0.6.5 and couldn't find it after a simple search.