Closed OJFord closed 1 year ago
Hm, that's odd. What Wayland compositor are you using? Could you try reproducing it on another compositor? Could you try reproducing it by spamming wl-copy
and wl-paste
perhaps? If you are able to make a consistent reproducer, then WAYLAND_DEBUG=1
logs for the wl-clipboard processes would be helpful.
I'm using sway; I've never used anything else but I can give it a go.. imagine I can just install and launch one from within sway (since I realised accidentally recently that sway will happily start a new sway instance within itself (🤯))?
I'll see if spamming reproduces.
I'll also just add that to the env for systemd service, since it 'reliably' reproduces a few times daily, thanks.
imagine I can just install and launch one from within sway
Yeah, you can do that with weston
and also with mutter --nested
. Trying other wlroots-based compositors is probably gonna yield the same result, so I'd test weston and mutter.
Ok here's the WAYLAND_DEBUG=1
log: failed.log
15:36 is where I noticed it not working; so I assume whatever happened at 15:29 worked as expected.
Status looks as in OP, just different PIDs obviously. wl-clipboard 2.1.0
.
It seems this might be triggered by certain paste 'targets' (I don't how you'd term it - the focussed area when I paste, via XF86Paste
). ... Which seems odd to me, and perhaps isn't the case, since that seems unrelated to watching for copies and pasting them by a different mechanism in the background.
I just had some very uncanny timing where I pasted something into a chat support text field (which itself necessitated restarting the service) and then went to select ('primary selection') something else to paste (from clipboard) and it had already stopped working, even though I'd only just restarted it.
Did you have a chance to test other compositors? Might be an issue in wlroots' Xwayland integration.
I've also been experiencing this issue a few times each day for a while, but I haven't noticed a pattern of when it fails. Did anyone find a reliable way of reproducing this? Happy to look into other compositors if so.
Sorry, I haven't yet tried another compositor, I keep meaning to, I will get to it, but any help appreciated.
It does seem to be very reproducible with an app of my own that uses Tauri - I paste, and I need to restart, can't paste again, I get one paste per restart within the app.
It also seems to affect Firefox URL bar but only occasionally in-page elements.
Hmm interesting, good to know it's a bit reproducible this way, but it would be nice to get to a point where we have a minimal example script that reproduces the behaviour.
Agreed of course, I just mentioned it in case it made someone think 'ah sounds like they're both foobars' - difference between Firefox URL bar and page is interesting in particular I think.
Also because, re:
Might be an issue in wlroots' Xwayland integration.
Firefox isn't using Xwayland.
Just had a look at reproducing with weston or mutter @YaLTeR - both error with --watch
that it requires a compositor that supports wlroots data control protocol. Additionally weston errors with --primary
that it does not seem to support primary selection.
Anything else I can try?
both error with --watch that it requires a compositor that supports wlroots data control protocol
Oh, right, forgot about that. Uhh, not sure then actually, do Smithay-based compositors support wlr-data-control?
As far as I can tell (grepping the repo), no, but I haven't tested it because the only public Smithay-based compositor I can find isn't anywhere near complete enough to make that straightforward.
Anything else I can try here? I'm not sure why it's labelled 'question', seems pretty clearly a bug somewhere (even if not in wl-clipboard) or buggy interaction between things to me?
Anything else I can try here?
Not sure, sorry. I kinda hope @bugaevc will have a better idea here once he's back (which should be relatively soon hopefully).
I'm not sure why it's labelled 'question', seems pretty clearly a bug somewhere
Well, a bug somewhere, and we're not sure where, hence the question.
I can consistently reproduce this in sway 1.8 with qutebrowser 2.5.2 by taking a long time to finish my selection with the mouse. That is, push the mouse button down, move back and forth for a second or so, then release the mouse button. With export WAYLAND_DEBUG=1; wl-paste -pw wl-copy
and every application aside from qutebrowser, I don't get any output while moving the mouse around, followed by a bunch when I release it. With qutebrowser, there's constant output while I'm moving the mouse. The only other Qt/python application I have is calibre, and it does not reproduce this behavior.
Logs attached. failed.log success.log
I kinda hope @bugaevc will have a better idea here once he's back
Hi all! and sorry for taking this long to look at this.
I can consistently reproduce this in sway 1.8 with qutebrowser 2.5.2 by taking a long time to finish my selection with the mouse. That is, push the mouse button down, move back and forth for a second or so, then release the mouse button
Thank you, I was able to reproduce that!
├─ 56846 /usr/bin/wl-paste --primary --watch /usr/bin/wl-copy ├─ 70710 /usr/bin/wl-copy ├─ 70713 /usr/bin/wl-copy └─ 70716 catWhich seems like an 'extra'wl-copy
andcat
.
So the fact that you're seeing two instances of wl-copy
is "normal", as in it itself is not a bug. The reason for this is that while wl-copy
invocation seems to complete/exit instantly, in reality wl-copy
forks off into the background and "serves" the copy requests from other clients; it automatically quits once it sees a different data offer being set as the selection (so, something else having been copied). You can see this for yourself if you run it with --foreground
, in which case it will do the same things but without actually forking into background. (Don't try to use --foreground
when running wl-copy
from wl-paste --watch
though; that will break/deadlock.)
And so, if you run wl-paste -pw wl-copy
, what should normally happen is:
wl-copy
in the background, serving copy requests;wl-paste -pw
spawns a new instance of wl-copy
;wl-copy
sees this and quits;In your case, we're stuck at step 3: there is the old wl-copy
and the new wl-copy
(compare the PIDs to figure out which one is which), but we're not proceeding to step 4 for some reason. The cat
child you're seeing is the cat
that wl-copy
spawns during startup to copy its stdin into a temp file (see dump_stdin_into_a_temp_file()
). This cat
evidently does not complete, and so the new wl-copy
does not proceed further (to step 4).
Why does cat
not complete? Likely, because the other end of the pipe never got closed. I have not investigated why, but I can speculate that is because we're hitting some unusual situation inside either the compositor (Sway/wlroots) or the client (qutebrowser).
Interestingly, wl-paste
itself is in a weird state: namely, in a recursive/reentrant invocation of selection_callback()
because of wl_display_roundtrip()
. There was a related fix recently, https://github.com/bugaevc/wl-clipboard/pull/133 (cc'ing @mahkoh), but that (intentionally) did not affect the watch mode. This, first of all, explains why this bug only shows up sometimes: there's a race condition that we have to hit for the reentrancy to trigger. And it sounds plausible that us not destroying the previous offer (in the outer invocation of selection_callback()
) while asking for the new offer's data (in the nested invocation of selection_callback()
) messes with the expectations of qutebrowser, causing it to not send the data for the new offer properly.
The following tentative patch, which prevents dispatching events in the middle of selection_callback()
, seems to solve the issue for me:
diff --git a/src/wl-paste.c b/src/wl-paste.c
index e384216..2c702cc 100644
--- a/src/wl-paste.c
+++ b/src/wl-paste.c
@@ -238,7 +238,8 @@ static void selection_callback(struct offer *offer, int primary) {
popup_surface_destroy(popup_surface);
popup_surface = NULL;
}
- wl_display_roundtrip(wl_display);
+ wl_display_flush(wl_display);
/* Spawn a cat to perform the copy.
* If watch mode is active, we spawn
@OJFord @jsbronder could you please test whether that works for you?
That being said, I'm not sure this patch is good:
EAGAIN
, we should do something akin to wl_display_poll()
in libwayland-client — but then would that really work if we've written so much stuff that the socket buffer is full, yet we're not reading any incoming messages?Thank you for the detailed reply @bugaevc, I'll certainly give that a go.
Thanks also for the reproduction @jsbronder! I appreciate it's not a lot to go on without a reasonable way of reproducing, I just wasn't able to find a contained way of doing so, so I'm very glad you did.
@bugaevc I reproduced the bug again having built from master at time of writing, d83a629822d83b73ffa0f2a6e1c82b1caab30b8d, and then applied your patch above and rebuilt. I don't want to speak to soon, but that seems to fix it for me too, thanks!
The following tentative patch, which prevents dispatching events in the middle of
selection_callback()
, seems to solve the issue for me:diff --git a/src/wl-paste.c b/src/wl-paste.c index e384216..2c702cc 100644 --- a/src/wl-paste.c +++ b/src/wl-paste.c @@ -238,7 +238,8 @@ static void selection_callback(struct offer *offer, int primary) { popup_surface_destroy(popup_surface); popup_surface = NULL; } - wl_display_roundtrip(wl_display); + wl_display_flush(wl_display); /* Spawn a cat to perform the copy. * If watch mode is active, we spawn
@OJFord @jsbronder could you please test whether that works for you?
Sorry for the delayed response, I was on holiday. I ran with the above on d83a629822d83b73ffa0f2a6e1c82b1caab30b8d all day today, occasionally comparing both clipboards and they do appear to have stayed synced.
@bugaevc your patch (applied to the latest Debian package 2.1.0-0.1
) also solves this issue for me (sway 1.8.1
on Debian Unstable
)
Thanks everyone for testing! :slightly_smiling_face: Since there were no regressions reported, I went ahead and committed the patch.
I run a systemd user service to keep my primary selection and clipboard in sync; after a while it stops working, manifesting to me as 'I can't paste anything [my selections]'.
When it stops working the status looks like:
Which seems like an 'extra'
wl-copy
andcat
. Restarting the service resolves the issue, and I just now found that killing thecat
process works too.Let me know if there's anything I can try (when it next occurs) to be more helpful, I realise that's not a lot to go on.