bugaevc / wl-clipboard

Command-line copy/paste utilities for Wayland
GNU General Public License v3.0
1.62k stars 59 forks source link

--watch gives up after a while #137

Closed OJFord closed 1 year ago

OJFord commented 2 years ago

I run a systemd user service to keep my primary selection and clipboard in sync; after a while it stops working, manifesting to me as 'I can't paste anything [my selections]'.

When it stops working the status looks like:

     CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/primary-clipboard-sync.service
             ├─ 56846 /usr/bin/wl-paste --primary --watch /usr/bin/wl-copy
             ├─ 70710 /usr/bin/wl-copy
             ├─ 70713 /usr/bin/wl-copy
             └─ 70716 cat

Which seems like an 'extra' wl-copy and cat. Restarting the service resolves the issue, and I just now found that killing the cat process works too.

Let me know if there's anything I can try (when it next occurs) to be more helpful, I realise that's not a lot to go on.

YaLTeR commented 2 years ago

Hm, that's odd. What Wayland compositor are you using? Could you try reproducing it on another compositor? Could you try reproducing it by spamming wl-copy and wl-paste perhaps? If you are able to make a consistent reproducer, then WAYLAND_DEBUG=1 logs for the wl-clipboard processes would be helpful.

OJFord commented 2 years ago

I'm using sway; I've never used anything else but I can give it a go.. imagine I can just install and launch one from within sway (since I realised accidentally recently that sway will happily start a new sway instance within itself (🤯))?

I'll see if spamming reproduces.

I'll also just add that to the env for systemd service, since it 'reliably' reproduces a few times daily, thanks.

YaLTeR commented 2 years ago

imagine I can just install and launch one from within sway

Yeah, you can do that with weston and also with mutter --nested. Trying other wlroots-based compositors is probably gonna yield the same result, so I'd test weston and mutter.

OJFord commented 2 years ago

Ok here's the WAYLAND_DEBUG=1 log: failed.log

15:36 is where I noticed it not working; so I assume whatever happened at 15:29 worked as expected.

Status looks as in OP, just different PIDs obviously. wl-clipboard 2.1.0.

OJFord commented 2 years ago

It seems this might be triggered by certain paste 'targets' (I don't how you'd term it - the focussed area when I paste, via XF86Paste). ... Which seems odd to me, and perhaps isn't the case, since that seems unrelated to watching for copies and pasting them by a different mechanism in the background.

I just had some very uncanny timing where I pasted something into a chat support text field (which itself necessitated restarting the service) and then went to select ('primary selection') something else to paste (from clipboard) and it had already stopped working, even though I'd only just restarted it.

YaLTeR commented 2 years ago

Did you have a chance to test other compositors? Might be an issue in wlroots' Xwayland integration.

eddsalkield commented 2 years ago

I've also been experiencing this issue a few times each day for a while, but I haven't noticed a pattern of when it fails. Did anyone find a reliable way of reproducing this? Happy to look into other compositors if so.

OJFord commented 2 years ago

Sorry, I haven't yet tried another compositor, I keep meaning to, I will get to it, but any help appreciated.

It does seem to be very reproducible with an app of my own that uses Tauri - I paste, and I need to restart, can't paste again, I get one paste per restart within the app.

It also seems to affect Firefox URL bar but only occasionally in-page elements.

eddsalkield commented 2 years ago

Hmm interesting, good to know it's a bit reproducible this way, but it would be nice to get to a point where we have a minimal example script that reproduces the behaviour.

OJFord commented 2 years ago

Agreed of course, I just mentioned it in case it made someone think 'ah sounds like they're both foobars' - difference between Firefox URL bar and page is interesting in particular I think.

Also because, re:

Might be an issue in wlroots' Xwayland integration.

Firefox isn't using Xwayland.

OJFord commented 2 years ago

Just had a look at reproducing with weston or mutter @YaLTeR - both error with --watch that it requires a compositor that supports wlroots data control protocol. Additionally weston errors with --primary that it does not seem to support primary selection.

Anything else I can try?

YaLTeR commented 2 years ago

both error with --watch that it requires a compositor that supports wlroots data control protocol

Oh, right, forgot about that. Uhh, not sure then actually, do Smithay-based compositors support wlr-data-control?

OJFord commented 2 years ago

As far as I can tell (grepping the repo), no, but I haven't tested it because the only public Smithay-based compositor I can find isn't anywhere near complete enough to make that straightforward.

OJFord commented 2 years ago

Anything else I can try here? I'm not sure why it's labelled 'question', seems pretty clearly a bug somewhere (even if not in wl-clipboard) or buggy interaction between things to me?

YaLTeR commented 2 years ago

Anything else I can try here?

Not sure, sorry. I kinda hope @bugaevc will have a better idea here once he's back (which should be relatively soon hopefully).

I'm not sure why it's labelled 'question', seems pretty clearly a bug somewhere

Well, a bug somewhere, and we're not sure where, hence the question.

jsbronder commented 1 year ago

I can consistently reproduce this in sway 1.8 with qutebrowser 2.5.2 by taking a long time to finish my selection with the mouse. That is, push the mouse button down, move back and forth for a second or so, then release the mouse button. With export WAYLAND_DEBUG=1; wl-paste -pw wl-copy and every application aside from qutebrowser, I don't get any output while moving the mouse around, followed by a bunch when I release it. With qutebrowser, there's constant output while I'm moving the mouse. The only other Qt/python application I have is calibre, and it does not reproduce this behavior.

Logs attached. failed.log success.log

bugaevc commented 1 year ago

I kinda hope @bugaevc will have a better idea here once he's back

Hi all! and sorry for taking this long to look at this.

I can consistently reproduce this in sway 1.8 with qutebrowser 2.5.2 by taking a long time to finish my selection with the mouse. That is, push the mouse button down, move back and forth for a second or so, then release the mouse button

Thank you, I was able to reproduce that!

             ├─ 56846 /usr/bin/wl-paste --primary --watch /usr/bin/wl-copy
             ├─ 70710 /usr/bin/wl-copy
             ├─ 70713 /usr/bin/wl-copy
             └─ 70716 cat
Which seems like an 'extra' wl-copy and cat.

So the fact that you're seeing two instances of wl-copy is "normal", as in it itself is not a bug. The reason for this is that while wl-copy invocation seems to complete/exit instantly, in reality wl-copy forks off into the background and "serves" the copy requests from other clients; it automatically quits once it sees a different data offer being set as the selection (so, something else having been copied). You can see this for yourself if you run it with --foreground, in which case it will do the same things but without actually forking into background. (Don't try to use --foreground when running wl-copy from wl-paste --watch though; that will break/deadlock.)

And so, if you run wl-paste -pw wl-copy, what should normally happen is:

  1. There is an instance of wl-copy in the background, serving copy requests;
  2. Something is copied into the primary clipboard;
  3. wl-paste -pw spawns a new instance of wl-copy;
  4. That new instance sets its source as the new selection;
  5. The old instance of wl-copy sees this and quits;
  6. GOTO 1.

In your case, we're stuck at step 3: there is the old wl-copy and the new wl-copy (compare the PIDs to figure out which one is which), but we're not proceeding to step 4 for some reason. The cat child you're seeing is the cat that wl-copy spawns during startup to copy its stdin into a temp file (see dump_stdin_into_a_temp_file()). This cat evidently does not complete, and so the new wl-copy does not proceed further (to step 4).

Why does cat not complete? Likely, because the other end of the pipe never got closed. I have not investigated why, but I can speculate that is because we're hitting some unusual situation inside either the compositor (Sway/wlroots) or the client (qutebrowser).

Interestingly, wl-paste itself is in a weird state: namely, in a recursive/reentrant invocation of selection_callback() because of wl_display_roundtrip(). There was a related fix recently, https://github.com/bugaevc/wl-clipboard/pull/133 (cc'ing @mahkoh), but that (intentionally) did not affect the watch mode. This, first of all, explains why this bug only shows up sometimes: there's a race condition that we have to hit for the reentrancy to trigger. And it sounds plausible that us not destroying the previous offer (in the outer invocation of selection_callback()) while asking for the new offer's data (in the nested invocation of selection_callback()) messes with the expectations of qutebrowser, causing it to not send the data for the new offer properly.

The following tentative patch, which prevents dispatching events in the middle of selection_callback(), seems to solve the issue for me:

diff --git a/src/wl-paste.c b/src/wl-paste.c
index e384216..2c702cc 100644
--- a/src/wl-paste.c
+++ b/src/wl-paste.c
@@ -238,7 +238,8 @@ static void selection_callback(struct offer *offer, int primary) {
         popup_surface_destroy(popup_surface);
         popup_surface = NULL;
     }
-    wl_display_roundtrip(wl_display);
+    wl_display_flush(wl_display);

     /* Spawn a cat to perform the copy.
      * If watch mode is active, we spawn

@OJFord @jsbronder could you please test whether that works for you?

That being said, I'm not sure this patch is good:

OJFord commented 1 year ago

Thank you for the detailed reply @bugaevc, I'll certainly give that a go.

Thanks also for the reproduction @jsbronder! I appreciate it's not a lot to go on without a reasonable way of reproducing, I just wasn't able to find a contained way of doing so, so I'm very glad you did.

OJFord commented 1 year ago

@bugaevc I reproduced the bug again having built from master at time of writing, d83a629822d83b73ffa0f2a6e1c82b1caab30b8d, and then applied your patch above and rebuilt. I don't want to speak to soon, but that seems to fix it for me too, thanks!

jsbronder commented 1 year ago

The following tentative patch, which prevents dispatching events in the middle of selection_callback(), seems to solve the issue for me:

diff --git a/src/wl-paste.c b/src/wl-paste.c
index e384216..2c702cc 100644
--- a/src/wl-paste.c
+++ b/src/wl-paste.c
@@ -238,7 +238,8 @@ static void selection_callback(struct offer *offer, int primary) {
         popup_surface_destroy(popup_surface);
         popup_surface = NULL;
     }
-    wl_display_roundtrip(wl_display);
+    wl_display_flush(wl_display);

     /* Spawn a cat to perform the copy.
      * If watch mode is active, we spawn

@OJFord @jsbronder could you please test whether that works for you?

Sorry for the delayed response, I was on holiday. I ran with the above on d83a629822d83b73ffa0f2a6e1c82b1caab30b8d all day today, occasionally comparing both clipboards and they do appear to have stayed synced.

hvhaugwitz commented 1 year ago

@bugaevc your patch (applied to the latest Debian package 2.1.0-0.1) also solves this issue for me (sway 1.8.1 on Debian Unstable)

bugaevc commented 1 year ago

Thanks everyone for testing! :slightly_smiling_face: Since there were no regressions reported, I went ahead and committed the patch.