emersion / xdg-desktop-portal-wlr

xdg-desktop-portal backend for wlroots
MIT License

Laggy mouse when screen sharing on 4K resolution #66

Closed zsolt-donca closed 3 years ago

zsolt-donca commented 3 years ago

When starting screen sharing from a browser with WebRTC, the mouse cursor gains a significant lag on the display that is being shared. My other screen doesn't seem to be affected: the mouse cursor works just fine over there, even while the screen sharing is active. Stopping the screen sharing makes the lag disappear.

I am using two 4K displays. This is probably related to the 4K resolution, since if I switch the resolution of one of the displays to 1920x1080 and share that display, the mouse cursor moves perfectly smoothly.

The lag is also noticeable when typing on the keyboard, so I think it's a general input lag problem, and not just specific to the mouse.

I made a screenshot of htop with the screen sharing being active on the other screen. It shows sway, firefox, and xdg-desktop-portal-wlr being the biggest CPU users:

[Screenshot: htop output, screenshot_2020-11-09_19-30-31_735861790]

For comparison, the mouse is not laggy when on GNOME in Wayland mode (tested with a Fedora 33 live image), nor in Xorg mode. I wouldn't mind the relatively high CPU usage if there wasn't this input lag.

Another related question: why can I share only one of my displays? When the browser prompts, only a single display is available. It happens to be my "main" display, so actually it's fine for me, I'm just curious.

I also mentioned this in #65 .

emersion commented 3 years ago

Screen sharing disables hw cursors.

https://github.com/swaywm/wlroots/issues/2091 may help wrt. Sway CPU usage.

zsolt-donca commented 3 years ago

I see. I guess making the operations asynchronous will also help with the input lag issue. I wonder what stage that ticket and the associated MR are in, but I will ask there.

Just for the record, GNOME's Wayland compositor (Mutter, I think) doesn't have this issue. I guess somebody could take a look at how this problem is solved over there, maybe the same solution applies here as well.

What about the fact that there is only a single monitor available for screen share? Is that a known limitation? If not, I would happily open another ticket with that issue also. UPDATE: In the meantime I've found #12 which I believe is exactly about this inability to select the output for sharing.

I'll keep this question open for a couple of days, in case some further insight comes up regarding the input lag issue.

emersion commented 3 years ago

Maybe GNOME never disables hw cursors when screensharing, but I don't think it should make that big of a difference...

solarkraft commented 3 years ago

I have the same issue, and while I don't understand much about the technology behind it, I've made some observations that may be noteworthy:

By the way, I don't really care much about sharing my screen at full 4K resolution; it only worsens the compression when the video goes through the network. Maybe there is any way to do down scaling early to improve performance?

danshick commented 3 years ago

  • Recording using wf-recorder is much smoother

wf-recorder may default to export-dmabuf these days, or it may rate-limit the screen capture. Not sure. We request frames as fast as the compositor can give them (after we copy them for pipewire). That could be a performance bottleneck. It would be a good idea to add some debugging output regarding framerate.

Maybe there is any way to do down scaling early to improve performance?

You could always change your output geometry first. That's probably the best option for performance. We're trying to do minimal video processing in xdpw (all we do now is y_invert the frames, as they come in upside down).
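For readers unfamiliar with the vertical flip mentioned above, here is a minimal sketch of the idea. It is not xdpw's code: the function name `y_invert` and the `stride` parameter (bytes per row) are assumptions made for this illustration, and the frame is treated as a tightly packed byte buffer.

```python
def y_invert(frame: bytes, height: int, stride: int) -> bytes:
    """Return a copy of `frame` with its rows in reverse (bottom-to-top) order.

    `stride` is the number of bytes per row (hypothetical parameter name).
    """
    if len(frame) != height * stride:
        raise ValueError("frame size does not match height * stride")
    # Walk the rows from the last one to the first one and join them.
    rows = (frame[y * stride:(y + 1) * stride] for y in range(height - 1, -1, -1))
    return b"".join(rows)


# A 3-row frame with 2 bytes per row: rows "aa", "bb", "cc".
flipped = y_invert(b"aabbcc", height=3, stride=2)
print(flipped)  # b'ccbbaa'
```

Even this one operation touches every byte of the frame, which is why keeping per-frame processing minimal matters at high resolutions.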

emersion commented 3 years ago

wf-recorder may default to export-dmabuf these days

No, wf-recorder can only pump out shm frames from the compositor.

zsolt-donca commented 3 years ago

Sorry for insisting on this subject, but I still don't get it: what is causing the mouse lag? If there is no significant CPU nor GPU usage, then what is it? Is there a way for me to do debugging (maybe activate some debug logs), or some kind of profiling to figure it out?

Also, if it's related to the disabling of the hardware mouse cursor, is there any way of working around it? Also, what is the reason for it? Is it that otherwise the cursor wouldn't appear on the recording?

danshick commented 3 years ago

My only guess would be that calling screencopy with a software cursor and no rate limiting is slowing down the rendering loop, but I honestly have no idea. I don't have any 4k displays and I can't reproduce at 1920x1080. You could try adding an artificial delay between frames. I'm pretty sure wayvnc does this if you are looking for an example.

You can enable debug (or even trace) logging if you like. You need to start xdpw manually as described in the FAQ, and the help text will tell you what options you have. Trace logs do print on every single frame, but we don't include timing information, so you'd have to do something clever to determine a frame rate. You could also add some code to calculate a running frame rate. Again, something similar is done in wayvnc.
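One possible way to compute a running frame rate, sketched in Python rather than in xdpw's C, is to keep a sliding window of frame timestamps. The class name `FrameRateMeter` and its `window` parameter are hypothetical, and this is not how wayvnc or xdpw do it; it only illustrates the calculation.

```python
from collections import deque


class FrameRateMeter:
    """Estimate frames per second over a sliding window of timestamps."""

    def __init__(self, window: int = 60) -> None:
        # Keep only the most recent `window` frame timestamps (in seconds).
        self.timestamps = deque(maxlen=window)

    def tick(self, now: float) -> float:
        """Record a frame at time `now` and return the current FPS estimate."""
        self.timestamps.append(now)
        if len(self.timestamps) < 2:
            return 0.0
        elapsed = self.timestamps[-1] - self.timestamps[0]
        if elapsed <= 0:
            return 0.0
        # N timestamps span N - 1 frame intervals.
        return (len(self.timestamps) - 1) / elapsed


meter = FrameRateMeter(window=4)
for t in (0.0, 0.1, 0.2, 0.3):
    fps = meter.tick(t)
print(round(fps, 1))  # 10.0
```

In a real capture loop, `now` would come from a monotonic clock (for example `time.monotonic()`), and `tick` would be called once per captured frame.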

zsolt-donca commented 3 years ago

@danshick You are probably right on this one: I just tried wayvnc, and the mouse lag is almost non-existing (maybe it's not even there at all and I'm just imagining it). I checked out the source code also, and the delay seems to be implemented in a pretty straightforward way. I will give it a try.
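The artificial delay discussed here can be sketched as a small calculation: given a maximum frame rate, work out how long to wait before requesting the next frame. The function name `delay_before_next_frame` and its parameters are assumptions for illustration; this is not the code from wayvnc or from any xdpw patch.

```python
def delay_before_next_frame(last_frame_time: float, now: float, max_fps: float) -> float:
    """Return how many seconds to wait before requesting the next frame.

    All times are in seconds. A non-positive `max_fps` means "no limit".
    """
    if max_fps <= 0:
        return 0.0
    target_interval = 1.0 / max_fps
    elapsed = now - last_frame_time
    # If the frame already took longer than the target interval, do not wait.
    return max(0.0, target_interval - elapsed)


# At 10 fps the target interval is 100 ms; if 30 ms have passed, wait 70 ms.
print(round(delay_before_next_frame(1.00, 1.03, 10), 3))  # 0.07
```

The point of the delay is that the capture client stops asking the compositor for frames back to back, which leaves the compositor more time for its own rendering.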

danshick commented 3 years ago

... the delay seems to be implemented in a pretty straightforward way. I will give it a try.

Let me know how it goes. We'd happily accept a PR to rate limit the frames if it helps.

scutze commented 3 years ago

I had the same problem with input from a wacom pen. Drawing whilst screen sharing on 4K led to lag and jerky lines. zsolt-donca's implementation seems to solve the issue for me.

danshick commented 3 years ago

Thanks for the feedback. As soon as one of us has time to review that PR, it will be considered for merging.

nagisa commented 3 years ago

Adding some more information – my screen is not quite 4k, but has higher resolution than 2k. I found that when just one webrtc capturing session is active (via pipewire/wlr portal) then the slowdown is not very noticeable, but gets worse as more capture sessions are started.

My very naive guess, without looking at the code, would be that the frame buffers are being re-acquired anew for each stream/session/whatever they are called, thus slowing down sway. If that is indeed the case, I imagine that this could be mitigated to an extent by copying buffers inside the portal for each session?

emersion commented 3 years ago

Ah, that's a lot of pixels for CPU copies.

https://github.com/swaywm/wlroots/issues/2091 should help the compositor.
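To put "a lot of pixels" into numbers, here is a back-of-the-envelope calculation for a 4K frame. The 4 bytes per pixel and the 60 captures per second are assumptions chosen for illustration; the actual pixel format and capture rate depend on the setup.

```python
WIDTH, HEIGHT = 3840, 2160   # 4K UHD resolution
BYTES_PER_PIXEL = 4          # assumption: a 32-bit format such as XRGB8888
CAPTURES_PER_SECOND = 60     # assumption: one capture per refresh at 60 Hz

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
bytes_per_second = frame_bytes * CAPTURES_PER_SECOND

# The same calculation for a 1920x1080 output, for comparison.
full_hd_frame_bytes = 1920 * 1080 * BYTES_PER_PIXEL

print(frame_bytes)                         # 33177600 (about 31.6 MiB per frame)
print(bytes_per_second)                    # 1990656000 (about 2 GB per second, per copy)
print(frame_bytes // full_hd_frame_bytes)  # 4
```

Under these assumptions every CPU copy of the frame moves roughly 2 GB per second, four times as much as at 1920x1080, which is consistent with the lag only showing up at high resolutions.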

danshick commented 3 years ago

My very naive guess, without looking at the code, would be that the frame buffers are being re-acquired anew for each stream/session/whatever they are called, thus slowing down sway. If that is indeed the case, I imagine that this could be mitigated to an extent by copying buffers inside the portal for each session?

Actually, in xdpw, as long as you are streaming the exact same output with the exact same settings (at the moment, this just means pixel format and cursor on or off, until output selection is implemented) we only push the contents of the buffer to pipewire once. Each new session that is started shares the exact same pipewire node, so there is no additional copying on our side with each additional session.
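The sharing behaviour described above can be pictured as a lookup keyed by the capture settings. The sketch below is only a model of that idea, not xdpw's implementation: `StreamRegistry`, `SharedStream`, and the integer `node_id` are hypothetical names, and the key fields (output, pixel format, cursor on or off) are taken from the description.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class SharedStream:
    """Stand-in for one capture stream feeding one PipeWire node (hypothetical)."""
    node_id: int
    sessions: int = 0


class StreamRegistry:
    """Hand out one shared stream per distinct (output, format, cursor) combination."""

    def __init__(self) -> None:
        self._streams: Dict[Tuple[str, str, bool], SharedStream] = {}
        self._next_node_id = 1

    def acquire(self, output: str, pixel_format: str, with_cursor: bool) -> SharedStream:
        key = (output, pixel_format, with_cursor)
        stream = self._streams.get(key)
        if stream is None:
            # First session with these settings: create a new stream and node.
            stream = SharedStream(node_id=self._next_node_id)
            self._next_node_id += 1
            self._streams[key] = stream
        # Later sessions with identical settings reuse the existing stream.
        stream.sessions += 1
        return stream


registry = StreamRegistry()
first = registry.acquire("DP-1", "XRGB8888", with_cursor=True)
second = registry.acquire("DP-1", "XRGB8888", with_cursor=True)
other = registry.acquire("DP-1", "XRGB8888", with_cursor=False)
print(first is second, first.sessions, other.node_id)  # True 2 2
```

In this model a second session with identical settings adds no extra frame copy, only another consumer of the same node; a session with different settings gets its own stream.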

That said, what the downstream apps do with those buffers is likely to add latency regarding how quickly pipewire can recycle those buffers. I wouldn't expect this to cause noticeable jitter in your compositor though. That's surprising.

Thanks for the info.

zsolt-donca commented 3 years ago

To anybody affected by this issue: https://github.com/emersion/xdg-desktop-portal-wlr/pull/74 has been merged. It lets you control the rate at which frames are captured, reducing the load on the compositor and thus mitigating this issue. It is not a proper solution, as you sacrifice the ability to faithfully capture your screen.

To apply the rate control, you can manually start xdg-desktop-portal-wlr with the -f N argument, where N is the desired frames-per-second value, e.g. -f 10 for 10 frames per second. Alternatively, you can create a config file at ~/.config/xdg-desktop-portal-wlr/config with something similar to:

[screencast]
output_name=
max_fps=10

danshick commented 3 years ago

Thanks @zsolt-donca for your great work! I'll close this issue as this is the best solution we could ask for until #9 is resolved.