Wayland support? - Githubissues

kasmtech / KasmVNC

Modern VNC Server and client, web based and secure

GNU General Public License v2.0


Wayland support? #193

Open secretmango opened 1 year ago

secretmango commented 1 year ago

Wayland is the default on GNOME and KDE.

Fedora KDE will not ship X11 anymore; X11 has been in maintenance mode for many years.

Nowadays, with Portals, screen sharing, mouse and keyboard input, and many more VNC-related things can be done securely.

Does Kasm-VNC support Wayland?

A bug from a year ago, when Wayland was "not quite ready", was closed without any notice: #106

I have personally used Wayland for over a year.

mmcclaskey commented 1 year ago

We have an experimental build that can run Wayland applications: individual applications, not a full DE. With Wayland, the window manager (WM) and server are one and the same. The idea of KasmVNC working on Wayland with every single DE and its own implementation of Wayland is simply not possible. Unfortunately, there is currently no way for an external program to merely hook into Wayland and do what VNC needs to do. KasmVNC essentially needs to be compiled into Wayland as a feature of Wayland. Getting that done on a single DE would take a lot of effort and collaboration; getting that done on every DE so that KasmVNC works with anything like it does with X11 would take an act of god. Each DE's Wayland stack is rapidly evolving, making for several fast-moving targets.

I'll keep the thread open should we have more feedback later.

Our friend over at TurboVNC has extensively looked into Wayland and you can see his thread here. KasmVNC and TurboVNC would be no different with respect to a Wayland implementation. https://github.com/TurboVNC/turbovnc/issues/18

Niek commented 1 year ago

@mmcclaskey what about wayvnc (https://github.com/any1/wayvnc)? They managed to build a vnc server for wlroots-based compositors, which are basically all serious compositors out there.

clbr commented 1 year ago

The majority of users will be using Gnome, KDE, Weston, or what Xfce will do in the future. The wlroots compositors are comparable to fluxbox/openbox on X, aka less wide usage. Supporting them as a plugin to them, like wayvnc does, is possible but may not make sense.

clbr commented 1 year ago

To add: you can already chain wayvnc to novnc and get a similar experience that way. It's of course regular VNC and not optimized kasm.

qdrop17 commented 11 months ago

The migration to Wayland is gaining momentum. Perhaps as an input (I'm not yet very deep into the technical details): Sunshine (https://github.com/LizardByte/Sunshine) uses a KMS screen capture. This is GPU-accelerated, agnostic of the DE, and used by many other tools (like OBS, for example).

Could this possibly be a workaround for the DE mode?

Ultimately, it also depends somewhat on the strategic direction of KasmVNC: Remote desktop or more remote apps / web canvas for apps?

clbr commented 11 months ago

KMS capture would capture the real screen (so not useful for running multiple instances on one machine) and require root permissions. It also doesn't cover the input side, which may still need target-specific implementation.

We already have a similar program for capturing the real X screen, so one could conceivably be made for a wayland real screen as well. A pass-through program if you will, which would require no modifications to kasm.

dcommander commented 11 months ago

@mmcclaskey WayVNC seems to be hooking into the compositor in a manner similar to how a hypothetical Wayland KasmVNC or TigerVNC or TurboVNC Server would need to do so. But I guess it's doing that in an implementation-specific way? I don't understand enough about the nuts and bolts to say for sure, but I would love to understand more about why WayVNC only works with wlroots-based compositors and why there isn't any kind of generic Wayland compositor interface for doing what it does.

If such a generic Wayland interface is truly hopeless, then I wonder aloud whether it might make sense for the various parties who are heavily invested in open source Xvnc implementations (including Cendio, Kasm, and myself) to collaborate on a high-performance VNC server library. This could involve extending LibVNCServer (which has a similar code base to the TurboVNC Server, owing to their shared TightVNC heritage) with the RFB congestion control algorithms and other extensions necessary to achieve feature parity with TurboVNC, KasmVNC, and TigerVNC on the server end, or it could involve implementing an entirely new VNC server library based on the TigerVNC classes. The idea would be to encourage GNOME and other Wayland implementors to implement VNC servers using this new library, so their VNC server implementations would perform optimally. In other words, if we can't bring Wayland to our VNC servers, then maybe we can bring our VNC servers to Wayland.

But I also wonder aloud whether it even makes sense to keep doubling down on the antiquated RFB protocol, which was designed around the limitations of 1980s graphics systems. If we were to design a remote display protocol from first principles to accommodate modern use cases, it could safely assume that all clients have double buffering capabilities. (RFB could not assume that, which is why it was designed as a client-pull protocol, and making it perform well on high-latency connections thus involves a lot more complexity than it should.) Such a protocol would also need to have seamless windowing capabilities, i.e. the ability to associate an image stream with a particular window ID and send remote window open/close commands. When you start talking about that, then it starts making sense to design a new Wayland compositor around the concept of remote display rather than trying to bolt existing remote display technologies onto existing Wayland compositors. The idea would be kind of like remote X11 in the sense that window management would occur on the client, but the contents of each window would be sent as a video stream rather than as a fine-grained series of drawing commands. If window management is occurring on the client, then we no longer have to care about compatibility with GNOME or KDE or any other server-side window manager. It seems like such a thing would eliminate the need for Xvnc and VirtualGL in one fell swoop, but again, I'm no expert on Wayland and don't really know if what I'm proposing is even possible. It would certainly be a herculean effort with no overlap whatsoever with any of our existing solutions.
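For readers unfamiliar with the client-pull design mentioned above: in RFB as specified in RFC 6143, the server sends a framebuffer update only after the client has asked for one with a FramebufferUpdateRequest. The following is a minimal Python sketch of that 10-byte request. The helper name is made up for illustration and is not taken from any of the VNC projects discussed in this thread.

```python
import struct

# Client-to-server message type for FramebufferUpdateRequest in RFC 6143.
MSG_FRAMEBUFFER_UPDATE_REQUEST = 3


def framebuffer_update_request(x, y, width, height, incremental=True):
    """Pack the request a client sends each time it wants new pixels.

    Hypothetical helper name, for illustration only.
    """
    # Layout: message-type (u8), incremental (u8), then x, y, width, height
    # as u16 each, all in network (big-endian) byte order.
    return struct.pack(
        ">BBHHHH",
        MSG_FRAMEBUFFER_UPDATE_REQUEST,
        1 if incremental else 0,
        x,
        y,
        width,
        height,
    )


if __name__ == "__main__":
    # Ask for an incremental update of a full 1920x1080 framebuffer.
    print(framebuffer_update_request(0, 0, 1920, 1080).hex())
```

A client that naively waits for each update before sending the next request pays a full network round trip per update, which is the kind of latency sensitivity the comment above attributes to the client-pull design.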

any1 commented 11 months ago

WayVNC uses wlroots-specific vendor Wayland protocol extensions for the following: screen capturing, clipboard management, display resizing, and virtual pointer. Standard extensions are the virtual keyboard and transient seat protocols. Most of this functionality also exists in XDG desktop portal. See: https://flatpak.github.io/xdg-desktop-portal/docs/

The reason why I went with wlroots and vendor-specific Wayland extensions when I started wayvnc was that xdg-desktop-portal was incomplete at the time and I was trying to make things as efficient as possible for an embedded platform, so a close fit with the compositor was needed. I still don't think that the same level of efficiency can be achieved today by using xdg-desktop-portal instead of wlr-screencopy-v1. I have also been working on making a better, more efficient and feature-complete capturing protocol and getting it adopted as a standard protocol: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/124. Work has stalled a bit on it lately.

Now, xdg-desktop-portal works almost everywhere to a varying degree. It is the interface that I would target if I wanted to rush out a VNC server ASAP and have it be mostly compatible with most Wayland compositors. However, I would prefer to have standard wayland protocols for everything that's needed in the end. I just think it's weird to bolt dbus and pipewire on top, which is what xdg-desktop-portal does.

Some compositor developers don't believe in the extension approach. They just want to build everything into the compositor. Weston is one such compositor, where everything is built into the compositor. It uses Neat VNC, which is a new liberally licensed VNC library which I built for wayvnc. You might want to give it a look before you start building yet another one. LibVNCServer's biggest problem is that it's GPL licensed, and for many compositors, such as Weston, that's a deal-breaker as anything that's linked to it must itself comply with the GPL. Another problem is that the public API exposes internal implementation details to the user and makes it harder to make backwards compatible changes; it's not a very clean interface, but that's just my opinion. The license is the actual problem.

As for whether it makes sense to carry on the legacy of VNC into the future: I think that, for someone whose main line of work (that's not me) is to maintain a remote desktop solution, it doesn't actually make sense. There are a few things to support this claim:

Wayland is fundamentally different to X11. Everything is about submitting buffers. The compositor composites and submits a complete buffer to the kernel, which has been composited from buffers that were submitted to it by clients. The buffers can be SHM buffers, but most of the time they're DMA-BUFs backed by GPU memory.
The most efficient way to encode the contents of a DMA-BUF is to pass it to a hardware encoder (e.g. via vaapi or v4l2m2m). All modern computers have those. glReadPixels is slow.
One of the biggest problems that I've run into when optimising wayvnc is that there are SO many clients out there and you can't tell people to just use the fast one that you like. The whole point of VNC is that it's standard, so people can use whatever. But that also means that they might just use something dog-slow and VNC as a whole gets the blame. With a new protocol, at least the reference implementation can be made good from the start.
Wayland offers some opportunities for improvements in how things are built. Presentation time is built into wayland, and this can also be built into a screen sharing protocol for smoother rendering and audio sync.

Compositing on the client side is an interesting idea. I'm not sure how much it pays off, but it would allow you to reduce the latency by one frame and it saves memory bandwidth on the server side. Not sure if it's really worth the effort, though. There's already something called waypipe that allows you to do that for a single application.

dcommander commented 11 months ago

@any1 Thanks for the excellent summary. I definitely learned something. I totally agree with your assessment that VNC is a legacy technology, the design of which was necessitated by the limitations of X11. The only reason it would make sense to use VNC in the near term is to fast track a solution by reusing the existing protocol layer as much as possible. I also totally agree with your assessment that there are too many VNC solutions out there, and most of them still use legacy RFB encodings (Hextile, zlib, ZRLE, RRE/CoRRE, etc.) that were made obsolete by the TurboVNC encoder 15 years ago. (I went out of my way to demonstrate that a stripped-down implementation of Tight encoding, combined with a high-speed JPEG codec, can provide both better compression and performance in all cases than those legacy encodings, which is why the TurboVNC Viewer doesn't even expose the legacy encodings in its GUI by default.) The VNC servers that use LibVNCServer can at least take advantage of the TurboVNC encoder for good performance on high-speed networks, assuming they use the TurboVNC or TigerVNC Viewer (or a viewer built with LibVNCClient.) However, TurboVNC and TigerVNC are the only VNC servers and viewers that implement the congestion control extensions, which are needed in order to achieve decent performance on high-latency networks. (KasmVNC's server implements those extensions, since it is based on TigerVNC. I'm not sure about their viewer, which is based on noVNC.)

From a technical point of view, it would be good to extend neatvnc so that it supports the performance and security enhancements from TigerVNC and TurboVNC, to make it possible to achieve a similar experience to the TurboVNC and TigerVNC Servers if someone chooses to use one of those viewers. However, the license is a bit of a sticking point. I doubt that all of the enhancements could be derived from first principles. (The congestion control code, in particular, is complicated and doesn't obviously follow from the public definitions of the RFB flow control extensions.) Relicensing the enhancements under the 1-clause ISC License would make it possible for our proprietary competition (RealVNC) to use the code, which would remove some of the competitive advantage that Cendio and I derive from working under the GPL. For me personally, the GPL and LGPL restrictions are a big reason why I am able to make money as an independent open source developer. If the traditional VNC code bases were more liberally licensed, then nothing would prevent a company from forking my code, taking it proprietary, and hiring their own developers (at a lower price point) to extend it rather than paying me as a contractor. It makes sense for lower-level libraries (such as libjpeg-turbo) to be licensed in a "business-friendly" manner. (In that case, most of my funded development comes from companies who want to use the library in proprietary code bases.) However, licensing all levels of the stack under a business-friendly license makes it harder for independent OSS developers such as myself to exist. That alone may be an impediment to extending the existing neatvnc code base rather than creating a GPL-licensed VNC library or extending LibVNCServer (but I also 100% agree with your assessment of LibVNCServer's clunky design.) Part of the impetus for creating or extending a VNC library is to "eat my own dog food", i.e. to use the library in the TurboVNC Server as well. 
But I understand that the proposed VNC library wouldn't be very useful for Wayland compositor developers if it retained the GPL license. It would mainly be useful from the point of view of developing a standalone Wayland VNC server that shares code with the existing TurboVNC Server.

What interests me most about Wayland is the possibility of seamless windows. Such isn't feasible with X11 unless you do something like what VirtualGL does-- send rendered images on a sideband but use remote X for everything else. Given that Wayland is all image-based, it seems possible to create an independent image stream for each window. I would be interested in your thoughts on how difficult it would be to implement such a solution, assuming we had access to RFB extensions that could send and receive basic window management commands (open, close, resize) and attach a window ID to each framebuffer update rectangle. Would it require a completely different compositor, or could an existing compositor handle it? Are there Wayland extensions that would allow for grabbing the pixels from individual windows instead of the whole display?
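To make the "attach a window ID to each framebuffer update rectangle" idea concrete, here is a rough Python sketch. The first function packs the rectangle header as it exists in RFC 6143. The second is purely hypothetical: no such RFB extension exists, and the window ID field, its width, and its position are assumptions made only for illustration.

```python
import struct

RAW_ENCODING = 0  # encoding-type 0 is Raw in RFC 6143


def standard_rect_header(x, y, width, height, encoding=RAW_ENCODING):
    """Rectangle header as defined today: four u16 fields plus an s32 encoding-type."""
    return struct.pack(">HHHHi", x, y, width, height, encoding)


def windowed_rect_header(window_id, x, y, width, height, encoding=RAW_ENCODING):
    """HYPOTHETICAL extension: the same header prefixed with a u32 window ID.

    No such RFB extension exists; the field and its position are assumptions.
    """
    return struct.pack(">I", window_id) + standard_rect_header(
        x, y, width, height, encoding
    )


if __name__ == "__main__":
    # A 640x480 rectangle at the origin, tagged as belonging to window 7.
    print(windowed_rect_header(7, 0, 0, 640, 480).hex())
```

With a tag like this, a client could route each rectangle to its own local window instead of to a single shared framebuffer. The open/close/resize commands mentioned above would still need their own messages, which this sketch does not attempt.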

Sorry for the tome. I'm mostly thinking aloud, here.

dcommander commented 11 months ago

(My thinking is that, if we have seamless windows, we don't need to worry about supporting arbitrary compositors, and if we don't have to worry about supporting arbitrary compositors, then we don't have to worry about "business-friendly" licensing.)

any1 commented 11 months ago

@dcommander

To be clear, the fact that VNC is a legacy protocol is not necessarily a bad thing. It works well enough for many applications and it's ubiquitous. Plus, there's nothing that keeps you from extending it in any way you like. Well, if you wanted to add different transport mechanisms like UDP and/or QUIC, then things might start getting a little bit silly. Nota bene, what drew me to VNC in the first place is the fact that all you have to do to make a minimal server is to implement a rather short and simple RFC. Of course, I wasn't satisfied with leaving it at that, so I've kept adding things to mine.
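As a rough illustration of how short that RFC is, the Python sketch below builds the first and last messages a minimal server sends during the RFB 3.8 handshake (RFC 6143): the ProtocolVersion string and the ServerInit message. The security negotiation in between and the actual FramebufferUpdate messages are left out. The function names, the chosen pixel format, and the UTF-8 name encoding are assumptions made for this example and are not taken from wayvnc or Neat VNC.

```python
import struct

# First bytes a server sends: the 12-byte ProtocolVersion string for RFB 3.8.
PROTOCOL_VERSION = b"RFB 003.008\n"


def pixel_format_32bpp():
    """16-byte PIXEL_FORMAT: 32 bpp, depth 24, little-endian true colour."""
    return struct.pack(
        ">BBBBHHHBBB3x",
        32,   # bits-per-pixel
        24,   # depth
        0,    # big-endian-flag
        1,    # true-colour-flag
        255,  # red-max
        255,  # green-max
        255,  # blue-max
        16,   # red-shift
        8,    # green-shift
        0,    # blue-shift (followed by 3 padding bytes)
    )


def server_init(width, height, name):
    """ServerInit: framebuffer size, pixel format, then a length-prefixed name."""
    encoded = name.encode("utf-8")  # encoding choice is an assumption
    return (
        struct.pack(">HH", width, height)
        + pixel_format_32bpp()
        + struct.pack(">I", len(encoded))
        + encoded
    )


if __name__ == "__main__":
    print(PROTOCOL_VERSION)
    print(server_init(800, 600, "demo").hex())
```

Everything here is fixed-size, big-endian packing, which is a large part of why a bare-bones server is quick to write. The performance work discussed elsewhere in this thread is where the real effort goes.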

"Would it require a completely different compositor, or could an existing compositor handle it? Are there Wayland extensions that would allow for grabbing the pixels from individual windows instead of the whole display?"

There is no wayland extension that I know of which allows you to capture windows (toplevels). I think there's an interface for it in xdg-desktop-portal, but it's probably not very widely implemented. I was planning to make it a part of ext-screencopy-v1 eventually.

dcommander commented 11 months ago

@any1 Assuming such a Wayland extension existed, would a hypothetical seamless window VNC server be possible to implement at the level of wayvnc/neatvnc, or would it still require compositor modifications?

any1 commented 11 months ago

@dcommander I would say that it depends on how much additional information you'd need from the compositor. You'll probably need this https://wayland.app/protocols/ext-foreign-toplevel-list-v1 and this too https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/196/diffs. Then there's focus and input mapping, or alternatively figuring out the cursor position? You can also propose protocols and additions to existing protocols. I guess, in the end, it depends on your powers of persuasion.

dcommander commented 11 months ago

Thanks for reminding me about window focus. Now that I think about this in more detail, stacking order might be a major pain point. I was just envisioning that the compositor would transport the pixels for every window, regardless of whether the window was visible, and the client-side window manager would deal with stacking order and visibility. However, there are undoubtedly some operations that would behave in unexpected ways unless the stacking order and visibility on the client was synchronized with that of the server. So really it might be necessary to have a split compositor, an architecture similar to NX (which runs an X server both on the host and the client), in order to do seamless windows. That starts to sound really nasty, and the need to extend RFB with remote window management messages brings the level of nastiness far beyond what I'm willing to consider.

From a business point of view, the prospect of re-licensing all of the VNC performance and security extensions we currently use is likely a non-starter. I own only a portion of that code, and even if I got permission from the other copyright holders, I'm not sure if liberally licensing the code would be a good strategic move for any of us. From a purely technical point of view, it would also be easier to just extend LibVNCServer, since its code base is already similar to that of the TurboVNC Server, and create a GPL-licensed Wayland VNC server that uses xdg-desktop-portal or wlr-screencopy-v1 (depending on what's available in the compositor.) Once the idea of seamless windows goes away, then we're back to the original problem of needing a solution that will work with GNOME and KDE and other popular window managers. wlroots support would be nice to have as well, but the vast majority of my paying customers (i.e. companies and organizations that provide general funding or have sponsored TurboVNC features) use GNOME. GNOME's built-in remote desktop feature uses LibVNCServer, so maybe we can even get a GNOME-based Wayland VNC server for free by extending LibVNCServer (but correct me if I've misunderstood something there.) Weston support is not likely to be important to our user base at all. Anyhow, regardless of whether it ends up being useful for this specific project, improving LibVNCServer would have other benefits for the open source community, since it is used in multiple VNC server implementations.

All of this is purely a hypothetical conversation at the moment, because I don't have funding to work on any of the above. I'm mostly just sounding out the issues.

any1 commented 11 months ago

If you want to be "first to market" with a general VNC solution for Wayland, you might want to consider this: https://help.realvnc.com/hc/en-us/articles/14110635000221

"Without providing an exact committed delivery date, we are happy to say that we’ll be back up and running on RPi 5 during the first half of 2024."

They'll probably be aiming to create a generic solution, not just targeting wlroots like wayvnc does. The only way in which they can achieve that is to use xdg-desktop-portal. I have a feeling that you'd be able to pull it off faster. I was thinking about creating an "xdovnc" solution myself based on neatvnc, but I don't have the time or motivation.

dcommander commented 11 months ago

As an independent open source developer, the only ways I make money are through patronage and funded development. The general fund that covers both VirtualGL and TurboVNC pays for about 300 hours/year of my labor to maintain and improve both projects, but it could easily take that much labor just to implement a bare-bones Wayland TurboVNC Server. It unfortunately isn't something I could tackle unless there is specific funding for the project. I haven't evaluated RealVNC recently, but as of a few years ago, its performance was about half of ours on a LAN and 1/5 of ours on a high-latency network (due to the lack of the congestion control algorithms, which are only available under the GPL.) Regardless, they have deep pockets that I don't have. TurboVNC's niche is and always has been high-performance 3D applications, reflecting the fact that TurboVNC was originally just a bolt-on solution for VirtualGL. RealVNC is focusing on other niches, and their niches have more of a burning need for Wayland than visualization/CAD/CAE software does at the moment.

One of TurboVNC's strengths is that it's kind of a Swiss Army knife for high-speed remote display, so you can use it in about 100 different ways. You can use the built-in session manager (which doesn't exist in any other OSS remote display solution, as far as I know) to start, stop, and connect to TurboVNC sessions via SSH. You can manage TurboVNC sessions via an in-house web portal that takes advantage of the TurboVNC Server's one-time password authentication feature. For added security or ease of authentication, you can use a Unix Domain Socket with SSH tunneling to avoid RFB/TCP entirely. You can use the built-in Websocket support in the TurboVNC Server, along with noVNC, to automatically serve up browser-based VNC viewers for a particular TurboVNC session. You can use TurboVNC with the UltraVNC Repeater. You can automatically use VirtualGL with every TurboVNC session, thus adding GPU acceleration for OpenGL applications. The list goes on. If an organization needs a particular feature (such as Wayland support), then they can pay me to implement that feature and be done with it, rather than paying hundreds or thousands of dollars per year for licenses of a proprietary product. Their investment is secure, because even if I get hit by a bus, they could hire someone to continue maintaining TurboVNC for them. I expect that, eventually, Wayland support will become enough of a pressing issue for the organizations that use TurboVNC that one of them will pay me to implement it. However, since those organizations tend to be users of software that rides on the tail end of the technology wave (as in, they tend to use the oldest version of RHEL that is still supported), it may be a while before the issue is pressing enough.