X Video support

dcommander commented 3 years ago

The primary application for this is video playback in TurboVNC sessions. If a video stream is encoded using 4:2:0 chroma subsampling, which is common, then currently that video stream will be decoded to RGB by the video player and re-encoded by the TurboVNC Server to the chroma subsampling level configured in the TurboVNC Viewer (4:4:4 if using the perceptually lossless preset, which is the default). You could configure TurboVNC to use 4:2:0 subsampling, but that would introduce another level of color conversion loss. (Minimally, it would introduce more round-off error, but if the upper-left corner of the video playback region doesn't fall on an even-numbered row and column relative to the desktop image, then additional artifacts would be introduced as well.) With X Video, however, the video player would instead decode a 4:2:0 video stream to a 4:2:0 YUV image, which the TurboVNC Server could directly re-encode as a 4:2:0 JPEG image. On fast networks, this approach would reduce the CPU usage of the server, since two color conversion steps could be eliminated and since the color counting step in the Tight encoder could be bypassed. On slow networks, this approach would allow a 4:2:0 RFB stream to be sent for the video window, thus reducing network usage (relative to TurboVNC's perceptually lossless or medium-quality encoding methods) without incurring any additional color conversion loss.
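To make the alignment and size arguments concrete, here is a minimal Python sketch. It is not TurboVNC code; the function names `is_chroma_aligned` and `planar_yuv_size` are hypothetical, and it assumes 8-bit samples, 0-based desktop coordinates, and tightly packed planes.

```python
def is_chroma_aligned(x, y, sub_w=2, sub_h=2):
    """Return True if the region origin (x, y) falls on a chroma sample boundary.

    For 4:2:0, sub_w = sub_h = 2, so the origin must sit on an even
    column and an even row (0-based) to line up with the chroma grid.
    """
    return x % sub_w == 0 and y % sub_h == 0


def planar_yuv_size(width, height, sub_w=2, sub_h=2):
    """Bytes needed for an 8-bit planar YUV image with the given subsampling.

    Assumes no row padding; chroma plane dimensions are rounded up.
    """
    chroma_w = (width + sub_w - 1) // sub_w
    chroma_h = (height + sub_h - 1) // sub_h
    return width * height + 2 * chroma_w * chroma_h


# A 1280x720 video window placed at an odd column is misaligned for 4:2:0.
print(is_chroma_aligned(101, 200))
# Raw sample counts: 4:2:0 versus 4:4:4 for the same window.
print(planar_yuv_size(1280, 720), planar_yuv_size(1280, 720, 1, 1))
```

The raw 4:2:0 image carries half as many samples as the 4:4:4 one. That only describes the input to the encoder; the actual network savings depend on the JPEG quality settings and the image content.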

The TurboVNC Server could also allow different compression settings or even a different codec to be used for X Video, which would allow the image quality of video playback to be controlled independently of the image quality of the rest of the remote desktop. Referring to #19, this is also a logical touch point for H.264, since it would avoid the pitfalls of encoding the entire desktop using an interframe codec (which is really inefficient unless most of the desktop image changes frequently, which is usually the case only for full-screen video playback or games). It would still probably be necessary to use GPU-based H.264 encoding in order to achieve decent performance on fast networks, though. (Referring to https://turbovnc.org/About/H264, libx264 is generally too slow for our purposes.) However, it would also be possible for the TurboVNC Server to implement its own interframe compression mechanism just for video playback, thus allowing it to recapture some of the advantages of H.264 while continuing to use the widely supported Tight RFB encoding type.

Since VirtualGL can optionally use X Video, everything said above about video playback would also apply to OpenGL applications with VirtualGL. (VirtualGL is currently limited to 4:2:0 with X Video, but there is no reason why it couldn't be extended to support 4:2:2 and 4:4:4, as long as the X Video implementation can receive planar 4:2:2 and 4:4:4 images.)

dcommander commented 3 years ago

NOTE: The most straightforward way to implement this feature would be to plug it into the existing RFB flow control mechanism, which would automatically drop frames as necessary to avoid interaction delays. However, there is also the potential for implementing a mechanism that automatically dials down video quality to avoid frame dropping. Basically, once we are encoding a video playback window separately from the rest of the desktop, we can do a variety of things to improve the performance of video playback that wouldn't make sense for the desktop as a whole.
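As a rough illustration of the "dial down quality instead of dropping frames" idea, here is a minimal Python sketch. Everything in it is hypothetical: the function `next_quality`, the backlog metric, the thresholds, and the step size are assumptions made for illustration, not part of the TurboVNC flow control mechanism.

```python
def next_quality(quality, backlog_frames, step=10, min_q=30, max_q=95,
                 high_water=2):
    """Pick the JPEG quality for the next video frame.

    quality        -- quality used for the previous frame (1-100)
    backlog_frames -- frames queued but not yet acknowledged by the client
                      (a stand-in for whatever congestion signal is available)

    If the backlog exceeds high_water, lower the quality rather than drop
    the frame.  If the backlog has fully drained, raise it again.  Otherwise
    hold steady.
    """
    if backlog_frames > high_water:
        return max(min_q, quality - step)
    if backlog_frames == 0:
        return min(max_q, quality + step)
    return quality


# Simulated backlog over six frames: congestion builds, then clears.
q = 80
for backlog in [0, 3, 4, 3, 1, 0]:
    q = next_quality(q, backlog)
    print(backlog, q)
```

A real controller would need a floor below which it falls back to frame dropping, since lowering quality cannot keep up with arbitrarily slow links. Here `min_q` marks where that fallback would take over.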

dcommander commented 3 years ago

Upon further investigation, it appears that planar 4:2:2 and 4:4:4 video formats are not really supported by X Video (or, at least, the most widespread implementation of it.) Thus, it would probably be necessary for any potential X Video implementation in the TurboVNC Server to convert packed to planar in order to support 4:2:2, and I don't think it could support 4:4:4 at all. That reduces the potential usefulness of the feature from the point of view of VirtualGL.
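The packed-to-planar conversion mentioned above could look roughly like the following Python sketch. The thread does not name a packed format, so this assumes YUY2 byte order (Y0 U Y1 V for each pair of pixels), 8-bit samples, and rows with no padding; `yuy2_to_planar422` is a hypothetical helper, not an existing TurboVNC or X Video function.

```python
def yuy2_to_planar422(packed, width, height):
    """Split a packed YUY2 buffer into separate Y, U, and V planes (4:2:2).

    Assumes byte order Y0 U Y1 V per two-pixel group and a row stride of
    exactly width * 2 bytes, so the whole buffer can be sliced at once.
    """
    if width % 2 != 0:
        raise ValueError("YUY2 requires an even width")
    if len(packed) != width * height * 2:
        raise ValueError("buffer size does not match width * height * 2")
    y_plane = bytes(packed[0::2])  # every other byte is a luma sample
    u_plane = bytes(packed[1::4])  # U is byte 1 of each 4-byte group
    v_plane = bytes(packed[3::4])  # V is byte 3 of each 4-byte group
    return y_plane, u_plane, v_plane


# One row of four pixels: two 4-byte groups.
row = bytes([10, 100, 20, 200, 30, 110, 40, 210])
y, u, v = yuy2_to_planar422(row, 4, 1)
print(list(y), list(u), list(v))
```

The resulting planes have the layout a planar 4:2:2 JPEG encoder would expect: full-resolution luma and chroma planes at half the horizontal resolution.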

dcommander commented 2 years ago

It occurs to me that there is another useful aspect of this feature from the point of view of video playback. If a video is displayed in an X window larger than the native size of the video stream, then normally the video is decoded and scaled up to fit the window, and the contents of the window are then re-encoded by TurboVNC. It would be much more efficient, in terms of both server CPU usage and network usage, to transmit the video at its native size and scale it up on the client. I believe that a working X Video extension would allow us to do that.
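A quick back-of-the-envelope calculation in Python shows the scale of the potential savings. The function name `native_pixel_ratio` and the example sizes are assumptions for illustration, and raw pixel count is only a rough proxy for encoding cost and network usage.

```python
def native_pixel_ratio(native_size, window_size):
    """Fraction of pixels the server would encode when sending the video at
    its native size instead of at the (larger) window size.

    Both arguments are (width, height) tuples in pixels.
    """
    native_w, native_h = native_size
    window_w, window_h = window_size
    return (native_w * native_h) / (window_w * window_h)


# A 1280x720 stream shown in a 1920x1080 window.
ratio = native_pixel_ratio((1280, 720), (1920, 1080))
print(f"{ratio:.3f}")
```

In this example the server would encode a bit under half as many pixels per frame, before any difference in compressibility between the native and upscaled images is taken into account.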

TurboVNC / turbovnc

X Video support #291