TigerVNC / tigervnc

High performance, multi-platform VNC client and server
https://tigervnc.org
GNU General Public License v2.0

Vulkan applications don't work in Xvnc #1674

Closed DocMAX closed 7 months ago

DocMAX commented 12 months ago

Is it possible to add Vulkan support to the Xvnc server?

CendioOssman commented 12 months ago

I'm not terribly familiar with Vulkan, but I would assume it has a software fallback just like OpenGL that should work fine in Xvnc. What errors are you seeing?

CendioOssman commented 10 months ago

No response. Closing.

DocMAX commented 9 months ago

Sorry, can you reopen? The problem is simply this in an Xvnc session:

[image: screenshot not preserved]

CendioOssman commented 9 months ago

Thanks, that's a bit more clear.

The DRI3 complaint seems to be just a warning, so I am not convinced that's what's causing things to fail. Will need to investigate more.

CendioOssman commented 9 months ago

vkcube works for me here, so it might be application-specific issues. I had to specify --gpu_number 1 for it to choose the software fallback, though.

DocMAX commented 9 months ago

[image: screenshot not preserved]

Not working here. It works with my Xrdp server. I remember I had to enable "glamor" for this, but I don't remember how I did it. Maybe the same is needed for the Xvnc server?

CendioOssman commented 9 months ago

That's odd. It doesn't look like you have a CPU fallback on that system. GPU 0 is clearly a real GPU. Perhaps some more packages need to be installed?

This is what should be expected:

$ vkcube --gpu_number 1
Selected GPU 1: llvmpipe (LLVM 16.0.6, 256 bits), type: Cpu

DrasLorus commented 8 months ago

Hello! I have kind of the same issue. I have a VirtualGL-enabled PC, so glxgears runs at around 5k FPS on my AMD RX7900XTX using the radeonsi driver.

When I run vkcube, I either get:

Is there a way to get Vulkan calls or DRI3 working? Or do I misunderstand something?

Note: the server has gone to sleep thanks to GDM during testing, and I will not be able to restart it before at least a week.

CendioOssman commented 8 months ago

Hardware acceleration is an entirely different story, so let's focus on just getting basic software rendering up and running.

It sounds like that is working for you, though. What about vulkaninfo?

It looks like the current issues are:

DrasLorus commented 8 months ago

Hi again!

Understood for hardware acceleration. I will wait.

Concerning the current issue, I can confirm that when llvmpipe is used, vkcube --gpu_number 2 works fine, as well as vkgears (using VK_DRIVER_FILES=/usr/share/vulkan/icd.d/lvp_icd.i686.json:/usr/share/vulkan/icd.d/lvp_icd.x86_64.json vkgears).

I also have observed that forcing the use of amdvlk or RADV drivers removes llvmpipe from the GPU lists. While maybe expected (since llvmpipe is not an AMD GPU), it may be worth checking if such force loading is enabled.

However, no luck for vulkaninfo. I got the following:

# VK_DRIVER_FILES=/usr/share/vulkan/icd.d/lvp_icd.i686.json:/usr/share/vulkan/icd.d/lvp_icd.x86_64.json vulkaninfo
X Error of failed request:  BadMatch (invalid parameter attributes)
  Major opcode of failed request:  1 (X_CreateWindow)
  Serial number of failed request:  7
  Current serial number in output stream:  8

Here is the output of vulkaninfo on a local X11 gnome session:

VP_VULKANINFOllvmpipe(LLVM_16_0_6,_256_bits)_0_0_1.json

I have tried to use GDB on vulkaninfo, but I don't have any knowledge on X11 so... Here is the backtrace, maybe it is useful, maybe not.

#0  __GI_exit (status=status@entry=1) at exit.c:140
#1  0x00007ffff7e4662c in _XDefaultError (event=<optimized out>, dpy=0x5555556b5cf0) at /usr/src/debug/libx11/libX11-1.8.7/src/XlibInt.c:1449
#2  _XDefaultError (dpy=0x5555556b5cf0, event=<optimized out>) at /usr/src/debug/libx11/libX11-1.8.7/src/XlibInt.c:1434
#3  0x00007ffff7e4674c in _XError (dpy=dpy@entry=0x5555556b5cf0, rep=rep@entry=0x5555556ad6c0) at /usr/src/debug/libx11/libX11-1.8.7/src/XlibInt.c:1503
#4  0x00007ffff7e46858 in handle_error (dpy=0x5555556b5cf0, err=0x5555556ad6c0, in_XReply=<optimized out>) at /usr/src/debug/libx11/libX11-1.8.7/src/xcb_io.c:211
#5  0x00007ffff7e46915 in handle_response (dpy=dpy@entry=0x5555556b5cf0, response=0x5555556ad6c0, in_XReply=in_XReply@entry=1) at /usr/src/debug/libx11/libX11-1.8.7/src/xcb_io.c:403
#6  0x00007ffff7e482fd in _XReply (dpy=dpy@entry=0x5555556b5cf0, rep=rep@entry=0x7fffffffd1c0, extra=extra@entry=0, discard=discard@entry=1) at /usr/src/debug/libx11/libX11-1.8.7/src/xcb_io.c:722
#7  0x00007ffff7e48691 in XSync (dpy=0x5555556b5cf0, discard=0) at /usr/src/debug/libx11/libX11-1.8.7/src/Sync.c:44
#8  0x0000555555567f70 in AppCreateXlibWindow (inst=...) at /usr/src/debug/vulkan-tools/Vulkan-Tools-1.3.269/vulkaninfo/./vulkaninfo.h:1010
#9  0x00005555555653ba in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/vulkan-tools/Vulkan-Tools-1.3.269/vulkaninfo/vulkaninfo.cpp:1154

I hope I provided useful information.

CendioOssman commented 8 months ago

Yeah, it's very unclear what vulkaninfo is upset about. Might be a general X11 thing and doesn't have anything to do with vulkan per se.

Have you reported the issue to the vulkaninfo developers? They are probably in a better position to understand what their tool needs.

ilylily commented 8 months ago

same issue as DrasLorus. virtualgl works, vulkan apps crash with X BadMatch and/or missing DRI3

vkcube under tigervnc with default (hardware) gpu selected:

Selected GPU 0: AMD Radeon RX 580 Series (RADV POLARIS10), type: DiscreteGpu
vulkan: No DRI3 support detected - required for presentation
Note: you can probably enable DRI3 in your Xorg config
vulkan: No DRI3 support detected - required for presentation
Note: you can probably enable DRI3 in your Xorg config
Could not find both graphics and present queues

vulkaninfo, also under tigervnc:

vulkan: No DRI3 support detected - required for presentation
Note: you can probably enable DRI3 in your Xorg config
X Error of failed request:  BadMatch (invalid parameter attributes)
  Major opcode of failed request:  1 (X_CreateWindow)
  Serial number of failed request:  7
  Current serial number in output stream:  8

vkcube worked after installing mesa-vulkan-swrast and selecting gpu 1 instead of the default 0. i'm also using radv on amd, like the others

here's a minimal example copied from https://github.com/KhronosGroup/Vulkan-Tools/blob/sdk-1.3.261.1/vulkaninfo/vulkaninfo.h#L979

i call it crashable.c

#include <stdio.h>
#include <X11/Xutil.h>

int main() {
    const int width = 640;
    const int height = 480;

    long visualMask = VisualScreenMask;
    int numberOfVisuals;

    Display *xlib_display = XOpenDisplay(NULL);
    if (xlib_display == NULL) {
        printf("XLib failed to connect to the X server.\nExiting...\n");
        return 1;
    }

    XVisualInfo vInfoTemplate = {};
    vInfoTemplate.screen = DefaultScreen(xlib_display);
    XVisualInfo *visualInfo = XGetVisualInfo(xlib_display, visualMask, &vInfoTemplate, &numberOfVisuals);
    Window xlib_window = XCreateWindow(xlib_display, RootWindow(xlib_display, vInfoTemplate.screen), 0, 0, width,
                                     height, 0, visualInfo->depth, InputOutput, visualInfo->visual, 0, NULL);

    XSync(xlib_display, 0);
    XFree(visualInfo);

    printf("%p\n", (void *)xlib_window); // silence analyzer, prevent optimizing out our window
}

build with cc -o crashable crashable.c -lX11. it prints a pointer on :0, crashes on :1 (the tigervnc display). still crashes in the same place if we XSync before the XCreateWindow, so that's definitely the call that's doing it

honestly, i can't make heads nor tails of it. looks like a normal XCreateWindow to me. but then, it's bedtime. hopefully i'm missing something obvious

CendioOssman commented 8 months ago

Thanks. A minimal example is very helpful. It also clearly shows that the BadMatch error has nothing to do with Vulkan.

I'm guessing the issue is with the visual. That code is probably too simplistic and is not guaranteed to pick a useful one. We've seen issues like that before, where applications assume a certain order of visuals.
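
The "certain order of visuals" guess can be illustrated with a small sketch. crashable.c takes whatever XGetVisualInfo returns first and creates a window with it while every window attribute is still inherited from the root window. As far as I understand the X protocol, that is only safe if the first entry happens to be the root window's own visual. A more defensive application would look for the screen's default visual instead. The code below is only a sketch of that selection step: VisualDesc and find_default_visual are hypothetical stand-ins so the logic compiles without Xlib. In a real program the ID would presumably come from XVisualIDFromVisual(DefaultVisual(dpy, screen)), and XGetVisualInfo can do the same filtering when VisualIDMask is added to the mask.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the XVisualInfo fields that matter here. */
typedef struct {
    unsigned long visualid; /* corresponds to XVisualInfo.visualid */
    int depth;              /* corresponds to XVisualInfo.depth */
} VisualDesc;

/*
 * Return the index of the entry whose ID matches the screen's default
 * visual, or -1 if the list does not contain it. Index 0 is never
 * assumed to be usable.
 */
static int find_default_visual(const VisualDesc *list, size_t count,
                               unsigned long default_id) {
    assert(list != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (list[i].visualid == default_id)
            return (int)i;
    }
    return -1;
}
```

With a made-up list of IDs {0x5a, 0x21, 0x22} and a default ID of 0x21, the function returns index 1 rather than assuming index 0 is usable.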

CendioOssman commented 8 months ago

It was indeed that bug. I thought we already fixed that ages ago, but apparently not.

With 7ad74d14160028fd709f595e9441c369cc4cd17e in place, vulkaninfo works just fine.

vkcube still doesn't pick the right GPU automatically, though. And to be honest, I wouldn't be fully sure if vulkan is even supposed to. Most systems will just have one GPU, so they might have figured it to be good enough that it picks the first "real" one it finds.

Need to dig more into how Vulkan enumerates and picks GPUs.

ilylily commented 8 months ago

nice! clean fix :)

it makes sense to use the software renderer only as fallback for hardware. i think the problem is it seems to be failing to fall back when a hardware device is present but not presentable. this is visible in vulkaninfo in tigervnc with 7ad74d14160028fd709f595e9441c369cc4cd17e applied - search GPU id, note devices under Layers vs ids under Presentable Surfaces. this may be an issue for mesa, but may also be widespread improper device selection by vulkan applications? not my area of expertise

i was able to force vkcube to use llvmpipe on my alpine linux system with the env var VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/lvp_icd.x86_64.json. the json file for the icd is provided by the mesa-vulkan-swrast package. if the radeon icd is in the list, before or after lvp, it fails to start and complains about dri3

it's worth noting that VirtualGL/virtualgl#37 indicates that nvidia's drivers enable hardware rendering in an x proxy. i'd be interested to know if this change improves compatibility for nvidia users, since everyone with the problem in this thread is using amd+mesa

DocMAX commented 8 months ago

glad this thread actually led to something. but what about that DRI3 thing now? is there a switch for Xvnc like -dri3 to enable it or is DRI3 in general not working on "virtual" displays?

[image: screenshot not preserved]

Also tried to enable with Xvnc +extension DRI3 ...

DocMAX commented 8 months ago

Just found another "version" of your server. It's here: https://github.com/kasmtech/KasmVNC. I can launch the server like this:

Xvnc -SecurityTypes=none -geometry 1280x720 -ac -listen tcp -nowebsocket -hw3d :20

But it seems it's a different protocol. The VNC client just disconnects with "unknown message type ...." messages. The -hw3d switch causes DRI3 to be activated. We need an implementation like this :-)

Edit: I just checked the source code, and it looks like a modified TigerVNC version to me. Are you aware of this? Here is the DRI3 implementation: https://github.com/kasmtech/KasmVNC/commit/d04982125a04962ca4a6d9829b0cdad5793db324

And there is more to read... https://github.com/TurboVNC/turbovnc/issues/373

CendioOssman commented 8 months ago

> it makes sense to use the software renderer only as fallback for hardware. i think the problem is it seems to be failing to fall back when a hardware device is present but not presentable. this is visible in vulkaninfo in tigervnc with 7ad74d1 applied - search GPU id, note devices under Layers vs ids under Presentable Surfaces. this may be an issue for mesa, but may also be widespread improper device selection by vulkan applications? not my area of expertise

Looking at the code for vkcube, it seems like it's up to each application to pick a sensible device. And vkcube simply picks the "fastest", without checking if it will actually work. Should probably file a bug with them for that.
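
A rough sketch of the kind of check vkcube appears to be missing: prefer the fastest device, but only among those that can actually present. DevInfo, DevKind, and pick_presentable_device are hypothetical names, the ranking of device kinds is an assumption, and the can_present flag is filled in by hand. A real application would derive it from the Vulkan API, for example by calling vkGetPhysicalDeviceSurfaceSupportKHR for each queue family, and would read the device type from VkPhysicalDeviceProperties.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device kinds, ordered so that a higher value is assumed faster. */
typedef enum { DEV_OTHER = 0, DEV_CPU, DEV_VIRTUAL, DEV_INTEGRATED, DEV_DISCRETE } DevKind;

/* Hypothetical summary of what an application would query per physical device. */
typedef struct {
    const char *name;
    DevKind kind;
    int can_present; /* nonzero if some queue family can present to the surface */
} DevInfo;

/*
 * Return the index of the fastest device that can present to the target
 * surface, or -1 if no device can. Devices that cannot present are
 * skipped no matter how fast they are.
 */
static int pick_presentable_device(const DevInfo *devs, size_t count) {
    assert(devs != NULL || count == 0);
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (!devs[i].can_present)
            continue;
        if (best < 0 || devs[i].kind > devs[best].kind)
            best = (int)i;
    }
    return best;
}
```

Given a discrete GPU that cannot present and a CPU device that can, as in the logs earlier in this thread, this would select the CPU device instead of failing with "Could not find both graphics and present queues".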

Do you have a more "real" vulkan application we can test with and see how it behaves?

> glad this thread actually led to something. but what about that DRI3 thing now? is there a switch for Xvnc like -dri3 to enable it or is DRI3 in general not working on "virtual" displays?

DRI3 is a buffer sharing system, with some hardware handling thrown in to complicate things. It's not something we've implemented in TigerVNC yet.

> Just found another "version" of your server. It's here: https://github.com/kasmtech/KasmVNC.

Yeah, we are aware of them. Unfortunately, they aren't terribly active in actually working with us.

> But it seems it's a different protocol. The VNC client just disconnects with "unknown message type ...." messages.

That doesn't surprise me. I don't think they have any intention of being compatible with VNC. Just with their fork of noVNC they include with the server.

> The -hw3d switch causes DRI3 to be activated. We need an implementation like this :-)

I actually tested their patch, and as DRC also noticed, there are still some issues to be resolved. If someone feels up to it, then feel free to submit a PR once you have something that works. :)

euuurgh commented 7 months ago

Hi there! I'm having the exact same problem. I was trying to launch Steam remotely in an LXC container on Proxmox, but TigerVNC's X session does not support DRI3, and I'm also getting an error that zink does not work either.

I would really love to see this fixed, so if anyone needs more info about my system/setup, let me know, and we can debug

euuurgh commented 7 months ago

Ok, I have spent the last hours trying to somehow make my vision work, but here is the problem:

TigerVNC works great, but is not 3D accelerated.
Sunshine would probably have incredible latency, but cannot run headless.

I tried very long and very hard to make both X11 and Wayland run headless, but nothing I tried also worked with Sunshine.

This is why I am once again posting here: 3d acceleration is a big part of modern computing, and I am actually shocked no normal VNC client supports Vulkan. If anyone has the skills to implement it, I would be eternally grateful

bphinz commented 7 months ago

Have you looked into TurboVNC?

CendioOssman commented 7 months ago

TurboVNC works the same as TigerVNC, so I'm afraid that will not improve things.

Perhaps you are thinking of VirtualGL, TurboVNC's sister project? That will allow you to accelerate OpenGL on both TigerVNC and TurboVNC. No idea about Vulkan support though. @dcommander?

CendioOssman commented 7 months ago

Anyway, the Vulkan issues seem to be application issues. Software Vulkan works fine for well-behaved applications. We've added a workaround for vulkaninfo's bug, and unfortunately vkcube needs to be fixed on its end.

3D acceleration is an entirely different beast, and we have #1626 for that. So I'm going to go ahead and close this issue as done.

dcommander commented 7 months ago

https://github.com/VirtualGL/virtualgl/issues/37 has more details. nVidia's Vulkan implementation does something VirtualGL-like if it detects that it is running in an X proxy, so it will be GPU-accelerated in TigerVNC. However, that implementation unfortunately doesn't allow you to select the GPU, the last time I checked. Other Vulkan drivers probably won't allow GPU acceleration in Xvnc at all. I spent a great deal of time trying to figure out a way to interpose Vulkan in the same way that VGL interposes GLX and EGL/X11. The main problem is that Vulkan interfaces with the X server at the driver level, not at the API level, so I can't interpose a few function calls and redirect rendering to another X display or to a DRI device, as VirtualGL does. It would be necessary to add VirtualGL-like functionality to an existing Vulkan driver, such as Mesa. I am happy to embark on that mission if someone forks over the labor costs, but those costs would be well into five figures in US dollars. The end goal would be to re-implement VirtualGL as an extension of Mesa so that VGL ships its own GLX, EGL, and Vulkan vendor libraries to provide the VGL front end, but the VGL back end would use GPU-specific vendor libraries. This is just a vision at this point, and I haven't explored the technical or licensing issues that might arise. Even if it is possible, it would undoubtedly be messy, and I question whether it is worth putting that much labor into X11 when the labor might be better spent figuring out how to do a Wayland VNC server. Theoretically, Wayland already has the ability to do GPU-accelerated remote display.

Referring to https://github.com/kasmtech/KasmVNC/issues/193, however, the main problem with Wayland from a VNC server's point of view is that there is no single Wayland compositor. You essentially have a different compositor depending on which window manager family you decide to use. Until there is more convergence around the set of Wayland extensions that a remote desktop server would need, any attempt at a Wayland remote desktop server would be tied to a specific family of compositors, such as Weston or wlroots or GNOME. Even if that weren’t the case, moving to Wayland raises questions regarding whether it is time to abandon RFB, a protocol designed around the limitations of 1980s machines just as X11 was designed around the limitations of those machines, and use a more modern protocol that has seamless window capabilities (which would eliminate the need to use a server-side window manager at all.) The remote desktop paradigm has always been clunky. What users really want is network transparency: to run applications remotely and have them behave as if they were local. Wayland has the technical infrastructure necessary to do that, with GPU acceleration, but it would take a great deal of work to make it happen. (I’m not even sure if the necessary remote display protocol even exists.) It may not even be feasible to get funding for all of that as an open source project. The aforementioned idea regarding GPU-accelerated Vulkan in Xvnc is a smaller project, but it may not be feasible to get funding for that either.

End of the day, this all ties back into the problem that, to most Linux infrastructure developers these days, remote display is either an afterthought or isn’t even considered at all. People seem to be acting as if Linux still has a chance to capture more than 3% of the desktop market, when the truth is that it’s a server O/S and needs to be treated like one. That means having remote desktop capabilities at least as good as Microsoft’s.

Probably more information than you wanted, but hopefully it goes a long way toward explaining why this isn’t a simple problem, even for someone who solved the same problem with OpenGL, and why a potential solution to this problem leads to a cascade of questions regarding how long the open source community can reasonably prop up 1980s display technologies.

dcommander commented 7 months ago

NOTE: I have a machine with both an AMD Radeon Pro WX2100 and an nVidia Quadro P620. If I select the Quadro in vkcube, everything works fine in Xvnc. If I select the Radeon Pro, I get the same complaint about the lack of a DRI3 extension. There doesn't seem to be a way to force it to use software Vulkan. (I tried the aforementioned VK_ICD_FILENAMES trick, but for some reason, that doesn't work on my system. It just complains: "Cannot find a compatible Vulkan installable client driver (ICD).")

Referring to the links in https://github.com/TigerVNC/tigervnc/issues/1674#issuecomment-1886163986, DRI3 in Xvnc is not an easy problem to solve. Kasm's implementation is easy enough to port into TigerVNC or TurboVNC, and in fact, I have done so in an experimental branch of TurboVNC that I haven't pushed to GitHub. The problem, however, is that it creates Pixmaps in system memory and synchronizes them with their associated GPU buffers on a schedule (60 times/second), rather than as needed, so it has a lot of overhead and doesn't perform very well when compared to VirtualGL. I am also skeptical as to whether that approach is fully conformant. (It seems like a mixed 3D/X11 rendering workload might break it.) I spent some time trying to figure out how to reduce the overhead and/or synchronize the GPU buffers as needed but wasn't able to. I was able to use the Xvnc X11 hooks to determine when to synchronize the pixels from a GBM buffer object into the corresponding DRI3-managed Pixmap, but those hooks were insufficient to determine when to synchronize the pixels from the Pixmap back into the corresponding BO. The BO is apparently read outside of X11, in the 3D driver, so it would probably be necessary to hook into the various 3D rendering APIs in order to perform that synchronization on an as-needed basis. At that point, the solution would look a lot like VirtualGL.

That effort led me to question why we couldn't just create the Xvnc framebuffer in GPU memory (tl;dr: you can't without an Xorg driver and the other infrastructure associated with a full-blown non-virtual X server), and answering that question led me down the same rabbit hole of questioning whether a Wayland VNC server would be a simpler solution to all of the above. The DRI3 feature would also be useless for nVidia GPU users, so if I did implement it, I would both have to apologize for it as well as disable it by default. It seems like the only purpose for it (that isn't already covered by VirtualGL) would be to get GPU-accelerated Vulkan in Xvnc with non-nVidia GPUs.

dcommander commented 6 months ago

I went ahead and cleaned up my experimental port of Kasm's DRI3 feature and pushed it into the dev branch of TurboVNC. It is in the 3.2 Evolving pre-release build if anyone wants to play with it. (You enable it by passing -drinode /dev/dri/renderDXXX to /opt/TurboVNC/bin/vncserver.) It definitely does allow GPU acceleration with Vulkan if you are using Mesa-based drivers, including AMDGPU. However, VirtualGL is still generally faster and has a better feature set for OpenGL applications, particularly professional applications.

DocMAX commented 3 months ago

Getting a compile error related to the new DRI3 code: [image: screenshot not preserved]

Strange, searching "repo:TurboVNC/turbovnc xvnc_dri3_sync_pixmaps_to_bos" in GitHub returns no results!?

CendioOssman commented 3 months ago

Issues compiling TurboVNC are probably best discussed in the TurboVNC forums.