m1k1o / neko

A self hosted virtual browser that runs in docker and uses WebRTC.
https://neko.m1k1o.net/
Apache License 2.0
5.96k stars 449 forks

WebGL not working for Nvidia Google Chrome 112.x #279

Open itguy327 opened 1 year ago

itguy327 commented 1 year ago

Can't use WebGL in Nvidia container using Google Chrome

m1k1o commented 1 year ago

The previous tag, ghcr.io/m1k1o/neko/nvidia-google-chrome:2.8.1, works. That image shipped Google Chrome 111.0.5563.146, while the latest tag already uses Chrome 112.0.5615.49. They must have changed something...
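To make the version boundary concrete, here is a minimal Python sketch that parses a Chrome version string and flags builds at or above major version 112. The threshold is an assumption drawn only from the two versions named above (111.0.5563.146 working, 112.0.5615.49 not), and the function names are hypothetical, not part of neko.

```python
def chrome_major(version: str) -> int:
    """Return the major component of a dotted Chrome version string."""
    return int(version.strip().split(".")[0])


def likely_affected(version: str, first_broken_major: int = 112) -> bool:
    """Assumption: every build from first_broken_major onward is affected."""
    return chrome_major(version) >= first_broken_major


if __name__ == "__main__":
    # The two versions mentioned in this thread.
    for v in ("111.0.5563.146", "112.0.5615.49"):
        print(v, "likely affected" if likely_affected(v) else "expected to work")
```

A check like this could guard an image build, but it only encodes what the thread reports; it does not test WebGL itself.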

itguy327 commented 1 year ago

Thank you

m1k1o commented 1 year ago

I pinned the current Google Chrome version; a new tag, v2.8.3, will be available soon so that :latest is not broken. I'll keep this issue open to track progress on this topic. Thanks for reporting.

m1k1o commented 1 year ago

From the issue it's clear that this won't be worked on:

I'm done with this mess. If someone wants to pay for my time to track down all of this Chrome BS and file appropriate bug reports, then fine. Otherwise, it's not my job.

Therefore marking as wontfix. Meanwhile, Firefox works just fine with VGL.

innerop commented 1 year ago

I know Chrome has some issues in its GPU pipelines. A few times I reported breaking changes that stalled the video (which was copied to a canvas) after they changed the code. The PM on the Chromium issue was not the best person to interact with. They will fix it when one of their apps gets affected, or when some user gets screwed and stays on top of them until they fix it. The developers usually try their best, but the browser's code has become super complicated.

Btw, I'm using Edge now, and the decorations removal works, but the file download is a bit tricky since it selects its own Downloads folder and the user has to pick the neko location.

Is Edge affected by this, too?

Could you please link to the issue?

m1k1o commented 1 year ago

It's linked earlier in this issue; here is the link again: https://github.com/VirtualGL/virtualgl/issues/229

If the change comes from Chromium, then as soon as Edge and Brave update to this version, they won't work either. But maybe it is only Google Chrome; hard for me to say.

innerop commented 1 year ago

Ok, there is also this: https://github.com/Xpra-org/xpra

innerop commented 1 year ago

I realized you might be using virtualgl to run the apps on the gpu inside docker, not for what I was thinking.

Have you considered using NVIDIA Docker?

m1k1o commented 1 year ago

It uses NVIDIA Docker. This topic is fairly complex. You can read more about it here: https://github.com/selkies-project/docker-nvidia-egl-desktop. That project was used as the base for the EGL implementation in Neko.

innerop commented 1 year ago

The topic may be complex, but the issue we have here is not: we have a faulty dependency whose author does not want to fix it, and I take that to mean we lose hardware acceleration via GPU on all Chromium browsers, which would be a major regression. In theory, can the faulty dependency be replaced? If so, what are the options?

m1k1o commented 1 year ago

I completely understand the author: they invested a lot of their own time and resources to fix this issue once, and in less than a month Google basically completely broke it again. But I don't know of an alternative.

mbattista commented 1 year ago

The topic may be complex, but the issue we have here is not: we have a faulty dependency whose author does not want to fix it, and I take that to mean we lose hardware acceleration via GPU on all Chromium browsers, which would be a major regression. In theory, can the faulty dependency be replaced? If so, what are the options?

Reading this makes me really sad. Instead of thinking about how the issue can be resolved or how the maintainer can be helped, the first question is about dropping the dependency behind a feature that was only recently achieved and was a step forward.

So what are the options:

innerop commented 1 year ago

You're absolutely right. I'm sorry my tone came out wrong.

I'll just add that this regression in Chrome coincided with Chrome making WebGPU supported without flags. I wonder whether it happened in the same set of commits and whether the reason has something to do with WebGPU support. I don't know much about the GL internals. I'm just a WebGL user, and GPU support is important not only for accelerated rendering but also for running WebGL/WebGPU code.

innerop commented 1 year ago

Any way to run neko right now outside of Docker with WebGL/WebGPU support but without VirtualGL?

chrisprobst commented 8 months ago

@m1k1o Hey, first of all, thanks for this interesting project. Really great to have something like that. I am no Chrome expert, but I know low-level rendering in depth. I have trouble understanding why VGL is required in the Docker context. NVIDIA provides EGL images that work fine. Is VGL used for something else? My goal is to run Chrome with WebGPU support (so Vulkan is critical for me, and it is also well supported by NVIDIA). So I'm just curious and keen to learn about the issues.

m1k1o commented 8 months ago

@chrisprobst I am no Chrome or GPU expert either. I used this repo as an inspiration: https://github.com/selkies-project/docker-nvidia-egl-desktop

If you would be able to get it working (maybe somehow without VGL), that would be really great!

ehfd commented 8 months ago

@chrisprobst

Hi. I maintain the above container. The reason VirtualGL is used is that OpenGL has two main types of interfaces for painting to the screen: EGL and GLX. VirtualGL was used in the past to offload 3D OpenGL GLX acceleration to an Xorg server with loaded GPU drivers from another X11 server without 3D acceleration, typically a virtual X11 server without GPU acceleration such as Xvnc or Xvfb (GLX to GLX).

While you are right that EGL is passed into the container through EGLStreams (/dev/nvidia*) or GBM (/dev/dri/card*), that does not mean that GLX is supported. Applications choose which OpenGL API to paint to the screen, and only modern applications use EGL, while the rest use GLX. GLX is only used in X11 while EGL can be also used in Wayland as well as X11 with EGL/X11.
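As a rough illustration of the device paths mentioned above, the following Python sketch lists which `/dev/nvidia*` and `/dev/dri/card*` nodes are visible from inside a container. The function name is hypothetical, and finding a node only shows that it was passed in; it does not prove that EGL or GLX rendering actually works.

```python
import glob
import os


def find_gpu_device_nodes(dev_root: str = "/dev") -> dict:
    """List device nodes matching the two patterns mentioned above."""
    return {
        # Nodes used by the NVIDIA driver (EGLStreams path in the comment above).
        "nvidia": sorted(glob.glob(os.path.join(dev_root, "nvidia*"))),
        # DRM card nodes (GBM path in the comment above).
        "dri_cards": sorted(glob.glob(os.path.join(dev_root, "dri", "card*"))),
    }


if __name__ == "__main__":
    for kind, paths in find_gpu_device_nodes().items():
        print(kind, paths if paths else "none found")
```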

GLX, in principle, requires an X11 server. What most people used to do is pass the /tmp/.X11-unix sockets from the host X11 server into the container, but this breaks container isolation. Since version 3.0, however, VirtualGL has added functionality.

https://github.com/selkies-project/docker-nvidia-egl-desktop and https://github.com/selkies-project/docker-nvidia-glx-desktop are two different kinds of solutions to enable GLX inside isolated containers; one uses the EGL to GLX translator introduced newly in VirtualGL 3.0, allowing virtual X servers to support GLX through EGL without an X11 server loaded with NVIDIA drivers, and the other adds required NVIDIA driver libraries inside the containers and uses workarounds to run a full Xorg server inside a container.

So, to answer your question, VirtualGL exists to support GLX without a full Xorg server, and you are correct that you may directly use EGL instead of GLX for Chrome through ANGLE. Thus, VirtualGL doesn't have to be a strict requirement.

Vulkan is supported without additional intervention as long as NVIDIA_DRIVER_CAPABILITIES includes display or is set to all, and the correct Vulkan ICD file is inside the container.

There is, however, no reliable open-source way to transport raw Vulkan commands over the network to different hosts, which was what VirtualGL was meant to do for GLX.
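The two Vulkan preconditions named above (the `NVIDIA_DRIVER_CAPABILITIES` value and the presence of an ICD file) can be checked with a short Python sketch. The directories in `ICD_DIRS` are an assumption: they are common Vulkan loader search locations on Linux, not paths taken from this thread, and the function names are hypothetical.

```python
import glob
import os

# Assumption: common Vulkan loader search directories; an image may use others.
ICD_DIRS = ("/etc/vulkan/icd.d", "/usr/share/vulkan/icd.d")


def capabilities_allow_vulkan(value: str) -> bool:
    """True if the comma-separated value is 'all' or includes 'display'."""
    caps = {c.strip().lower() for c in value.split(",") if c.strip()}
    return "all" in caps or "display" in caps


def find_icd_files(dirs=ICD_DIRS) -> list:
    """Collect ICD manifest (*.json) files from the given directories."""
    found = []
    for d in dirs:
        found.extend(sorted(glob.glob(os.path.join(d, "*.json"))))
    return found


if __name__ == "__main__":
    caps = os.environ.get("NVIDIA_DRIVER_CAPABILITIES", "")
    print("capabilities ok:", capabilities_allow_vulkan(caps))
    print("ICD files:", find_icd_files() or "none found")
```

This only checks that an ICD manifest exists; whether it is the correct one for the installed driver still has to be verified by hand.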

chrisprobst commented 8 months ago

@m1k1o @ehfd Thank you very much for this detailed background information. I had to read it multiple times, but I think I've got it now. This EGL/GLX distinction was indeed new to me. It explains the previous use of VGL and also suggests that Chrome (ANGLE on EGL) might work without it. I believe there is even ANGLE on Vulkan nowadays, which makes me think that maybe even EGL might not be necessary. I'll dig a bit in the next few weeks. When I find something, I'll let you know. Luckily, it currently overlaps a bit with my day job, so I might be able to spend some time on it during work. Cheers!

ehfd commented 8 months ago

Thanks for your interest in this problem. It seems that provisioning /dev/shm is pretty important as well.

@chrisprobst
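A quick way to see how much shared memory a container was given is to inspect `/dev/shm` directly. The Python sketch below does that; the 1 GiB threshold and the function names are assumptions for illustration only, since the thread does not state how much Chrome needs.

```python
import shutil

# Assumption: 1 GiB is an illustrative threshold, not a figure from this thread.
MIN_SHM_BYTES = 1 << 30


def shm_total_bytes(path: str = "/dev/shm") -> int:
    """Total size in bytes of the filesystem mounted at path."""
    return shutil.disk_usage(path).total


def shm_is_large_enough(total_bytes: int, minimum: int = MIN_SHM_BYTES) -> bool:
    """Compare a measured size against the chosen minimum."""
    return total_bytes >= minimum


if __name__ == "__main__":
    try:
        total = shm_total_bytes()
    except OSError as exc:
        print("could not inspect /dev/shm:", exc)
    else:
        verdict = "ok" if shm_is_large_enough(total) else "probably too small"
        print(f"/dev/shm: {total / 2**20:.0f} MiB, {verdict}")
```

Docker gives containers 64 MB of `/dev/shm` by default; the `--shm-size` option of `docker run` raises it.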

chrisprobst commented 8 months ago

@ehfd Thanks! I will take this into account. I believe you can disable shm use in Chrome on Linux, but it may have bad side effects, so I'll try to keep it enabled. I might also reach out here if I hit walls.

m1k1o commented 1 week ago

Google does not host pre-112 versions anymore, so the latest build uses the latest Chrome and is therefore not working. Afaik it's still not fixed.