NVIDIA / egl-wayland

The EGLStream-based Wayland external platform
MIT License

EGL_NO_SURFACE can corrupt a context in EGL_OPENGL_API mode #48

Closed kkartaltepe closed 5 months ago

kkartaltepe commented 2 years ago

I was looking into mpv's failure to render on the egl wayland backend, and after some back and forth it appears this is a driver bug. Since this project implements that backend I presume this is the right place to report it. If not, feel free to redirect me elsewhere.

On the 495 driver with egl-wayland 1.1.8 and later (including master), the nvidia drivers can enter a state where eglMakeCurrent with EGL_NO_SURFACE renders a context unusable for wayland windows (they almost always render nothing, 1/100 times they will render uninitialized vram, and 1/100 times they will render the surface you made current). The motivating example of this bug is mpv --no-config --opengl-es=no --vo=gpu --gpu-context=wayland big_buck_bunny.mp4 which fails to display on gnome-shell, kwin_wayland, and sway. However, switching the rendering api to gles via --opengl-es=yes results in a perfectly displayed window.

However, mpv is a large program and also happens to have some other exciting bugs, so we have distilled a minimal example of this error. You can find the source at https://github.com/kkartaltepe/wayland-egl-simple/tree/mpv-example. It is a minimal window rendered with egl that displays the same issues as mpv. The comments in main.c explain how to compile it, and you can run it with ./demo to observe the desktop gl issue and ./demo gles to see how changing the bound api results in different behaviour. The bug appears to be triggered by https://github.com/kkartaltepe/wayland-egl-simple/blob/mpv-example/main.c#L229

Personally I have been testing this with nested compositors. From an X11 session you can run gnome (41.1) via gnome-shell --nested --wayland or for kde (5.23.3) kwin_wayland --no-lockscreen. Prior to version 1.1.8 eglCreatePlatformWindowSurface fails for this minimal demo.

erik-kz commented 2 years ago

MPV is the third application in the past few months to run into this issue. To be brief, it's an unresolved ambiguity in the EGL spec what the GL_DRAW_BUFFER should be after making EGL_NO_SURFACE current to a context and then later making a real surface current to the same context. With NVIDIA, the answer is GL_NONE, which differs from Mesa. Note that this only applies to OpenGL; the behavior is well-defined for OpenGL ES.

Both Firefox https://hg.mozilla.org/mozilla-central/rev/c2191ee9cb65 and Kwin https://invent.kde.org/plasma/kwin/-/commit/d257850bd1815aae4d985fb0e24f8f10851c42da have implemented workarounds, but I'm starting to think we should try to get this cleared up properly. However, if you do want a quick fix, something similar to those two patches would probably be the easiest option.
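
For reference, here is a minimal sketch of what an application-side workaround along these lines could look like. It assumes the fix is simply to set the draw buffer explicitly after a real surface has been made current on a desktop OpenGL context; it is not copied from either patch. The helper name reset_draw_buffer_after_bind, the SKETCH_GL_BACK macro, and the recording stub are all made up for illustration. The GL entry point is passed in as a function pointer so the sketch compiles without GL headers; on Linux a real program would pass glDrawBuffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Value of GL_BACK from the OpenGL headers, repeated here so the sketch
 * builds without <GL/gl.h>. A single-buffered surface would want GL_FRONT. */
#define SKETCH_GL_BACK 0x0405u

/* Same shape as glDrawBuffer(GLenum); GLenum is an unsigned int. */
typedef void (*draw_buffer_fn)(unsigned int mode);

/*
 * Hypothetical helper, meant to be called right after eglMakeCurrent()
 * has succeeded. If the bound API is desktop OpenGL and a real surface
 * is now current, it sets the draw buffer explicitly instead of relying
 * on whatever value the context kept from an earlier surfaceless bind.
 * Returns true if it issued the call.
 */
static bool reset_draw_buffer_after_bind(bool desktop_gl,
                                         bool surface_bound,
                                         draw_buffer_fn draw_buffer)
{
    assert(draw_buffer != NULL);

    /* OpenGL ES is not affected, and with EGL_NO_SURFACE there is no
     * back buffer to select. */
    if (!desktop_gl || !surface_bound)
        return false;

    draw_buffer(SKETCH_GL_BACK);
    return true;
}

/* Recording stub so the helper can be exercised without a GL context. */
static unsigned int last_mode;
static int call_count;

static void record_draw_buffer(unsigned int mode)
{
    last_mode = mode;
    call_count++;
}
```

In a real program the call site would be roughly: make the surface current with eglMakeCurrent(dpy, surf, surf, ctx), check that it returned EGL_TRUE, then call reset_draw_buffer_after_bind(true, surf != EGL_NO_SURFACE, glDrawBuffer).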

cubanismo commented 2 years ago

While there's been much debate about this, I don't believe the spec is actually ambiguous. I think the Mesa behavior actually deviates from the spec, while the NV behavior is technically correct. See the analysis in issues 3 and 4 in the below extension specification for pointers to relevant core specification language:

https://www.khronos.org/registry/EGL/extensions/KHR/EGL_KHR_no_config_context.txt

kkartaltepe commented 2 years ago

I see, thank you for the informative response. I have updated my sample with an X11 demo that appears to demonstrate the same behavior. So it seems the nvidia driver is consistent across X11 and wayland in this regard, but mpv (and other applications) appear less likely to make current with no surface in X11.

Tank-Missile commented 2 years ago

Is this bug relevant to certain games rendering completely black such as Team Fortress 2, which uses OpenGL? I can't find any information on certain xwayland apps rendering black on the NVIDIA proprietary driver anywhere else. I want to make sure I'm on the right track to finding a solution, or helping to create one.

erik-kz commented 5 months ago

> Is this bug relevant to certain games rendering completely black such as Team Fortress 2, which uses OpenGL? I can't find any information on certain xwayland apps rendering black on the NVIDIA proprietary driver anywhere else. I want to make sure I'm on the right track to finding a solution, or helping to create one.

Probably not. The Source engine uses GLX, which is not affected by this.