Open belegdol opened 2 months ago
The issue with that protocol error on Vulkan is a fix we have internally which will be in a future release. I can't reproduce any issues on GL native wayland or Xwayland (GL and Vulkan) however.
> The issue with that protocol error on Vulkan is a fix we have internally which will be in a future release.
Great to hear, thanks for looking into this!
> I can't reproduce any issues on GL native wayland or Xwayland (GL and Vulkan) however.
Are you saying that the examples app starts normally? May I ask how you are starting it? On Fedora, the SDL wayland videodriver is the default, so my exact command lines are:
WAYLAND_DEBUG=1 SDL_VIDEODRIVER=x11 ../../.build/linux64_gcc/bin/examplesDebug --gl
WAYLAND_DEBUG=1 SDL_VIDEODRIVER=x11 ../../.build/linux64_gcc/bin/examplesDebug
WAYLAND_DEBUG=1 SDL_VIDEODRIVER=wayland ../../.build/linux64_gcc/bin/examplesDebug --gl
WAYLAND_DEBUG=1 SDL_VIDEODRIVER=wayland ../../.build/linux64_gcc/bin/examplesDebug
With wayland and GL, it appears that the examples app does not really crash, but rather shuts down without displaying anything. I have now re-tested with bgfx e4641029, egl-wayland-1.1.17-2.20240828git2d5ecff, egl-x11-0.1-1.20240828git2be2296 and nvidia driver from RPM Fusion master using system egl-x11, and I can still reproduce the problem(s).
My GPU is an RTX 2070 and I am running the proprietary kernel module in case this matters.
Those are pretty much the exact commands I used, except my build ended up being a release build so the binary was examplesRelease. I saw the window start, it said it was the hello world demo, and it let me pick things and play around with the GPU stats, etc. Seemed like it was working fine.
I am testing with the debug build. I will try release later and report back.
I also get crashes using the release build. Strange. Could this be GPU-specific? Or caused by using the proprietary kernel driver?
Here is my eglinfo output:
$ eglinfo -B
GBM platform:
EGL API version: 1.5
EGL vendor string: NVIDIA
EGL version string: 1.5
EGL client APIs: OpenGL_ES OpenGL
OpenGL core profile vendor: NVIDIA Corporation
OpenGL core profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL core profile version: 4.6.0 NVIDIA 560.35.03
OpenGL core profile shading language version: 4.60 NVIDIA
OpenGL compatibility profile vendor: NVIDIA Corporation
OpenGL compatibility profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL compatibility profile version: 4.6.0 NVIDIA 560.35.03
OpenGL compatibility profile shading language version: 4.60 NVIDIA
OpenGL ES profile vendor: NVIDIA Corporation
OpenGL ES profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL ES profile version: OpenGL ES 3.2 NVIDIA 560.35.03
OpenGL ES profile shading language version: OpenGL ES GLSL ES 3.20
Wayland platform:
EGL API version: 1.5
EGL vendor string: NVIDIA
EGL version string: 1.5
EGL client APIs: OpenGL_ES OpenGL
OpenGL core profile vendor: NVIDIA Corporation
OpenGL core profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL core profile version: 4.6.0 NVIDIA 560.35.03
OpenGL core profile shading language version: 4.60 NVIDIA
OpenGL compatibility profile vendor: NVIDIA Corporation
OpenGL compatibility profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL compatibility profile version: 4.6.0 NVIDIA 560.35.03
OpenGL compatibility profile shading language version: 4.60 NVIDIA
OpenGL ES profile vendor: NVIDIA Corporation
OpenGL ES profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL ES profile version: OpenGL ES 3.2 NVIDIA 560.35.03
OpenGL ES profile shading language version: OpenGL ES GLSL ES 3.20
X11 platform:
EGL API version: 1.5
EGL vendor string: NVIDIA
EGL version string: 1.5
EGL client APIs: OpenGL_ES OpenGL
OpenGL core profile vendor: NVIDIA Corporation
OpenGL core profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL core profile version: 4.6.0 NVIDIA 560.35.03
OpenGL core profile shading language version: 4.60 NVIDIA
OpenGL compatibility profile vendor: NVIDIA Corporation
OpenGL compatibility profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL compatibility profile version: 4.6.0 NVIDIA 560.35.03
OpenGL compatibility profile shading language version: 4.60 NVIDIA
OpenGL ES profile vendor: NVIDIA Corporation
OpenGL ES profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL ES profile version: OpenGL ES 3.2 NVIDIA 560.35.03
OpenGL ES profile shading language version: OpenGL ES GLSL ES 3.20
Surfaceless platform:
EGL API version: 1.5
EGL vendor string: NVIDIA
EGL version string: 1.5
EGL client APIs: OpenGL_ES OpenGL
OpenGL core profile vendor: NVIDIA Corporation
OpenGL core profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL core profile version: 4.6.0 NVIDIA 560.35.03
OpenGL core profile shading language version: 4.60 NVIDIA
OpenGL compatibility profile vendor: NVIDIA Corporation
OpenGL compatibility profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL compatibility profile version: 4.6.0 NVIDIA 560.35.03
OpenGL compatibility profile shading language version: 4.60 NVIDIA
OpenGL ES profile vendor: NVIDIA Corporation
OpenGL ES profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL ES profile version: OpenGL ES 3.2 NVIDIA 560.35.03
OpenGL ES profile shading language version: OpenGL ES GLSL ES 3.20
Device platform:
Device #0:
Platform Device platform:
EGL API version: 1.5
EGL vendor string: NVIDIA
EGL version string: 1.5
EGL client APIs: OpenGL_ES OpenGL
OpenGL core profile vendor: NVIDIA Corporation
OpenGL core profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL core profile version: 4.6.0 NVIDIA 560.35.03
OpenGL core profile shading language version: 4.60 NVIDIA
OpenGL compatibility profile vendor: NVIDIA Corporation
OpenGL compatibility profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL compatibility profile version: 4.6.0 NVIDIA 560.35.03
OpenGL compatibility profile shading language version: 4.60 NVIDIA
OpenGL ES profile vendor: NVIDIA Corporation
OpenGL ES profile renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
OpenGL ES profile version: OpenGL ES 3.2 NVIDIA 560.35.03
OpenGL ES profile shading language version: OpenGL ES GLSL ES 3.20
Device #1:
Platform Device platform:
libEGL warning: egl: failed to create dri2 screen
libEGL warning: egl: failed to create dri2 screen
libEGL warning: egl: failed to create dri2 screen
eglinfo: eglInitialize failed
Device #2:
Platform Device platform:
EGL API version: 1.5
EGL vendor string: Mesa Project
EGL version string: 1.5
EGL client APIs: OpenGL OpenGL_ES
OpenGL core profile vendor: Mesa
OpenGL core profile renderer: llvmpipe (LLVM 18.1.6, 256 bits)
OpenGL core profile version: 4.5 (Core Profile) Mesa 24.1.7
OpenGL core profile shading language version: 4.50
OpenGL compatibility profile vendor: Mesa
OpenGL compatibility profile renderer: llvmpipe (LLVM 18.1.6, 256 bits)
OpenGL compatibility profile version: 4.5 (Compatibility Profile) Mesa 24.1.7
OpenGL compatibility profile shading language version: 4.50
OpenGL ES profile vendor: Mesa
OpenGL ES profile renderer: llvmpipe (LLVM 18.1.6, 256 bits)
OpenGL ES profile version: OpenGL ES 3.2 Mesa 24.1.7
OpenGL ES profile shading language version: OpenGL ES GLSL ES 3.20
> I also get crashes using the release build. Strange. Could this be GPU-specific? Or caused by using the proprietary kernel driver?
And you don't get crashes with the debug build? If yes, this usually points to a race condition depending on tight timing.
I get crashes in both. I was referring to @amshafer not seeing issues with GL on wayland or any crashes with xwayland.
> The issue with that protocol error on Vulkan is a fix we have internally which will be in a future release. I can't reproduce any issues on GL native wayland or Xwayland (GL and Vulkan) however.
Is this the protocol error you are referring to: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/issues/211#note_2576891 ?
565.57.01 does not fix the issue unfortunately.
There have been some Wayland fixes in bgfx recently. With current git (https://github.com/bkaradzic/bgfx/commit/4bc652939ff400e424e17185d23b229a37d269e1) and 565.57.01 nvidia driver there is an improvement:
__NV_DISABLE_EXPLICIT_SYNC=1
__NV_DISABLE_EXPLICIT_SYNC=1
defined and shuts down without it__NV_DISABLE_EXPLICIT_SYNC=1
I will be posting logs in separate comments, but the error "DRM Syncobj surface object already created for surface 42" appears to be the common issue among the different configurations.
Are you sure bgfx is calling eglDestroySurface properly? I don't see any calls to wp_linux_drm_syncobj_surface_v1_destroy here (which we do in wlEglDestroySurface in egl-wayland to clean up) so it doesn't seem to be cleaning any surfaces up. If it creates a second EGLSurface for that wl_surface without destroying the first EGLSurface it would cause a bug like this.
That would also explain why this still happens on 565 with Vulkan, since the fix I mentioned previously is in that release.
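To make the failure mode described above concrete, here is a minimal toy model in C of the one-object-per-surface rule that the "already created for surface 42" error points at. It is a sketch under assumptions, not code from egl-wayland, bgfx, or any compositor: `toy_syncobj_manager`, `toy_get_surface`, and `toy_destroy_surface_object` are hypothetical names, and surfaces are reduced to small integer ids.

```c
/* Toy model only. NOT egl-wayland or compositor code; all names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_SURFACES 64

/* One flag per surface id: does a syncobj surface object currently exist for it? */
struct toy_syncobj_manager {
    bool has_syncobj_surface[TOY_MAX_SURFACES];
};

enum toy_result {
    TOY_OK = 0,
    TOY_ERR_SURFACE_EXISTS = 1, /* stands in for the protocol error seen in the logs */
    TOY_ERR_BAD_ID = 2
};

/* Models requesting a syncobj surface object for a surface.
 * A second request for the same surface fails unless the first object was destroyed. */
static enum toy_result toy_get_surface(struct toy_syncobj_manager *m, unsigned surface_id)
{
    assert(m != NULL);
    if (surface_id >= TOY_MAX_SURFACES)
        return TOY_ERR_BAD_ID;
    if (m->has_syncobj_surface[surface_id])
        return TOY_ERR_SURFACE_EXISTS;
    m->has_syncobj_surface[surface_id] = true;
    return TOY_OK;
}

/* Models the cleanup step that should run when the EGLSurface is destroyed. */
static void toy_destroy_surface_object(struct toy_syncobj_manager *m, unsigned surface_id)
{
    assert(m != NULL);
    if (surface_id < TOY_MAX_SURFACES)
        m->has_syncobj_surface[surface_id] = false;
}
```

With a zero-initialized manager, calling `toy_get_surface` for surface 42 twice in a row returns `TOY_ERR_SURFACE_EXISTS` the second time, while a `toy_destroy_surface_object` call in between lets the second request succeed. That is the pattern to look for in the logs: a second create for the same surface with no destroy in between.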
I have relayed your comment to bgfx developers as I cannot answer the question myself. I will report back if I get an answer.
> Are you sure bgfx is calling eglDestroySurface properly? I don't see any calls to wp_linux_drm_syncobj_surface_v1_destroy here (which we do in wlEglDestroySurface in egl-wayland to clean up) so it doesn't seem to be cleaning any surfaces up. If it creates a second EGLSurface for that wl_surface without destroying the first EGLSurface it would cause a bug like this.
>
> That would also explain why this still happens on 565 with Vulkan, since the fix I mentioned previously is in that release.
If egl-wayland even allows more than one EGLSurface to be created at the same time from the same wl_surface, then that would be a bug in and of itself. From the EGL spec:
If there is already an EGLSurface associated with native window (as a result of a previous eglCreatePlatformWindowSurface call), then an EGL_BAD_ALLOC error is generated.
I think it does check for that, though, since it checks and assigns to the wl_egl_window::driver_private pointer.
> I think it does check for that, though, since it checks and assigns to the wl_egl_window::driver_private pointer.
Now that I think about it, checking for a duplicate wl_egl_window isn't enough: an app could call wl_egl_window_create more than once, which means it could have more than one wl_egl_window that points to the same wl_surface.
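The difference between the two checks can be sketched with a small toy model in C. This is an illustration under assumptions, not egl-wayland's implementation: `toy_surface` and `toy_egl_window` are hypothetical stand-ins for wl_surface and wl_egl_window, the `egl_surface_owner` field is invented for the sketch, and a `false` return stands in for EGL_BAD_ALLOC.

```c
/* Toy model only. NOT egl-wayland code; all types and functions are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for wl_surface; egl_surface_owner is an invented tracking field. */
struct toy_surface {
    void *egl_surface_owner;
};

/* Stand-in for wl_egl_window: points at a surface and carries a driver_private slot. */
struct toy_egl_window {
    struct toy_surface *surface;
    void *driver_private;
};

/* Per-window check: only notices a second EGLSurface on the SAME window object. */
static bool toy_create_checking_window(struct toy_egl_window *w, void *new_egl_surface)
{
    assert(w != NULL && new_egl_surface != NULL);
    if (w->driver_private != NULL)
        return false; /* would be EGL_BAD_ALLOC */
    w->driver_private = new_egl_surface;
    return true;
}

/* Per-surface check: also notices a second window wrapping the same surface. */
static bool toy_create_checking_surface(struct toy_egl_window *w, void *new_egl_surface)
{
    assert(w != NULL && w->surface != NULL && new_egl_surface != NULL);
    if (w->driver_private != NULL || w->surface->egl_surface_owner != NULL)
        return false; /* would be EGL_BAD_ALLOC */
    w->driver_private = new_egl_surface;
    w->surface->egl_surface_owner = new_egl_surface;
    return true;
}
```

If two `toy_egl_window` values share one `toy_surface`, `toy_create_checking_window` accepts both creations, whereas `toy_create_checking_surface` rejects the second one. Tracking ownership per surface rather than per window is what closes the gap described above.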
Posting here since there is a lot of other traffic on nvidia forums [0].
On Fedora 40 x86_64 with nvidia driver 560.35.03 and egl-wayland 1.1.16, attempting to start bgfx examples under wayland crashes regardless of the renderer or the SDL videodriver. In order to reproduce:
cd examples/runtime
../../.build/linux64_gcc/bin/examplesDebug
Defining __NV_DISABLE_EXPLICIT_SYNC=1 allows both Vulkan and OpenGL to work. Using the x11 SDL videodriver is also not working, but likely due to different reasons [2]. I have also reported this to bgfx [3] given that the recent explicit-sync-related Firefox crashes required fixes both in Firefox and in egl-wayland.
With WAYLAND_DEBUG=1 the following error can be observed:
[0] https://forums.developer.nvidia.com/t/explicit-sync-causes-bgfx-examples-to-crash-when-running-in-wayland-session/304484
[1] Building — bgfx 1.127.8709 documentation
[2] https://forums.developer.nvidia.com/t/hardware-egl-not-working-on-wayland-libegl-warning-egl-failed-to-create-dri2-screen/262167
[3] Crashes on Wayland with nvidia driver 560.35.03 · Issue #3342 · bkaradzic/bgfx · GitHub