NVIDIA / egl-wayland

The EGLStream-based Wayland external platform
MIT License

[BUG] Ver 1.1.8-1 breaks some GTK3/4 apps (Gnome ToDo, Gnome Maps, Cheese) - On Hybrid Graphics configurations #41

Closed: brogers-propstream closed this issue 2 years ago

brogers-propstream commented 3 years ago

Expected behavior:

Launching Gnome ToDo, Gnome Maps, or Cheese works and the app is displayed.

Actual behavior:

The apps are never launched

Debug Logs:

[3430410.735] wl_display@1.delete_id(36)
[3430410.749] wl_display@1.delete_id(37)
[3430410.753] wl_display@1.error(nil, 7, "failed to import supplied dmabufs: Arguments are inconsistent (for example, a valid context requires buffers not supplied by a ")
13:01:41.382673                           Gdk:    DEBUG: [destroyed object]: error 7: failed to import supplied dmabufs: Arguments are inconsistent (for example, a valid context requires buffers not supplied by a 

Gdk-Message: 13:01:41.382: Error flushing display: Protocol error
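Traces in this format are what libwayland prints to stderr when WAYLAND_DEBUG=1 is set for the client. Below is a minimal sketch for pulling only the fatal protocol errors out of such a trace. The helper name wayland_protocol_errors is hypothetical, and the sketch assumes the error arrives on wl_display object 1, as it does in the log above.

```shell
# Hypothetical helper: print only protocol-error events from a WAYLAND_DEBUG trace.
# Reads the files given as arguments, or stdin when no file is given.
wayland_protocol_errors() {
    # -F matches the text literally, so "(" and "." need no escaping.
    grep -F 'wl_display@1.error(' "$@"
}
```

One possible use is `WAYLAND_DEBUG=1 gnome-maps 2>&1 | wayland_protocol_errors`, which keeps the "failed to import supplied dmabufs" line and drops the rest of the trace.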

Workaround:

Rolling back to ver 1.1.7-1 resolved the issue entirely.

cubanismo commented 3 years ago

Which GPU(s) were used in this testing?

brogers-propstream commented 3 years ago

@cubanismo - Ah should've remembered that, my apologies.

I'm running a PRIME GPU setup.

Primary is my Ryzen 9 5900HS CPU's Radeon graphics. Secondary is my Nvidia 3070 Max-Q GPU.

I'm launching all apps in "non-prime" mode - meaning they should be using the integrated Radeon GPU.

However, running in "prime" mode didn't change anything either. Still fails to launch with an identical error.

My OS is EndeavourOS/Arch and I'm fully up-to-date as of a few minutes ago.

brogers-propstream commented 3 years ago

Per this reply - https://github.com/NVIDIA/egl-wayland/issues/40#issuecomment-930539832

I updated the description to suggest that this is specific to hybrid graphics.

erik-kz commented 3 years ago

For the non-prime case, I wonder if it's still trying to use our EGL library instead of Mesa's. Could you try running one of those applications with the environment variable __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json set?
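A minimal sketch of how the suggestion above could be applied to a single command without changing the rest of the session. The function name run_with_mesa_egl and the MESA_EGL_VENDOR_JSON override are hypothetical; the JSON path is the one quoted in this thread and may differ on other distributions.

```shell
# Path quoted in the thread; override MESA_EGL_VENDOR_JSON if your distro differs.
MESA_EGL_VENDOR_JSON="${MESA_EGL_VENDOR_JSON:-/usr/share/glvnd/egl_vendor.d/50_mesa.json}"

# Hypothetical helper: run one command with GLVND limited to Mesa's EGL vendor file.
run_with_mesa_egl() {
    if [ ! -r "$MESA_EGL_VENDOR_JSON" ]; then
        # Only a warning: the command is still started so the failure stays visible.
        echo "warning: $MESA_EGL_VENDOR_JSON is not readable" >&2
    fi
    # env sets the variable for this one process only.
    env __EGL_VENDOR_LIBRARY_FILENAMES="$MESA_EGL_VENDOR_JSON" "$@"
}
```

For example, `run_with_mesa_egl gnome-maps` starts the app with the variable set, while other programs started from the same shell are unaffected.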

adamant-pwn commented 3 years ago

My setup is

egl-wayland version: 1.1.8-1
CPU: AMD Ryzen 7 5800H with Radeon Graphics (16) @ 3.200GHz 
GPU: NVIDIA GeForce RTX 3070 Mobile / Max-Q
GPU driver: NVIDIA 470.74

I had a similar problem and opened this issue on GNOME-gtk. Using environment variable __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json fixed everything.

cubanismo commented 3 years ago

I have a decent guess as to why this regressed now. Previously the egl-wayland code would only create displays for compositors that supported the EGLStreams Wayland protocol extensions. In practice, this would fail for non-NV-driven Wayland compositors, and GLVND would presumably fall back to creating an EGLDisplay on some Mesa driver on such compositors. Now the code thinks it can render to any dmabuf-capable compositor, but in practice it only supports NV device-local buffers using NV-specific tiling formats, so it happily reports support for the compositor but fails to present to it. A couple of things to fix here:

- As of version 1.1.9, egl-wayland supports the wl_drm protocol on the compositor side, so we should be able to query the compositor for its DRM device and fail the display creation if binding that protocol fails (non-NV device on the compositor side in that case) or if the device returned isn't an NV device.

- In the driver code, we should better handle surface creation such that it works in PRIME configs by creating non-device-local linear buffers that other devices can import if we don't see compatible tiled DRM format modifiers. Also, even if there are compatible format modifiers, a tiled non-device-local buffer should be used when the device doesn't match. (egl-wayland always runs on the first EGLDevice. It should probably use some better heuristic, e.g., pick the wl_drm-suggested device or use whatever comes out of https://gitlab.freedesktop.org/wayland/wayland/-/issues/59 / https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/8, but device mismatches can still occur.) Unfortunately, there's not a great way to express the latter in the EGLStream EGLImage consumer/surface producer API right now, nor in the GBM API (it needs a buffer constraints mechanism), so it might mean falling back to linear even for NV->NV GPU cases for the time being.
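The device check described in the first item happens inside egl-wayland through wl_drm, not in a script. As a rough user-side illustration of the same question ("is this DRM device an NV device?"), the sketch below reads the kernel driver bound to a DRM node from sysfs. The helper names and the SYSFS_DRM_DIR override are hypothetical, and it assumes the proprietary module registers under the driver name "nvidia".

```shell
# Standard sysfs location for DRM nodes; the override exists only for experimentation.
SYSFS_DRM_DIR="${SYSFS_DRM_DIR:-/sys/class/drm}"

# Hypothetical helper: print the kernel driver behind a node such as /dev/dri/renderD128.
drm_node_driver() {
    node_name=$(basename "$1")
    driver_link="$SYSFS_DRM_DIR/$node_name/device/driver"
    [ -e "$driver_link" ] || return 1
    # The "driver" entry is a symlink; its target's last component is the driver name.
    basename "$(readlink -f "$driver_link")"
}

# Succeeds only when the node is driven by the NVIDIA kernel module (assumed name).
is_nvidia_node() {
    [ "$(drm_node_driver "$1")" = "nvidia" ]
}
```

On a hybrid laptop, `for n in /dev/dri/renderD*; do echo "$n: $(drm_node_driver "$n")"; done` would be expected to list one node per GPU, which helps confirm which device the compositor and the client are each likely to use.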

cubanismo commented 3 years ago

I'll file internal NV bugs 3391989 (OSS egl-wayland fixes for the basic regression) and 3391982 (Actual cross-GPU/PRIME support in internal components) to track fixing this issue.

erik-kz commented 2 years ago

The regression for the non-PRIME use-case should be fixed by https://github.com/NVIDIA/egl-wayland/commit/d4937adc5cd04ac7df98fc5616e40319fb52fdee

As James mentioned, support for the PRIME use-case will require driver-side changes. We're working on this, but can't provide an ETA yet.

ReillyBrogan commented 2 years ago

@erik-kz I just tried to compile master to test that commit and got an error that wayland-drm-client-protocol.h was missing. Was this perhaps omitted from the push, or is this an additional external lib that needs to be pulled in/updated?

bertin0 commented 2 years ago

I just tried the changes on master (commit daab854) and I still need __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json for alacritty to work. I think it fixed running firefox through xwayland though, so maybe there is some issue with alacritty being written in rust too.

myyc commented 2 years ago

Sorry to piggyback on this, but it's also a problem on my system, whose CPU has no integrated graphics (Ryzen 5 5600X, NVIDIA 3060). The __EGL_VENDOR_LIBRARY_FILENAMES workaround fixes the affected apps (which for some reason aren't the same as the reporter's).

dylanmtaylor commented 2 years ago

The __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json workaround is needed for me when launching gnome-boxes or amazon-workspaces.

VarLad commented 2 years ago

One question: considering that they'll fix this on the driver side, would that be released as a minor driver update, or would it be another major driver release after a few months?

cc: @cubanismo @erik-kz

cubanismo commented 2 years ago

As Erik mentioned above, we don't have any estimates for the driver-side fixes at the moment.

amardhruva commented 2 years ago

Also breaks mpv on Ubuntu 22.04 beta. Hope the fixed driver is released soon.

amardhruva commented 2 years ago

Hey @cubanismo, is there any update on the driver-side fix?

aisivan36 commented 2 years ago

The GNOME Totem video player is broken, and Cheese as well as GNOME extensions cannot be opened on Optimus Wayland. Is there any fix for this?

gzmorell commented 2 years ago

Version 1.1.9-1.1 on ubuntu 22.04 still has this problem. gnome-maps:

[2472783.925] wl_display@1.error(nil, 7, "failed to import supplied dmabufs: Arguments are inconsistent (for example, a valid context requires buffers not supplied by a ")
[2472783.901] wl_display@1.delete_id(56)
Gdk-Message: 09:55:07.113: Error 71 (Protocol error) dispatching to Wayland display.

cheese:

[2641326.949] wl_display@1.delete_id(51)
[2641326.981] wl_buffer@49.release()
[2641327.131] wl_display@1.error(nil, 7, "failed to import supplied dmabufs: Arguments are inconsistent (for example, a valid context requires buffers not supplied by a ")
Gdk-Message: 09:57:55.654: Error reading events from display: Protocol error

gnome-todo works without error.

bellini666 commented 2 years ago

This issue is also present in Debian Sid as of writing this comment. Had this issue running gnome-console (kgx).

Running it with __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json kgx fixes everything.

davuses commented 2 years ago

Can confirm on Ubuntu 22; the solution is to set the variable __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json

erik-kz commented 2 years ago

This issue is fixed in version 1.1.10 and later. For earlier versions, yes, I suppose setting __EGL_VENDOR_LIBRARY_FILENAMES is the easiest workaround.
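Going by the reports in this thread (1.1.7 worked, 1.1.8 and 1.1.9 failed, 1.1.10 is reported fixed), a small sketch for deciding whether a given package version falls in the affected window might look like this. Both function names are hypothetical, the comparison relies on GNU sort -V, and the window reflects only what commenters reported here.

```shell
# Succeeds when version $1 is at least version $2, using GNU sort -V ordering
# (a plain string comparison would wrongly rank 1.1.9 above 1.1.10).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: is this egl-wayland package version in the reported window?
needs_mesa_workaround() {
    upstream=${1%%-*}   # drop a packaging suffix such as "-1.1"
    version_ge "$upstream" "1.1.8" && ! version_ge "$upstream" "1.1.10"
}
```

For instance, `needs_mesa_workaround 1.1.9-1.1 && echo "set __EGL_VENDOR_LIBRARY_FILENAMES"` would apply to the Ubuntu 22.04 package mentioned earlier in the thread.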

mirao commented 1 year ago

The workaround works e.g. for app cheese. But can I apply it somehow when I start settings of any Gnome Shell extension (e.g. Clipboard Indicator)?

gnome-shell[5189]: WL: error in client communication (pid 11332)
gjs[11332]: Error reading events from display: Protocol error

It doesn't matter if I start it over desktop Settings or over web settings, it fails with the mentioned error. I'm using Ubuntu 22.04, libnvidia-egl-wayland1 1.1.9-1.1

Schneegans commented 1 year ago

I think you should be able to open the extensions app with this:

export __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json
gjs /usr/share/gnome-shell/org.gnome.Shell.Extensions & gnome-extensions-app

mirao commented 1 year ago

@Schneegans Works well, thank you very much.

TheComputerGuy96 commented 1 year ago

This error still happens with Vulkan applications:

$ prime-run vkcube-wayland 
Selected GPU 0: NVIDIA GeForce GTX 1650 Ti, type: DiscreteGpu 
[destroyed object]: error 7: failed to import supplied dmabufs: Arguments are inconsistent (for example, a valid context requires buffers not supplied by a