NVIDIA / egl-wayland

The EGLStream-based Wayland external platform
MIT License
275 stars 44 forks source link

Implement Explicit Sync #104

Closed amshafer closed 3 months ago

amshafer commented 4 months ago

This implements the explicit sync linux-drm-syncobj-v1 protocol for EGL.

This pull request exists to publicize our EGL implementation of explicit sync that we have been using to test the Mutter and Xwayland implementations. Erik and I have done a good amount of testing with this so it's in a fairly polished state and isn't just a first prototype.

Currently the code as proposed here will not build or run on existing drivers. To build it would need an updated wayland-protocols that contains linux-drm-syncobj, and to run it would require a newer (next upcoming release) NVIDIA driver that has the necessary prerequisites. Whenever this lands the minimum required NVIDIA driver version will also be bumped.

This change finally does away with the background threads which have given us issues in the past. Both the damage and the buffer release threads are disabled when explicit sync is supported.


Most of this change involves wayland-protocol handling boilerplate. The protocol works by allowing the creation of timelines (i.e. DRM syncobjs) and per-surface states. We can then specify acquire and release points during our surface state configuration which will tell the compositor when it can access the buffer or notify us when it is finished accessing the buffer.

Sync point signaling takes place during acquire_surface_image. We choose our two release point values and send them to the compositor. In the acquire case, the EGLSync will be created first and will be populated with a fence representing the GPU work. We can extract its syncfd and import it at the acquire point. We choose the release point during image acquisition, but don't actually create an EGLSync for it until later when the compositor has added a fence to the release point. We check if this has taken place after every surface commit.

Separate timelines are used for the acquire and release points. Each stream image will have its own timeline, otherwise our signaling of acquire points would implicitly signal the still-pending release points.

konradmoesch commented 1 month ago

The beta driver (555.42.02) was released today: https://www.nvidia.com/download/driverResults.aspx/224751/en-us/

cwegener commented 1 month ago

Please note that additional sources of "flickering" in XWayland were addressed as part of the XWayland 24.1 release. No NVidia driver updates needed for that one.

ptr1337 commented 1 month ago

Hi,

Actually user reporting that this patch together with the 555 NVIDIA Driver does cause crashes in several applications, when using in KDE.

Sadly, the user did not provide any special log yet, but following I could see in the journalctl:

May 21 22:11:24 tsuru plasmashell[1756]: [GFX1-]: Wayland protocol error: wp_linux_drm_syncobj_surface_v1@64: error 4: explicit sync is used, but no acquire point is set
May 21 22:11:24 tsuru plasmashell[1756]: ExceptionHandler::GenerateDump cloned child 1947
May 21 22:11:24 tsuru plasmashell[1756]: ExceptionHandler::SendContinueSignalToChild sent continue signal to child
May 21 22:11:24 tsuru plasmashell[1947]: ExceptionHandler::WaitForContinueSignal waiting for continue signal...
May 21 22:11:24 tsuru plasmashell[1861]: Exiting due to channel error.

Full log: https://paste.cachyos.org/p/5e93d55.log

Downgrading the egl-wayland package to an unpatched one, does fix the issue.

Edit: Also, there are bunch of firefox crashes due this patchset.

mrdev023 commented 1 month ago

Explicit sync is used ? 🤔 Is strange because is available for plasma 6.1 in June. Not now, if user want use it now, it require https://invent.kde.org/plasma/kwin/-/merge_requests/5511 this

ptr1337 commented 1 month ago

Explicit sync is used ? 🤔 Is strange because is available for plasma 6.1 in June. Not now, if user want use it now, it require https://invent.kde.org/plasma/kwin/-/merge_requests/5511 this

Yes, we patch it in our kwin. There is a backport patch provided by KDE for plasma 6.0.4

jthoward64 commented 1 month ago

I get that this is a great spot to get a lot of eyes on an issue, but this PR is closed and now shipping. I'm pretty sure there are also a LOT of people subscribed here. Any further discussion should be a new issue or on the forums.

P.S. The KWin patch is not needed for explicit sync support, just for optimal performance and behavior.

valtapaz commented 1 month ago

I get that this is a great spot to get a lot of eyes on an issue, but this PR is closed and now shipping. I'm pretty sure there are also a LOT of people subscribed here. Any further discussion should be a new issue or on the forums.

I agree; might I ask if there's a proper thread on the forums for this? I can see the GRD 555.85 feedback thread, but that doesn't look like the same thing. I think most people (myself included) are just posting here because they don't know where else to look

EDIT: found the discussion thread on the developer forums https://forums.developer.nvidia.com/t/555-release-feedback-discussion/293652