cemu-project / Cemu

Cemu - Wii U emulator
https://cemu.info
Mozilla Public License 2.0

Vulkan: remove present fence and PreparePresentationFrame #1166

Closed goeiecool9999 closed 2 months ago

goeiecool9999 commented 2 months ago

Removing PreparePresentationFrame: PreparePresentationFrame was a hacky workaround because the VulkanRenderer did not adhere to the same interface as the OpenGL renderer. It became redundant at some point during the swapchain refactors. Everything now acquires its own image or reuses a previously acquired one, so there's no need to depart from the generic interface anymore.

Removing the fence: I was reading this article from 2022, where at some point it mentions that "on Intel ANV we run into problems when implementing vkWaitForFences() for the fence from vkAcquireNextImageKHR(). That has to be done via DRM_IOCTL_I915_GEM_WAIT which can't tell the difference between the compositor's work and work which has since been submitted by the client. If you call vkWaitForFences() on such a fence after submitting any client work, it basically ends up being a vkDeviceWaitIdle() which isn't at all what you want."

This issue has likely been fixed on the driver side by now, but it's probably a good idea to get rid of the fence wait anyway in case any drivers run into the same issue. If I understand correctly, the reason vkAcquireNextImageKHR provides the ability to wait for a fence is so applications can synchronize access to the image from the CPU. Since Cemu doesn't touch the swapchain image with the CPU, the fence is unnecessary.

Previously I tried removing the semaphores and keeping the fence, but this caused issues on MoltenVK. I have tested this version without the fence on macOS and there are no issues. In theory the fence and the semaphore synchronize to the same event (image released by the presentation engine), so submitting a command buffer after waiting for the fence should also mean that the commands in that command buffer execute after the image is actually released by the presentation engine, unless the GPU can time travel. It's not surprising that synchronizing this way causes a bug though, since fences aren't normally used to synchronize work on the GPU.

I've also adjusted the timeout value to 1 second so that even with bad drivers the thread will never lock up completely.

Exzap commented 2 months ago

Looks good to me. Thanks!