KhronosGroup / MoltenVK

MoltenVK is a Vulkan Portability implementation. It layers a subset of the high-performance, industry-standard Vulkan graphics and compute API over Apple's Metal graphics framework, enabling Vulkan applications to run on macOS, iOS and tvOS.
Apache License 2.0

Issues with the swap chain in 1.3.275.0 and later versions #2226

Open TheMostDiligent opened 5 months ago

TheMostDiligent commented 5 months ago

Hello!

After updating the Vulkan SDK to the latest version, we started seeing swap chain issues that did not occur before (in versions 1.3.268.1 and earlier).

First, when running the application from Xcode with Metal API Validation enabled, the app triggers a Metal assertion:

validateRenderPassDescriptor:991: failed assertion `RenderPass Descriptor Validation
renderTargetWidth (1600) must be <= minimum attachment width (1).
renderTargetHeight (1200) must be <= minimum attachment height (1).

With Metal API Validation disabled, the Vulkan validation layers report these errors:

Diligent Engine: ERROR: Vulkan debug message (validation): UNASSIGNED-VkPresentInfoKHR-pImageIndices-MissingAcquireWait
                 Validation Error: [ UNASSIGNED-VkPresentInfoKHR-pImageIndices-MissingAcquireWait ] Object 0: handle = 0x808562000000003f, type = VK_OBJECT_TYPE_IMAGE; Object 1: handle = 0x9f9b41000000003c, name = Swap chain image acquired semaphore 2, type = VK_OBJECT_TYPE_SEMAPHORE; Object 2: handle = 0x5c5283000000003e, type = VK_OBJECT_TYPE_FENCE; | MessageID = 0x1b6b9ef2 | vkQueuePresentKHR(): pPresentInfo->pImageIndices[0] was acquired with a semaphore VkSemaphore 0x9f9b41000000003c[Swap chain image acquired semaphore 2] and fence VkFence 0x5c5283000000003e[] and neither of them have since been waited on
                 Object[0] (image): Handle 0x808562000000003f
                 Object[1] (semaphore): Handle 0x9f9b41000000003c, Name: 'Swap chain image acquired semaphore 2'
                 Object[2] (fence): Handle 0x5c5283000000003e

Diligent Engine: ERROR: Vulkan debug message (validation): VUID-vkDestroyFence-fence-01120
                 Validation Error: [ VUID-vkDestroyFence-fence-01120 ] Object 0: handle = 0x5c5283000000003e, type = VK_OBJECT_TYPE_FENCE; | MessageID = 0x5d296248 | vkDestroyFence(): fence (VkFence 0x5c5283000000003e[]) is in use. The Vulkan spec states: All queue submission commands that refer to fence must have completed execution (https://vulkan.lunarg.com/doc/view/1.3.275.0/mac/1.3-extensions/vkspec.html#VUID-vkDestroyFence-fence-01120)
                 Object[0] (fence): Handle 0x5c5283000000003e

The errors do not make a lot of sense: the semaphore is signaled a few lines before it is passed to vkQueuePresentKHR.

Lastly, when resizing the window, the app hangs.

None of these issues happened before the 1.3.275.0 SDK. They also don't happen on any other platform (Windows, Linux, Android) with the same SDK version.

The behavior is the same whether we use the MoltenVK surface or VK_EXT_metal_surface.

Tutorial01_HelloTriangle.zip

billhollings commented 4 months ago

The Tutorial01_HelloTriangle.app runtime you submitted seems to contain a few hard-coded file paths to your environment.

Do you have a buildable version of that app so we can build it here to see what's happening?

TheMostDiligent commented 4 months ago

Do you have a buildable version of that app so we can build it here to see what's happening?

Yes: https://github.com/DiligentGraphics/DiligentEngine

Building should be as simple as:

git clone --recursive https://github.com/DiligentGraphics/DiligentEngine.git
cd DiligentEngine
cmake -S . -B ./build/MacOS -G "Xcode"
open build/MacOS/DiligentEngine.xcodeproj/

TheMostDiligent commented 1 month ago

After investigating the issue, I found the following: the problem happens right after we recreate the swap chain (for example, when the sync mode changes). The first time we acquire an image from the new swap chain, vkAcquireNextImageKHR returns VK_SUBOPTIMAL_KHR. If, however, we recreate the swap chain a second time and acquire the image again, it works OK. @billhollings does this behavior tell you anything? Apparently it is related to some change in 1.3.275.0.
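
For readers following along, the "recreate again if the first acquire is suboptimal" behavior described above can be modeled roughly as below. This is only a sketch under stated assumptions, not the actual Diligent Engine patch: `AcquireResult`, `AcquireOutcome` and `AcquireWithBoundedRecreate` are hypothetical names, the enum is a stand-in for the corresponding `VkResult` values, and the two callbacks stand in for a wrapper around vkAcquireNextImageKHR and the engine's swap chain recreation routine. The recreate budget is an added assumption so the loop always terminates.

```cpp
#include <cassert>
#include <functional>

// Stand-ins for VK_SUCCESS, VK_SUBOPTIMAL_KHR and VK_ERROR_OUT_OF_DATE_KHR
// (hypothetical enum so the sketch compiles without the Vulkan headers).
enum class AcquireResult { Success, Suboptimal, OutOfDate };

struct AcquireOutcome {
    bool imageAcquired;  // true if the last acquire produced a usable image
    int  recreateCount;  // how many times the swap chain was recreated
};

// acquire:  hypothetical wrapper around vkAcquireNextImageKHR
// recreate: hypothetical swap chain recreation routine
AcquireOutcome AcquireWithBoundedRecreate(const std::function<AcquireResult()>& acquire,
                                          const std::function<void()>& recreate,
                                          int maxRecreates)
{
    assert(maxRecreates >= 0);
    int recreates = 0;
    for (;;) {
        const AcquireResult res = acquire();
        if (res == AcquireResult::Success)
            return {true, recreates};

        if (recreates >= maxRecreates) {
            // Budget exhausted. A suboptimal result is still a success code,
            // so the image can be used; an out-of-date result cannot.
            return {res == AcquireResult::Suboptimal, recreates};
        }

        // Either suboptimal or out of date: recreate and try again.
        recreate();
        ++recreates;
    }
}
```

Because VK_SUBOPTIMAL_KHR is a success code, the image is acquired and the semaphore or fence passed to that acquire is still signaled. A real implementation would have to account for that before discarding the image and acquiring again, which this model leaves out.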

billhollings commented 3 weeks ago

@TheMostDiligent

Thanks for this latest update, and for posting the workaround patch to your engine. I've had a look at the patch.

I can replicate something similar with the MoltenVK Cube demo, and I believe I understand what's happening, at least with the demo.

Anywhere that VK_SUBOPTIMAL_KHR can be returned (acquire image & present), MoltenVK includes a dynamic comparison of the size of the CAMetalLayer of the underlying view, in order to determine if the platform has resized the layer, typically as a result of the user dragging the edge of a window.

I believe what is happening is that as the user continues to drag the window edge, by the time the swap chain has been re-created, the platform may have expanded or shrunk the CAMetalLayer further, resulting in another VK_SUBOPTIMAL_KHR being triggered. This currently won't stop happening until the user stops changing the size of the window.

So...we need a mechanism to detect when the platform has changed the size of the CAMetalLayer, but perhaps checking at both image acquisition time AND presentation request time is too aggressive. Perhaps we should check only once per frame, after the presentation. Or perhaps even only once every 5 or 10 frames, etc.
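
To make the "check only every few frames" idea concrete, here is a minimal sketch of such a throttle. It is not MoltenVK code: `Extent`, `ThrottledSizeCheck` and `OnPresent` are hypothetical names, and the layer size is passed in as a plain value rather than queried from a CAMetalLayer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 2D size (similar in spirit to VkExtent2D).
struct Extent {
    uint32_t width;
    uint32_t height;
};

// Compares the layer size against the swap chain extent only once every
// `interval` presented frames, instead of on every acquire and present.
class ThrottledSizeCheck {
public:
    ThrottledSizeCheck(Extent swapchainExtent, uint32_t interval)
        : m_extent(swapchainExtent), m_interval(interval)
    {
        assert(interval > 0);
    }

    // Call once per presented frame with the layer's current size.
    // Returns true when the caller should report a suboptimal swap chain.
    bool OnPresent(Extent layerSize)
    {
        if (++m_framesSinceCheck < m_interval)
            return false;  // Skip the comparison on this frame.

        m_framesSinceCheck = 0;
        return layerSize.width != m_extent.width ||
               layerSize.height != m_extent.height;
    }

private:
    Extent   m_extent;
    uint32_t m_interval;
    uint32_t m_framesSinceCheck = 0;
};
```

With an interval of 1 this reduces to a once-per-frame check after presentation. Larger intervals trade slower detection of a resize for fewer repeated suboptimal results while the user is still dragging the window edge.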

Does my description match what you're experiencing? Is this repetition occurring during continuous window resizing by the user?

TheMostDiligent commented 3 weeks ago

I did see a similar problem when resizing the window. However, it also happens when the swap chain is recreated (to change the swap effect) without any resizing.