bkaradzic / bgfx

Cross-platform, graphics API agnostic, "Bring Your Own Engine/Framework" style rendering library.
https://bkaradzic.github.io/bgfx/overview.html
BSD 2-Clause "Simplified" License
15.05k stars 1.94k forks source link

[vulkan] Synchronization issue: vkAcquireNextImage uses a semaphore which is in use. #3302

Closed mcourteaux closed 5 months ago

mcourteaux commented 5 months ago

I'm still analyzing this issue, but I wanted to document it a bit better than trying to keep things in my head until I understand it enough to fix it.

To Reproduce

  1. Compile bgfx in debug mode with examples.
  2. Run example-01-cubesDebug
  3. Observe the output:
../../../src/renderer_vk.cpp (683): BGFX ---E-           Semaphore, Validation, 0: Validation Error: [ VUID-vkAcquireNextImageKHR-semaphore-01779 ] Object 0: handle = 0xb3ee8b0000000070, type = VK_OBJECT_TYPE_SEMAPHORE; | MessageID = 0x5717e75b | vkAcquireNextImageKHR():  Semaphore must not have any pending operations. The Vulkan spec states: If semaphore is not VK_NULL_HANDLE it must not have any uncompleted signal or wait operations pending (https://vulkan.lunarg.com/doc/view/1.3.280.0/linux/1.3-extensions/vkspec.html#VUID-vkAcquireNextImageKHR-semaphore-01779)

My interpretation so far After mentally untangling the code for more than an hour now, and reading Vulkan documentation, I sort of conclude the following:

If I understand stuff correctly, bgfx has not followed this guideline. bgfx's codepath to vkAcquireNextImage does not seem to wait for anything, hence the issue. More precisely, it seems that it actually has the waiting backwards:

https://github.com/bkaradzic/bgfx/blob/530a558b11afa4375bdb926b398dbe65a5fc6b4b/src/renderer_vk.cpp#L7408-L7474

It first tries to acquire the next image, and THEN waits for the previous frame to finish, by waiting for the backBufferFence of the newly returned frame (which should correspond to frame N - 2). I believe I understand why the original author of this code (@pezcode) has written it this way: the documentation of vkAcquireNextImage() explicitly states:

The order in which images are acquired is implementation-dependent, and may be different than the order the images were presented.

So I believe this non-deterministic ordering of "next image" from the swapchain is why the synchronization is programmed only when it's known which one is the next one.

The funny thing is, kick() submits the command buffer to the queue, and passes a fence to be signaled upon completion: m_completedFence, which is the fence that should be waited for, but actually never is waited for.

https://github.com/bkaradzic/bgfx/blob/530a558b11afa4375bdb926b398dbe65a5fc6b4b/src/renderer_vk.cpp#L7931-L7977

Only if the passed _wait is true, then a fully synchronized submit-and-wait is executed. Otherwise, this fence is never used again.

So, the missing thing in bgfx right now seems to be the wait for this fence upon acquiring the next image. To summarize what synchronization I found by analyzing the code:

  1. Context::renderFrame()
    1. RendererContextVK::submit()
      1. RendererContextVK::setFrameBuffer() -> FrameBufferVK::acquire():
        • vkAcquireNextImageKHR(): signal_semaphore: m_lastImageAcquiredSemaphore
        • vkWaitForFences(): wait_fence = m_completedFence
      2. RendererContextVK::kick() -> CommandQueueVK::kick(false)
        • vkQueueSubmit(): wait_semaphore = m_lastImageAcquiredSemaphore; signal_semaphore = m_lastImageRenderedSemaphore; signal_fence = m_completedFence
    2. Context::flip() -> RendererContextVK::flip() -> FrameBufferVK::present() -> SwapChainVK::present()
      • vkQueuePresentKHR(): wait_semaphore = m_lastImageRenderedSemaphore

As you can see, the fence is not waited for before acquiring frame N + 2. Unrolling the above synchronization loop 3 times, and identifying the fences, makes this clearer:

  1. [Frame N; frame in flight = 0] Context::renderFrame()
    1. RendererContextVK::submit()
      1. RendererContextVK::setFrameBuffer() -> FrameBufferVK::acquire():
        • vkAcquireNextImageKHR(): signal_semaphore: m_lastImageAcquiredSemaphore (semA 0)
        • vkWaitForFences(): wait_fence = m_completedFence (fenceC 0)
      2. RendererContextVK::kick() -> CommandQueueVK::kick(false)
        • vkQueueSubmit(): wait_semaphore = m_lastImageAcquiredSemaphore (semA 0); signal_semaphore = m_lastImageRenderedSemaphore (semR 0); signal_fence = m_completedFence (fenceC 0)
    2. Context::flip() -> RendererContextVK::flip() -> FrameBufferVK::present() -> SwapChainVK::present()
      • vkQueuePresentKHR(): wait_semaphore = m_lastImageRenderedSemaphore (semR 0)
  2. [Frame N + 1; frame in flight = 1] Context::renderFrame()
    1. RendererContextVK::submit()
      1. RendererContextVK::setFrameBuffer() -> FrameBufferVK::acquire():
        • vkAcquireNextImageKHR(): signal_semaphore: m_lastImageAcquiredSemaphore (semA 1)
        • vkWaitForFences(): wait_fence = m_completedFence (fenceC 1)
      2. RendererContextVK::kick() -> CommandQueueVK::kick(false)
        • vkQueueSubmit(): wait_semaphore = m_lastImageAcquiredSemaphore (semA 1); signal_semaphore = m_lastImageRenderedSemaphore (semR 1); signal_fence = m_completedFence (fenceC 1)
    2. Context::flip() -> RendererContextVK::flip() -> FrameBufferVK::present() -> SwapChainVK::present()
      • vkQueuePresentKHR(): wait_semaphore = m_lastImageRenderedSemaphore (semR 1)
  3. [Frame N + 2; frame in flight = 0] Context::renderFrame()
    1. RendererContextVK::submit()
      1. RendererContextVK::setFrameBuffer() -> FrameBufferVK::acquire():
        • vkAcquireNextImageKHR(): signal_semaphore: m_lastImageAcquiredSemaphore (semA 0)
        • vkWaitForFences(): wait_fence = m_completedFence (fenceC 0) (waiting for fence from frame N)
      2. RendererContextVK::kick() -> CommandQueueVK::kick(false)
        • vkQueueSubmit(): wait_semaphore = m_lastImageAcquiredSemaphore (semA 0); signal_semaphore = m_lastImageRenderedSemaphore (semR 0); signal_fence = m_completedFence (fenceC 0)
    2. Context::flip() -> RendererContextVK::flip() -> FrameBufferVK::present() -> SwapChainVK::present()
      • vkQueuePresentKHR(): wait_semaphore = m_lastImageRenderedSemaphore (semR 0)

I marked the erroneous vkWaitForFences in bold in frame N+2, where it waits for frame N.

mcourteaux commented 5 months ago

I can confirm the error from Vulkan goes away when explicitly kick(true) instead of kick(false) which causes to immediately synchronize on the submitted command buffer.

bkaradzic commented 5 months ago

cc @pezcode

mcourteaux commented 5 months ago

I managed to make some changes that seems to fix the issue, without breaking anything. Needs more testing. Will open PR if I think it's ready for review.

pezcode commented 5 months ago

As you can see, the fence is not waited for before acquiring frame N + 2

The issue is that (as you mentioned) the image order is not guaranteed. So you can not know the next image, and hence can not know the associated fence before calling vkAcquireNextImageKHR(). Not sure what the correct fix is here, but I feel like the missing synchronization should happen at another point.

Calling kick(true) works because it is essentially vkQueueWaitIdle(), but destroys any kind of performance gains by having multiple frames in flight.