Closed mcourteaux closed 5 months ago
I can confirm the error from Vulkan goes away when explicitly calling kick(true) instead of kick(false), which immediately synchronizes on the submitted command buffer.
cc @pezcode
I managed to make some changes that seem to fix the issue without breaking anything. Needs more testing. Will open a PR once I think it's ready for review.
As you can see, the fence is not waited for before acquiring frame N + 2
The issue is that (as you mentioned) the image order is not guaranteed. So you cannot know the next image, and hence cannot know the associated fence, before calling vkAcquireNextImageKHR(). Not sure what the correct fix is here, but I feel like the missing synchronization should happen at another point.
Calling kick(true) works because it is essentially vkQueueWaitIdle(), but destroys any kind of performance gains by having multiple frames in flight.
I'm still analyzing this issue, but I wanted to document it a bit better than trying to keep things in my head until I understand it enough to fix it.
To Reproduce
example-01-cubesDebug
My interpretation so far
After mentally untangling the code for more than an hour now, and reading Vulkan documentation, I sort of conclude the following:
vkAcquireNextImage() implicitly synchronizes ("waits for") with the point where image N - 2 is presented and ready to be reused for drawing a new frame. It means that it will synchronize with the vkQueuePresent() from two frames ago (assuming double buffering). Once vkAcquireNextImage() signals the semaphore, new command buffers can be submitted to that frame.
vkAcquireNextImage() doesn't try to signal into a semaphore which is already in use, because we are trying to acquire frame N+2, whereas frame N is still waiting on this semaphore before it can start rendering.
If I understand stuff correctly, bgfx has not followed this guideline. bgfx's codepath to vkAcquireNextImage() does not seem to wait for anything, hence the issue. More precisely, it seems that it actually has the waiting backwards:
https://github.com/bkaradzic/bgfx/blob/530a558b11afa4375bdb926b398dbe65a5fc6b4b/src/renderer_vk.cpp#L7408-L7474
It first tries to acquire the next image, and THEN waits for the previous frame to finish, by waiting for the backBufferFence of the newly returned frame (which should correspond to frame N - 2). I believe I understand why the original author of this code (@pezcode) has written it this way: the documentation of vkAcquireNextImage() explicitly states:
So I believe this non-deterministic ordering of "next image" from the swapchain is why the synchronization is programmed only when it's known which one is the next one.
The funny thing is, kick() submits the command buffer to the queue, and passes a fence to be signaled upon completion: m_completedFence, which is the fence that should be waited for, but is actually never waited for.
https://github.com/bkaradzic/bgfx/blob/530a558b11afa4375bdb926b398dbe65a5fc6b4b/src/renderer_vk.cpp#L7931-L7977
Only if the passed _wait is true is a fully synchronized submit-and-wait executed. Otherwise, this fence is never used again.
So, the missing thing in bgfx right now seems to be the wait for this fence upon acquiring the next image. To summarize what synchronization I found by analyzing the code:
Context::renderFrame()
    RendererContextVK::submit()
        RendererContextVK::setFrameBuffer() -> FrameBufferVK::acquire():
            vkAcquireNextImageKHR(): signal_semaphore = m_lastImageAcquiredSemaphore
            vkWaitForFences(): wait_fence = m_completedFence
        RendererContextVK::kick() -> CommandQueueVK::kick(false)
            vkQueueSubmit(): wait_semaphore = m_lastImageAcquiredSemaphore; signal_semaphore = m_lastImageRenderedSemaphore; signal_fence = m_completedFence
    Context::flip() -> RendererContextVK::flip() -> FrameBufferVK::present() -> SwapChainVK::present()
        vkQueuePresentKHR(): wait_semaphore = m_lastImageRenderedSemaphore
As you can see, the fence is not waited for before acquiring frame N + 2. Unrolling the above synchronization loop 3 times, and identifying the fences, makes this clearer:
I marked the erroneous vkWaitForFences in bold in frame N+2, where it waits for frame N.
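To make the ordering problem easier to reason about, here is a small toy model of the loop unrolled over several frames. It is only a sketch and does not touch Vulkan or bgfx: ToySwapchainSync, countHazards and all member names are made up for illustration. It assumes two frame slots with one acquire semaphore and one fence per slot, assumes the swapchain hands images back in order (which, as discussed above, is not guaranteed in reality), and assumes a worst-case GPU that only makes progress when the CPU blocks on a fence. A "hazard" is counted whenever an acquire would signal a semaphore whose previous signal has not yet been consumed by a submit.

```cpp
#include <array>

// Toy model only. None of these names exist in bgfx or Vulkan.
struct ToySwapchainSync
{
    static constexpr int kSlots = 2; // assumed double buffering

    std::array<int, kSlots> pendingSignals{}; // acquire signals not yet consumed by the GPU
    std::array<int, kSlots> lastSubmitted{};  // last frame submitted per slot, -1 if none
    int completedFrame = -1;                  // GPU has finished every frame <= this
    int hazards = 0;                          // acquires that hit a still-pending semaphore

    ToySwapchainSync() { lastSubmitted.fill(-1); }

    // The GPU executes the oldest unfinished frame: it consumes that frame's
    // acquire semaphore and (implicitly) signals that frame's fence.
    void gpuRunNextFrame()
    {
        ++completedFrame;
        --pendingSignals[completedFrame % kSlots];
    }

    // Stand-in for waiting on the slot's fence: block until the last frame
    // submitted on this slot has finished. Returns at once if nothing was submitted.
    void waitFence(int slot)
    {
        while (completedFrame < lastSubmitted[slot])
        {
            gpuRunNextFrame();
        }
    }

    // Stand-in for the acquire call signaling the slot's semaphore.
    void acquire(int slot)
    {
        if (pendingSignals[slot] > 0)
        {
            ++hazards; // semaphore still in use by an earlier frame
        }
        ++pendingSignals[slot];
    }

    // Stand-in for the queue submit of this frame on this slot.
    void submit(int frame, int slot) { lastSubmitted[slot] = frame; }
};

// Runs `frames` iterations of the loop and returns the number of hazards.
// waitBeforeAcquire == false models "acquire, THEN wait for the fence".
int countHazards(int frames, bool waitBeforeAcquire)
{
    ToySwapchainSync sync;
    for (int frame = 0; frame < frames; ++frame)
    {
        const int slot = frame % ToySwapchainSync::kSlots;
        if (waitBeforeAcquire)
        {
            sync.waitFence(slot);
            sync.acquire(slot);
        }
        else
        {
            sync.acquire(slot);
            sync.waitFence(slot);
        }
        sync.submit(frame, slot);
    }
    return sync.hazards;
}
```

In this model, countHazards(6, false) returns 4 (every frame from N+2 onward reuses a semaphore that is still pending), while countHazards(6, true) returns 0. This only illustrates the ordering argument; it says nothing about how the fence should be chosen when the next image index is not known in advance.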