Closed fridenmf closed 1 week ago
Similarly I get a VUID-vkResetCommandPool-commandPool-00040 validation error if I re-record the command buffer instead of re-submitting it, same scenario as above (after re-acquire and wait fence on a previously used swapchain image index):
Validation Error: [ VUID-vkResetCommandPool-commandPool-00040 ] Object 0: handle = 0x1ca3802bce0, name = CommandBuffer0, type = VK_OBJECT_TYPE_COMMAND_BUFFER; Object 1: handle = 0xc4f3070000000021, type = VK_OBJECT_TYPE_COMMAND_POOL; | MessageID = 0xb53e2331 | vkResetCommandPool(): (VkCommandBuffer 0x1ca3802bce0[CommandBuffer0]) is in use.
The Vulkan spec states: All VkCommandBuffer objects allocated from commandPool must not be in the pending state (https://vulkan.lunarg.com/doc/view/1.3.296.0/windows/1.3-extensions/vkspec.html#VUID-vkResetCommandPool-commandPool-00040)
If I instead just create new command pools and command buffers each frame without destroying them, then I get past frame 3 but instead get a VUID-vkAcquireNextImageKHR-semaphore-01779 validation error when trying to re-signal a semaphore in vkAcquireNextImageKHR in frame 4, a semaphore that is no longer in use as of frame 3 where the swapchain image index for when it was last used was returned the second time:
Validation Error: [ VUID-vkAcquireNextImageKHR-semaphore-01779 ] Object 0: handle = 0xb9181f0000000029, type = VK_OBJECT_TYPE_SEMAPHORE; | MessageID = 0x5717e75b | vkAcquireNextImageKHR(): Semaphore must not have any pending operations.
The Vulkan spec states: If semaphore is not VK_NULL_HANDLE, it must not have any uncompleted signal or wait operations pending (https://vulkan.lunarg.com/doc/view/1.3.296.0/windows/1.3-extensions/vkspec.html#VUID-vkAcquireNextImageKHR-semaphore-01779)
Do you want a separate issues for these or will they be fixed here as well? They seem kind of related.
(I had to edit this comment a few times to get the details right for all three issues)
Do you want a separate issues for these or will they be fixed here as well? They seem kind of related.
Yes, I think it's related. Will try to investigate during this week.
If I instead just create new command pools and command buffers each frame without destroying them, then I get past frame 3 but instead get a VUID-vkAcquireNextImageKHR-semaphore-01779 validation error when trying to re-signal a semaphore in vkAcquireNextImageKHR in frame 4
That's a good experiment, by excluding command buffers from equation this might suggest it is the issue about tracking acquire semaphore. ~The above mentioned that semaphore was used in vkAcquireNextImageKHR in frame 4. In which frame and in which function this semaphore was used the last time before frame 4?~ Nevermind, I checked full API dump, it's possible we do not handle this kind of synchronization. Will try to come up with a test that reproduces this.
I made a trace of a run that excludes the command pool and command buffer by re-creating them every frame and without destructing them, if it helps to look through:
My reasoning is:
The validation error:
Validation Error: [ VUID-vkAcquireNextImageKHR-semaphore-01779 ] Object 0: handle = 0x908683000000001d, type = VK_OBJECT_TYPE_SEMAPHORE; | MessageID = 0x5717e75b | vkAcquireNextImageKHR(): Semaphore must not have any pending operations.
The Vulkan spec states: If semaphore is not VK_NULL_HANDLE, it must not have any uncompleted signal or wait operations pending (https://vulkan.lunarg.com/doc/view/1.3.296.0/windows/1.3-extensions/vkspec.html#VUID-vkAcquireNextImageKHR-semaphore-01779)
... claims the semaphore with handle 0x908683000000001d has pending operations during frame 4 in vkAcquireNextImageKHR:
Thread 0, Frame 4, Time 172277 us:
vkAcquireNextImageKHR(device, swapchain, timeout, semaphore, fence, pImageIndex) returns VkResult VK_SUCCESS (0):
device: VkDevice = 000001CD40CC0780
swapchain: VkSwapchainKHR = D897D90000000016
timeout: uint64_t = 18446744073709551615
semaphore: VkSemaphore = 908683000000001D
fence: VkFence = CFCDA0000000001E
pImageIndex: uint32_t* = 1
The last time this was used was as a wait semaphore in frame 0 for a vkQueueSubmit call that wants to render into swapchain image 0:
Thread 0, Frame 0, Time 169358 us:
vkQueueSubmit(queue, submitCount, pSubmits, fence) returns VkResult VK_SUCCESS (0):
queue: VkQueue = 000001CD3C937B60
submitCount: uint32_t = 1
pSubmits: const VkSubmitInfo* = 000000C01D35C220
pSubmits[0]: const VkSubmitInfo = 000000C01D35C220:
sType: VkStructureType = VK_STRUCTURE_TYPE_SUBMIT_INFO (4)
pNext: const void* = NULL
waitSemaphoreCount: uint32_t = 1
pWaitSemaphores: const VkSemaphore* = 000001CD3E5166F8
pWaitSemaphores[0]: const VkSemaphore = 908683000000001D
pWaitDstStageMask: const VkPipelineStageFlags* = 000000C01D35C1F4
pWaitDstStageMask[0]: const VkPipelineStageFlags = 1024 (VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT)
commandBufferCount: uint32_t = 1
pCommandBuffers: const VkCommandBuffer* = 000001CD3E516B58
pCommandBuffers[0]: const VkCommandBuffer = 000001CD4AF17ED0
signalSemaphoreCount: uint32_t = 1
pSignalSemaphores: const VkSemaphore* = 000001CD3E516748
pSignalSemaphores[0]: const VkSemaphore = 301E6C0000000022
fence: VkFence = 0000000000000000
Before that it was used as a signal semaphore in frame 0 for vkAcquireNextImageKHR of swapchain image index 0:
Thread 0, Frame 0, Time 168821 us:
vkAcquireNextImageKHR(device, swapchain, timeout, semaphore, fence, pImageIndex) returns VkResult VK_SUCCESS (0):
device: VkDevice = 000001CD40CC0780
swapchain: VkSwapchainKHR = D897D90000000016
timeout: uint64_t = 18446744073709551615
semaphore: VkSemaphore = 908683000000001D
fence: VkFence = CFCDA0000000001E
pImageIndex: uint32_t* = 0
Frame 3 has acquired swapchain image index 0 the second time, and waited on the fence for that vkAcquireNextImageKHR call:
Thread 0, Frame 3, Time 170693 us:
vkAcquireNextImageKHR(device, swapchain, timeout, semaphore, fence, pImageIndex) returns VkResult VK_SUCCESS (0):
device: VkDevice = 000001CD40CC0780
swapchain: VkSwapchainKHR = D897D90000000016
timeout: uint64_t = 18446744073709551615
semaphore: VkSemaphore = 3FBCD60000000028
fence: VkFence = B9181F0000000029
pImageIndex: uint32_t* = 0
Thread 0, Frame 3, Time 171381 us:
vkWaitForFences(device, fenceCount, pFences, waitAll, timeout) returns VkResult VK_SUCCESS (0):
device: VkDevice = 000001CD40CC0780
fenceCount: uint32_t = 1
pFences: const VkFence* = 000001CD3E516F18
pFences[0]: const VkFence = B9181F0000000029
waitAll: VkBool32 = 1
timeout: uint64_t = 18446744073709551615
This implies that the presentation engine does not have any ongoing jobs for swapchain image 0 as it was returned a second time. ... which implies that vkQueuePresentKHR of frame 0 is done and it's wait semaphore (301E6C0000000022) is signaled:
Thread 0, Frame 0, Time 169523 us:
vkQueuePresentKHR(queue, pPresentInfo) returns VkResult VK_SUCCESS (0):
queue: VkQueue = 000001CD3C937B60
pPresentInfo: const VkPresentInfoKHR* = 000000C01D35C1E0:
sType: VkStructureType = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR (1000001001)
pNext: const void* = NULL
waitSemaphoreCount: uint32_t = 1
pWaitSemaphores: const VkSemaphore* = 000001CD3E516748
pWaitSemaphores[0]: const VkSemaphore = 301E6C0000000022
swapchainCount: uint32_t = 1
pSwapchains: const VkSwapchainKHR* = 000001CD48C2D300
pSwapchains[0]: const VkSwapchainKHR = D897D90000000016
pImageIndices: const uint32_t* = 000001CD4AE71CC8
pImageIndices[0]: const uint32_t = 0
pResults: VkResult* = NULL
... which implies that vkQueueSubmit of in frame 0 is done and its wait semaphore (908683000000001D) is also signaled as that's a prerequisite for signaling 301E6C0000000022:
Thread 0, Frame 0, Time 169358 us:
vkQueueSubmit(queue, submitCount, pSubmits, fence) returns VkResult VK_SUCCESS (0):
queue: VkQueue = 000001CD3C937B60
submitCount: uint32_t = 1
pSubmits: const VkSubmitInfo* = 000000C01D35C220
pSubmits[0]: const VkSubmitInfo = 000000C01D35C220:
sType: VkStructureType = VK_STRUCTURE_TYPE_SUBMIT_INFO (4)
pNext: const void* = NULL
waitSemaphoreCount: uint32_t = 1
pWaitSemaphores: const VkSemaphore* = 000001CD3E5166F8
pWaitSemaphores[0]: const VkSemaphore = 908683000000001D
pWaitDstStageMask: const VkPipelineStageFlags* = 000000C01D35C1F4
pWaitDstStageMask[0]: const VkPipelineStageFlags = 1024 (VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT)
commandBufferCount: uint32_t = 1
pCommandBuffers: const VkCommandBuffer* = 000001CD3E516B58
pCommandBuffers[0]: const VkCommandBuffer = 000001CD4AF17ED0
signalSemaphoreCount: uint32_t = 1
pSignalSemaphores: const VkSemaphore* = 000001CD3E516748
pSignalSemaphores[0]: const VkSemaphore = 301E6C0000000022
fence: VkFence = 0000000000000000
Thus 908683000000001D is no longer in use and should be able to be signaled by vkAcquireNextImageKHR in frame 4.
@fridenmf I just finished with a test that I guess repeats you original trace (I started with it so it's easy for me to follow that). If you think I missed something important in the test please let me know. My swapchain has two images. I get errors in the third frame (frame index = 2), that's when we start re-using resources for the first time. It reports error in AcquireNextImage:
Validation Error: [ VUID-vkAcquireNextImageKHR-semaphore-01779 ] Object 0: handle = 0xcfef35000000000a, type = VK_OBJECT_TYPE_SEMAPHORE; | MessageID = 0x5717e75b | vkAcquireNextImageKHR(): Semaphore must not have any pending operations. The Vulkan spec states: If semaphore is not VK_NULL_HANDLE, it must not have any uncompleted signal or wait operations pending (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-vkAcquireNextImageKHR-semaphore-01779)
TEST_F(PositiveWsi, Test) {
AddSurfaceExtension();
RETURN_IF_SKIP(Init());
RETURN_IF_SKIP(InitSwapchain());
const auto swapchain_images = m_swapchain.GetImages();
for (auto image : swapchain_images) {
SetImageLayoutPresentSrc(image);
}
std::vector<vkt::Semaphore> acquire_semaphores;
std::vector<vkt::Fence> acquire_fences;
std::vector<vkt::CommandBuffer> command_buffers;
std::vector<vkt::Semaphore> submit_semaphores;
for (size_t i = 0; i < swapchain_images.size(); i++) {
acquire_semaphores.emplace_back(*m_device);
acquire_fences.emplace_back(*m_device, VK_FENCE_CREATE_SIGNALED_BIT);
command_buffers.emplace_back(*m_device, m_command_pool);
submit_semaphores.emplace_back(*m_device);
}
const int frame_count = 10;
for (int i = 0; i < frame_count; i++) {
const uint32_t frame_index = uint32_t(i % swapchain_images.size());
auto &acquire_semaphore = acquire_semaphores[frame_index];
auto &acquire_fence = acquire_fences[frame_index];
auto &command_buffer = command_buffers[frame_index];
auto &submit_semaphore = submit_semaphores[frame_index];
acquire_fence.Reset();
uint32_t image_index = 0;
vk::AcquireNextImageKHR(device(), m_swapchain, kWaitTimeout, acquire_semaphore, acquire_fence, &image_index);
acquire_fence.Wait(kWaitTimeout);
command_buffer.Begin();
command_buffer.End();
m_default_queue->Submit(command_buffer, vkt::Wait(acquire_semaphore), vkt::Signal(submit_semaphore));
m_default_queue->Present(m_swapchain, image_index, submit_semaphore);
}
m_default_queue->Wait();
}
I don't think you are allowed to do this:
const uint32_t frame_index = uint32_t(i % swapchain_images.size());
auto &acquire_semaphore = acquire_semaphores[frame_index];
AcquireNextImageKHR is not guaranteed to return the indices in order, it could return 0,1,2,1,2,1,2,1,2, and you don't really know that the semaphores used for swapchain image index 0 is done until you get back index 0 from AcquireNextImageKHR.
Since you don't know which swapchain image index you're going to get until you have called AcquireNextImageKHR, you'd have to reuse a semaphore and fence from a pool of "unused semaphores and resources". Only once you've gotten index N from AcquireNextImageKHR, you can put resources back to that pool from swapchain index frames that you have tracked which resources you used them in.
Easiest way to modify your example is to add this outside the frame loop:
vkt::Semaphore unused_semaphore(*m_device);
vkt::Fence unused_fence(*m_device, VK_FENCE_CREATE_SIGNALED_BIT);
Replace this:
auto &acquire_semaphore = acquire_semaphores[frame_index];
auto &acquire_fence = acquire_fences[frame_index];
with this:
auto &acquire_semaphore = unused_semaphore;
auto &acquire_fence = unused_semaphore;
in your frame loop.
And after Present, swap in the unused resources:
std::swap(acquire_semaphores[image_index], acquire_semaphore);
std::swap(acquire_fences[image_index], acquire_fence);
Remove the frame_index
variable, and index the other resources after acquire with image_index instead.
(I've not compiled this, just based it on your code example, but use it as a guide)
don't think you are allowed to do this:
Right, I was aware of this. Just in this test the order was correct and it reproduces the issue, so it's enough to have confirmation and work on solution. I will update the test based on your suggestions.
Updated the test (thanks for the guide!). Can get VUID-vkQueueSubmit-pCommandBuffers-00071
.
TEST_F(PositiveWsi, Test) {
TEST_DESCRIPTION("Acquire fence can be used to determine that submission from previous frame finished.");
AddSurfaceExtension();
RETURN_IF_SKIP(Init());
RETURN_IF_SKIP(InitSwapchain());
const auto swapchain_images = m_swapchain.GetImages();
for (auto image : swapchain_images) {
SetImageLayoutPresentSrc(image);
}
vkt::Semaphore unused_semaphore(*m_device);
vkt::Fence unused_fence(*m_device, VK_FENCE_CREATE_SIGNALED_BIT);
std::vector<vkt::Semaphore> acquire_semaphores;
std::vector<vkt::Fence> acquire_fences;
std::vector<vkt::CommandBuffer> command_buffers;
std::vector<vkt::Semaphore> submit_semaphores;
for (size_t i = 0; i < swapchain_images.size(); i++) {
acquire_semaphores.emplace_back(*m_device);
acquire_fences.emplace_back(*m_device, VK_FENCE_CREATE_SIGNALED_BIT);
command_buffers.emplace_back(*m_device, m_command_pool);
submit_semaphores.emplace_back(*m_device);
command_buffers[i].Begin();
command_buffers[i].End();
}
const int frame_count = 10;
for (int i = 0; i < frame_count; i++) {
auto &acquire_semaphore = unused_semaphore;
auto &acquire_fence = unused_fence;
acquire_fence.Reset();
uint32_t image_index = 0;
vk::AcquireNextImageKHR(device(), m_swapchain, kWaitTimeout, acquire_semaphore, acquire_fence, &image_index);
acquire_fence.Wait(kWaitTimeout);
auto &command_buffer = command_buffers[image_index];
auto &submit_semaphore = submit_semaphores[image_index];
m_default_queue->Submit(command_buffer, vkt::Wait(acquire_semaphore), vkt::Signal(submit_semaphore));
m_default_queue->Present(m_swapchain, image_index, submit_semaphore);
std::swap(acquire_semaphores[image_index], acquire_semaphore);
std::swap(acquire_fences[image_index], acquire_fence);
}
m_default_queue->Wait();
}
Interesting, I would have expected a validation error that you need to reset your command buffer first (if VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT is set), or have a separate pool per swapchain image index and reset that, before you can try to re-record it in your frame loop?
edit: Saw your last edit, now it looks like my example!
Interesting, I would have expected a validation error that you need to reset your command buffer first
Yeah, that's because our test framework creates command pool with VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT flag
@fridenmf I can see what does not work in the current setup, but until this works you can use a slightly modified solution which works: because we wait on the fence immediately after AcquireNextImageKHR, a single fence is sufficient (instead one per swapchain), and then the code hits code path that handles synchronization correctly (tried this).
Here is modified test that shows this. We can reset fence immediately after wait, also no need to create fence in signaled state in this case. Acquire semaphore is also not important, since we wait on fence on the host before submit:
TEST_F(PositiveWsi, WorkaroundUntilFix) {
TEST_DESCRIPTION("Acquire fence can be used to determine that submission from previous frame finished.");
AddSurfaceExtension();
RETURN_IF_SKIP(Init());
RETURN_IF_SKIP(InitSwapchain());
const auto swapchain_images = m_swapchain.GetImages();
for (auto image : swapchain_images) {
SetImageLayoutPresentSrc(image);
}
std::vector<vkt::CommandBuffer> command_buffers;
std::vector<vkt::Semaphore> submit_semaphores;
for (size_t i = 0; i < swapchain_images.size(); i++) {
command_buffers.emplace_back(*m_device, m_command_pool);
command_buffers[i].Begin();
command_buffers[i].End();
submit_semaphores.emplace_back(*m_device);
}
vkt::Fence fence(*m_device);
const int frame_count = 10;
for (int i = 0; i < frame_count; i++) {
auto &acquire_fence = fence;
uint32_t image_index = 0;
vk::AcquireNextImageKHR(device(), m_swapchain, kWaitTimeout, VK_NULL_HANDLE, acquire_fence, &image_index);
acquire_fence.Wait(kWaitTimeout);
acquire_fence.Reset();
auto &command_buffer = command_buffers[image_index];
auto &submit_semaphore = submit_semaphores[image_index];
m_default_queue->Submit(command_buffer, vkt::Signal(submit_semaphore));
m_default_queue->Present(m_swapchain, image_index, submit_semaphore);
}
m_default_queue->Wait();
}
Thanks, this works as a workaround for now!
Interestingly enough though, creating the fence in a signaled state and then doing:
acquire_fence.Reset();
vk::AcquireNextImageKHR(device(), m_swapchain, kWaitTimeout, VK_NULL_HANDLE, acquire_fence, &image_index);
acquire_fence.Wait(kWaitTimeout);
Still triggers:
Validation Error: [ VUID-vkQueueSubmit-pCommandBuffers-00071 ] | MessageID = 0x2e2f4d65 | vkQueueSubmit(): pSubmits[0].pCommandBuffers[0] VkCommandBuffer 0x25034264b30[] is already in use and is not marked for simultaneous use.
The Vulkan spec states: If any element of the pCommandBuffers member of any element of pSubmits was not recorded with the VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT, it must not be in the pending state (https://vulkan.lunarg.com/doc/view/1.3.296.0/windows/1.3-extensions/vkspec.html#VUID-vkQueueSubmit-pCommandBuffers-00071)
But just changing the fence to be non signaled from creation and to your ordering:
vk::AcquireNextImageKHR(device(), m_swapchain, kWaitTimeout, VK_NULL_HANDLE, acquire_fence, &image_index);
acquire_fence.Wait(kWaitTimeout);
acquire_fence.Reset();
And there is no validation errors. I would argue that creating the fence in a signaled state and reseting before vkAcquireNextImageKHR should work as well?
And there is no validation errors. I would argue that creating the fence in a signaled state and reseting before vkAcquireNextImageKHR should work as well?
Yes
Environment:
Describe the Issue
vkQueueSubmit trigger a false positive VUID-vkQueueSubmit-pCommandBuffers-00071 validation error:
The application create one command buffer for each swapchain image outside the main loop, responsible for all rendering that swapchain index, and re-submits a command buffer each frame, choosing which one depending on the index acquired by vkAcquireNextImageKHR.
Each vkAcquireNextImageKHR is followed by a vkWaitForFences to make sure that the presentation engine is done presenting that particular image the previous time it was used for presentation, following the principles of the swapchain_recreation sample in Vulkan-Samples:
The bug is present in all available Vulkan SDK installers tested since at least a year back:
Expected behavior
The command buffer should no longer be in use after waiting on the fence used with vkAcquireNextImageKHR in frame 3 (referring to Full_trace.txt under Additional context) since the previous present job involving the swapchain image 0 has been completed, and there should be no VUID-vkQueueSubmit-pCommandBuffers-00071 validation error.
Valid Usage ID
Additional context
Trimmed trace showing parts of frame 0, frame 3, and the validation error during the frame 3 vkQueueSubmit: TLDR_trace.txt
Full trace: Full_trace.txt