KhronosGroup / Vulkan-ValidationLayers

Vulkan Validation Layers (VVL)
https://vulkan.lunarg.com/doc/sdk/latest/linux/khronos_validation_layer.html
Other
750 stars 401 forks source link

Possible false positive WAW for image layout transition during queue family transfer (QueueSubmit) #7024

Open glebov-andrey opened 9 months ago

glebov-andrey commented 9 months ago

Environment:

Describe the Issue

When VALIDATION_CHECK_ENABLE_SYNCHRONIZATION_VALIDATION_QUEUE_SUBMIT is enabled, I'm getting validation errors related to image layout transitions which happen during queue family transfers. The general structure is such (see full code below):

This results in the following validation error:

Validation Error: [ SYNC-HAZARD-WRITE-AFTER-WRITE ] Object 0: handle = 0x2601c113a00, name = Graphics queue, 
type = VK_OBJECT_TYPE_QUEUE; | MessageID = 0x5c0ec5d6 | vkQueueSubmit2():  Hazard WRITE_AFTER_WRITE for entry 0, 
VkCommandBuffer 0x26028e3d430[], Recorded access info (recorded_usage: SYNC_IMAGE_LAYOUT_TRANSITION, command: 
vkCmdPipelineBarrier2, seq_no: 1, reset_no: 1). Access info (prior_usage: SYNC_IMAGE_LAYOUT_TRANSITION, write_barriers: 0, ).

Replacing VK_PIPELINE_STAGE_2_NONE with VK_PIPELINE_STAGE_2_COPY_BIT and VK_PIPELINE_STAGE_2_FRAGMENT_SHADER_BIT in the release and acquire barriers respectively removes the error, which would make sense if the semaphore signal and wait stages weren't ALL_COMMANDS (assuming my understanding of ALL_COMMANDS is correct). Now, from what I understand, ALL_COMMANDS should include any operations which read/write memory, including operations which are not covered by any other stage bits (such as event unsignal operations). So my assumption was that both the queue family release/acquire and image layout transition operations were included in ALL_COMMANDS. Indirectly, VVL seems to agree, since a full ALL_COMMANDS, MEMORY_READ|WRITE barrier is enough to synchronize two layout transition barriers with NONE in their masks.

If this is the case then the error is a false positive, perhaps due to the way they are submitted twice during queue family transfers. On the other hand, if my understanding of ALL_COMMANDS is incorrect then there seems to actually be a false negative instead - the queue family release and acquire operations are unsynchronized too, but aren't reported as such.

In either case there is also the matter of this message making it seem as though there are two image layout transitions (although I can totally see how this weird edge case is tricky to correctly report).

Additional context

Full code ```c++ void vvl_repro(const VkPhysicalDevice physical_device, const VkDevice device, const std::uint32_t graphics_queue_family, const std::uint32_t transfer_queue_family, const VkQueue graphics_queue, const VkQueue transfer_queue) { VkSemaphore semaphore = nullptr; const VkSemaphoreTypeCreateInfo semaphore_type_info = {.sType = VK_STRUCTURE_TYPE_SEMAPHORE_TYPE_CREATE_INFO, .pNext = nullptr, .semaphoreType = VK_SEMAPHORE_TYPE_TIMELINE, .initialValue = 0}; const VkSemaphoreCreateInfo semaphore_create_info = {.sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO, .pNext = &semaphore_type_info, .flags = {}}; vkCreateSemaphore(device, &semaphore_create_info, nullptr, &semaphore); const auto allocate_command_pool_and_buffer = [&](const std::uint32_t queue_family_index) -> std::pair { const VkCommandPoolCreateInfo create_info = {.sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO, .pNext = nullptr, .flags = VK_COMMAND_POOL_CREATE_TRANSIENT_BIT, .queueFamilyIndex = queue_family_index}; VkCommandPool pool = nullptr; vkCreateCommandPool(device, &create_info, nullptr, &pool); const VkCommandBufferAllocateInfo allocate_info = {.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO, .pNext = nullptr, .commandPool = pool, .level = VK_COMMAND_BUFFER_LEVEL_PRIMARY, .commandBufferCount = 1}; VkCommandBuffer command_buffer = nullptr; vkAllocateCommandBuffers(device, &allocate_info, &command_buffer); return {pool, command_buffer}; }; const auto [graphics_command_pool, graphics_command_buffer] = allocate_command_pool_and_buffer(graphics_queue_family); const auto [transfer_command_pool, transfer_command_buffer] = allocate_command_pool_and_buffer(transfer_queue_family); VkImage image = nullptr; VkDeviceMemory image_memory = nullptr; { const VkImageCreateInfo create_info = {.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO, .pNext = nullptr, .flags = {}, .imageType = VK_IMAGE_TYPE_2D, .format = VK_FORMAT_R8G8B8A8_SRGB, .extent = {4, 4, 1}, .mipLevels = 1, .arrayLayers = 1, .samples = VK_SAMPLE_COUNT_1_BIT, .tiling = VK_IMAGE_TILING_OPTIMAL, .usage = VK_IMAGE_USAGE_TRANSFER_DST_BIT | VK_IMAGE_USAGE_SAMPLED_BIT, .sharingMode = VK_SHARING_MODE_EXCLUSIVE, .queueFamilyIndexCount = 0, .pQueueFamilyIndices = nullptr, .initialLayout = VK_IMAGE_LAYOUT_UNDEFINED}; vkCreateImage(device, &create_info, nullptr, &image); VkMemoryRequirements memory_requirements = {}; vkGetImageMemoryRequirements(device, image, &memory_requirements); VkPhysicalDeviceMemoryProperties memory_properties = {}; vkGetPhysicalDeviceMemoryProperties(physical_device, &memory_properties); std::uint32_t memory_type = ~0u; for (std::uint32_t idx = 0; idx != memory_properties.memoryTypeCount; ++idx) { if (memory_requirements.memoryTypeBits & (1u << idx)) { memory_type = idx; break; } } const VkMemoryAllocateInfo allocate_info = {.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO, .pNext = nullptr, .allocationSize = memory_requirements.size, .memoryTypeIndex = memory_type}; vkAllocateMemory(device, &allocate_info, nullptr, &image_memory); vkBindImageMemory(device, image, image_memory, 0); } constexpr VkImageSubresourceRange full_subresource = {.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT, .baseMipLevel = 0, .levelCount = VK_REMAINING_MIP_LEVELS, .baseArrayLayer = 0, .layerCount = VK_REMAINING_ARRAY_LAYERS}; { const VkCommandBufferBeginInfo begin_info = {.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO, .pNext = nullptr, .flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT, .pInheritanceInfo = nullptr}; vkBeginCommandBuffer(transfer_command_buffer, &begin_info); const VkImageMemoryBarrier2 layout_init_barrier = {.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER_2, .pNext = nullptr, .srcStageMask = VK_PIPELINE_STAGE_2_NONE, .srcAccessMask = VK_ACCESS_2_NONE, .dstStageMask = VK_PIPELINE_STAGE_2_COPY_BIT, .dstAccessMask = VK_ACCESS_2_TRANSFER_WRITE_BIT, .oldLayout = VK_IMAGE_LAYOUT_UNDEFINED, .newLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED, .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED, .image = image, .subresourceRange = full_subresource}; const VkDependencyInfo layout_init_dependency = {.sType = VK_STRUCTURE_TYPE_DEPENDENCY_INFO, .imageMemoryBarrierCount = 1, .pImageMemoryBarriers = &layout_init_barrier}; vkCmdPipelineBarrier2(transfer_command_buffer, &layout_init_dependency); // vkCmdCopyBufferToImage2() omited, but should be OK anyway because of dependency chaining const VkImageMemoryBarrier2 release_barrier = {.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER_2, .pNext = nullptr, .srcStageMask = VK_PIPELINE_STAGE_2_COPY_BIT, .srcAccessMask = VK_ACCESS_2_TRANSFER_WRITE_BIT, .dstStageMask = VK_PIPELINE_STAGE_2_NONE, .dstAccessMask = VK_ACCESS_2_NONE, .oldLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, .newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL, .srcQueueFamilyIndex = transfer_queue_family, .dstQueueFamilyIndex = graphics_queue_family, .image = image, .subresourceRange = full_subresource}; const VkDependencyInfo release_dependency = {.sType = VK_STRUCTURE_TYPE_DEPENDENCY_INFO, .imageMemoryBarrierCount = 1, .pImageMemoryBarriers = &release_barrier}; vkCmdPipelineBarrier2(transfer_command_buffer, &release_dependency); vkEndCommandBuffer(transfer_command_buffer); const VkCommandBufferSubmitInfo command_buffer_info = {.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_SUBMIT_INFO, .pNext = nullptr, .commandBuffer = transfer_command_buffer, .deviceMask = 0}; const VkSemaphoreSubmitInfo semaphore_signal_info = {.sType = VK_STRUCTURE_TYPE_SEMAPHORE_SUBMIT_INFO, .pNext = nullptr, .semaphore = semaphore, .value = 1, .stageMask = VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT, .deviceIndex = 0}; const VkSubmitInfo2 submit_info = {.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO_2, .pNext = nullptr, .flags = {}, .waitSemaphoreInfoCount = 0, .pWaitSemaphoreInfos = nullptr, .commandBufferInfoCount = 1, .pCommandBufferInfos = &command_buffer_info, .signalSemaphoreInfoCount = 1, .pSignalSemaphoreInfos = &semaphore_signal_info}; vkQueueSubmit2(transfer_queue, 1, &submit_info, nullptr); } { const VkCommandBufferBeginInfo begin_info = {.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO, .pNext = nullptr, .flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT, .pInheritanceInfo = nullptr}; vkBeginCommandBuffer(graphics_command_buffer, &begin_info); const VkImageMemoryBarrier2 acquire_barrier = {.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER_2, .pNext = nullptr, .srcStageMask = VK_PIPELINE_STAGE_2_NONE, .srcAccessMask = VK_ACCESS_2_NONE, .dstStageMask = VK_PIPELINE_STAGE_2_FRAGMENT_SHADER_BIT, .dstAccessMask = VK_ACCESS_2_SHADER_SAMPLED_READ_BIT, .oldLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, .newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL, .srcQueueFamilyIndex = transfer_queue_family, .dstQueueFamilyIndex = graphics_queue_family, .image = image, .subresourceRange = full_subresource}; const VkDependencyInfo acquire_dependency = {.sType = VK_STRUCTURE_TYPE_DEPENDENCY_INFO, .imageMemoryBarrierCount = 1, .pImageMemoryBarriers = &acquire_barrier}; vkCmdPipelineBarrier2(graphics_command_buffer, &acquire_dependency); vkEndCommandBuffer(graphics_command_buffer); const VkCommandBufferSubmitInfo command_buffer_info = {.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_SUBMIT_INFO, .pNext = nullptr, .commandBuffer = graphics_command_buffer, .deviceMask = 0}; const VkSemaphoreSubmitInfo semaphore_wait_info = {.sType = VK_STRUCTURE_TYPE_SEMAPHORE_SUBMIT_INFO, .pNext = nullptr, .semaphore = semaphore, .value = 1, .stageMask = VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT, .deviceIndex = 0}; const VkSubmitInfo2 submit_info = {.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO_2, .pNext = nullptr, .flags = {}, .waitSemaphoreInfoCount = 1, .pWaitSemaphoreInfos = &semaphore_wait_info, .commandBufferInfoCount = 1, .pCommandBufferInfos = &command_buffer_info, .signalSemaphoreInfoCount = 0, .pSignalSemaphoreInfos = nullptr}; vkQueueSubmit2(graphics_queue, 1, &submit_info, nullptr); } vkDeviceWaitIdle(device); vkFreeMemory(device, image_memory, nullptr); vkDestroyImage(device, image, nullptr); vkDestroyCommandPool(device, graphics_command_pool, nullptr); vkDestroyCommandPool(device, transfer_command_pool, nullptr); vkDestroySemaphore(device, semaphore, nullptr); } ```
artem-lunarg commented 9 months ago

@glebov-andrey Synchronization validation does not support timeline semaphores yet. Synchronization validation also does not know about special rules when a barrier also defines queue family transfer operation. This is something I'm planning to work on. Not immediately, but also not a long term.

Thanks for the high quality bug report, that's very helpful.

artem-lunarg commented 9 months ago

Added repro case: https://github.com/artem-lunarg/Vulkan-ValidationLayers/commit/449526917a7f8a25d42471a5f3081c47c207c383

It turns out this issue is reproducible with binaries semaphores too. Queue family transition part can be fixed earlier since we don't need timeline semaphores support.

glebov-andrey commented 9 months ago

Oh, I see - I had noticed the bit about timeline semaphores not being supported in the docs but assumed that must be out-of-date after the whole saga with #4968. Thanks for looking into this anyway!

So I take it I was correct in assuming that ALL_COMMANDS in the semaphore signal/wait submission should be enough for any layout transitions or queue family release/acquire operations with NONE in their stage masks?

I've also checked with a binary semaphore, and the same kind of error occurs when there's no queue family transfer:

Validation Error: [ SYNC-HAZARD-WRITE-AFTER-WRITE ] Object 0: handle = 0x1c1f
f0fff80, name = Graphics queue, type = VK_OBJECT_TYPE_QUEUE; | MessageID = 0x5c0ec5d6 | vkQueueSubmit2():  Hazard WRITE_
AFTER_WRITE for entry 0, VkCommandBuffer 0x1c18d193b10[], Recorded access info (recorded_usage: SYNC_IMAGE_LAYOUT_TRANSI
TION, command: vkCmdPipelineBarrier2, seq_no: 1, reset_no: 1). Access info (prior_usage: SYNC_IMAGE_LAYOUT_TRANSITION, w
rite_barriers: SYNC_DRAW_INDIRECT_INDIRECT_COMMAND_READ|SYNC_DRAW_INDIRECT_TRANSFORM_FEEDBACK_COUNTER_READ_EXT|SYNC_VERT
EX_SHADER_ACCELERATION_STRUCTURE_READ|SYNC_VERTEX_SHADER_DESCRIPTOR_BUFFER_READ_EXT|SYNC_VERTEX_SHADER_SHADER_BINDING_TA
BLE_READ|SYNC_VERTEX_SHADER_SHADER_SAMPLED_READ|SYNC_VERTEX_SHADER_SHADER_STORAGE_READ|SYNC_VERTEX_SHADER_SHADER_STORAGE
_WRITE|SYNC_VERTEX_SHADER_UNIFORM_READ|SYNC_TESSELLATION_CONTROL_SHADER_ACCELERATION_STRUCTURE_READ|SYNC_TESSELLATION_CO
NTROL_SHADER_DESCRIPTOR_BUFFER_READ_EXT|SYNC_TESSELLATION_CONTROL_SHADER_SHADER_BINDING_TABLE_READ|SYNC_TESSELLATION_CON
TROL_SHADER_SHADER_SAMPLED_READ|SYNC_TESSELLATION_CONTROL_SHADER_SHADER_STORAGE_READ|SYNC_TESSELLATION_CONTROL_SHADER_SH
ADER_STORAGE_WRITE|SYNC_TESSELLATION_CONTROL_SHADER_UNIFORM_READ|SYNC_TESSELLATION_EVALUATION_SHADER_ACCELERATION_STRUCT
URE_READ|SYNC_TESSELLATION_EVALUATION_SHADER_DESCRIPTOR_BUFFER_READ_EXT|SYNC_TESSELLATION_EVALUATION_SHADER_SHADER_BINDI
NG_TABLE_READ|SYNC_TESSELLATION_EVALUATION_SHADER_SHADER_SAMPLED_READ|SYNC_TESSELLATION_EVALUATION_SHADER_SHADER_STORAGE
_READ|SYNC_TESSELLATION_EVALUATION_SHADER_SHADER_STORAGE_WRITE|SYNC_TESSELLATION_EVALUATION_SHADER_UNIFORM_READ|SYNC_GEO
METRY_SHADER_ACCELERATION_STRUCTURE_READ|SYNC_GEOMETRY_SHADER_DESCRIPTOR_BUFFER_READ_EXT|SYNC_GEOMETRY_SHADER_SHADER_BIN
DING_TABLE_READ|SYNC_GEOMETRY_SHADER_SHADER_SAMPLED_READ|SYNC_GEOMETRY_SHADER_SHADER_STORAGE_READ|SYNC_GEOMETRY_SHADER_S
HADER_STORAGE_WRITE|SYNC_GEOMETRY_SHADER_UNIFORM_READ|SYNC_FRAGMENT_SHADER_ACCELERATION_STRUCTURE_READ|SYNC_FRAGMENT_SHA
DER_COLOR_ATTACHMENT_READ|SYNC_FRAGMENT_SHADER_DEPTH_STENCIL_ATTACHMENT_READ|SYNC_FRAGMENT_SHADER_DESCRIPTOR_BUFFER_READ
_EXT|SYNC_FRAGMENT_SHADER_INPUT_ATTACHMENT_READ|SYNC_FRAGMENT_SHADER_SHADER_BINDING_TABLE_READ|SYNC_FRAGMENT_SHADER_SHAD
ER_SAMPLED_READ|SYNC_FRAGMENT_SHADER_SHADER_STORAGE_READ|SYNC_FRAGMENT_SHADER_SHADER_STORAGE_WRITE|SYNC_FRAGMENT_SHADER_
UNIFORM_READ|SYNC_EARLY_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_READ|SYNC_EARLY_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_
WRITE|SYNC_LATE_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_READ|SYNC_LATE_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_WRITE|SYN
C_COLOR_ATTACHMENT_OUTPUT_COLOR_ATTACHMENT_READ|SYNC_COLOR_ATTACHMENT_OUTPUT_COLOR_ATTACHMENT_READ_NONCOHERENT_EXT|SYNC_
COLOR_ATTACHMENT_OUTPUT_COLOR_ATTACHMENT_WRITE|SYNC_COMPUTE_SHADER_ACCELERATION_STRUCTURE_READ|SYNC_COMPUTE_SHADER_DESCR
IPTOR_BUFFER_READ_EXT|SYNC_COMPUTE_SHADER_SHADER_BINDING_TABLE_READ|SYNC_COMPUTE_SHADER_SHADER_SAMPLED_READ|SYNC_COMPUTE
_SHADER_SHADER_STORAGE_READ|SYNC_COMPUTE_SHADER_SHADER_STORAGE_WRITE|SYNC_COMPUTE_SHADER_UNIFORM_READ|SYNC_HOST_HOST_REA
D|SYNC_HOST_HOST_WRITE|SYNC_COMMAND_PREPROCESS_NV_COMMAND_PREPROCESS_READ_NV|SYNC_COMMAND_PREPROCESS_NV_COMMAND_PREPROCE
SS_WRITE_NV|SYNC_CONDITIONAL_RENDERING_EXT_CONDITIONAL_RENDERING_READ_EXT|SYNC_TASK_SHADER_EXT_ACCELERATION_STRUCTURE_RE
AD|SYNC_TASK_SHADER_EXT_DESCRIPTOR_BUFFER_READ_EXT|SYNC_TASK_SHADER_EXT_SHADER_BINDING_TABLE_READ|SYNC_TASK_SHADER_EXT_S
HADER_SAMPLED_READ|SYNC_TASK_SHADER_EXT_SHADER_STORAGE_READ|SYNC_TASK_SHADER_EXT_SHADER_STORAGE_WRITE|SYNC_TASK_SHADER_E
XT_UNIFORM_READ|SYNC_MESH_SHADER_EXT_ACCELERATION_STRUCTURE_READ|SYNC_MESH_SHADER_EXT_DESCRIPTOR_BUFFER_READ_EXT|SYNC_ME
SH_SHADER_EXT_SHADER_BINDING_TABLE_READ|SYNC_MESH_SHADER_EXT_SHADER_SAMPLED_READ|SYNC_MESH_SHADER_EXT_SHADER_STORAGE_REA
D|SYNC_MESH_SHADER_EXT_SHADER_STORAGE_WRITE|SYNC_MESH_SHADER_EXT_UNIFORM_READ|SYNC_RAY_TRACING_SHADER_ACCELERATION_STRUC
TURE_READ|SYNC_RAY_TRACING_SHADER_DESCRIPTOR_BUFFER_READ_EXT|SYNC_RAY_TRACING_SHADER_SHADER_BINDING_TABLE_READ|SYNC_RAY_
TRACING_SHADER_SHADER_SAMPLED_READ|SYNC_RAY_TRACING_SHADER_SHADER_STORAGE_READ|SYNC_RAY_TRACING_SHADER_SHADER_STORAGE_WR
ITE|SYNC_RAY_TRACING_SHADER_UNIFORM_READ|SYNC_FRAGMENT_SHADING_RATE_ATTACHMENT_FRAGMENT_SHADING_RATE_ATTACHMENT_READ|SYN
C_FRAGMENT_DENSITY_PROCESS_EXT_FRAGMENT_DENSITY_MAP_READ_EXT|SYNC_TRANSFORM_FEEDBACK_EXT_TRANSFORM_FEEDBACK_COUNTER_READ
_EXT|SYNC_TRANSFORM_FEEDBACK_EXT_TRANSFORM_FEEDBACK_COUNTER_WRITE_EXT|SYNC_TRANSFORM_FEEDBACK_EXT_TRANSFORM_FEEDBACK_WRI
TE_EXT|SYNC_ACCELERATION_STRUCTURE_BUILD_ACCELERATION_STRUCTURE_READ|SYNC_ACCELERATION_STRUCTURE_BUILD_ACCELERATION_STRU
CTURE_WRITE|SYNC_ACCELERATION_STRUCTURE_BUILD_INDIRECT_COMMAND_READ|SYNC_ACCELERATION_STRUCTURE_BUILD_MICROMAP_READ_EXT|
SYNC_ACCELERATION_STRUCTURE_BUILD_SHADER_STORAGE_READ|SYNC_ACCELERATION_STRUCTURE_BUILD_TRANSFER_READ|SYNC_ACCELERATION_
STRUCTURE_BUILD_TRANSFER_WRITE|SYNC_ACCELERATION_STRUCTURE_COPY_ACCELERATION_STRUCTURE_READ|SYNC_ACCELERATION_STRUCTURE_
COPY_ACCELERATION_STRUCTURE_WRITE|SYNC_ACCELERATION_STRUCTURE_COPY_TRANSFER_READ|SYNC_ACCELERATION_STRUCTURE_COPY_TRANSF
ER_WRITE|SYNC_MICROMAP_BUILD_EXT_MICROMAP_READ_EXT|SYNC_MICROMAP_BUILD_EXT_MICROMAP_WRITE_EXT|SYNC_MICROMAP_BUILD_EXT_TR
ANSFER_READ|SYNC_MICROMAP_BUILD_EXT_TRANSFER_WRITE|SYNC_COPY_TRANSFER_READ|SYNC_COPY_TRANSFER_WRITE|SYNC_RESOLVE_TRANSFE
R_READ|SYNC_RESOLVE_TRANSFER_WRITE|SYNC_BLIT_TRANSFER_READ|SYNC_BLIT_TRANSFER_WRITE|SYNC_CLEAR_TRANSFER_WRITE|SYNC_INDEX
_INPUT_INDEX_READ|SYNC_VERTEX_ATTRIBUTE_INPUT_VERTEX_ATTRIBUTE_READ|SYNC_SUBPASS_SHADER_HUAWEI_ACCELERATION_STRUCTURE_RE
AD|SYNC_SUBPASS_SHADER_HUAWEI_DESCRIPTOR_BUFFER_READ_EXT|SYNC_SUBPASS_SHADER_HUAWEI_INPUT_ATTACHMENT_READ|SYNC_SUBPASS_S
HADER_HUAWEI_SHADER_BINDING_TABLE_READ|SYNC_SUBPASS_SHADER_HUAWEI_SHADER_SAMPLED_READ|SYNC_SUBPASS_SHADER_HUAWEI_SHADER_
STORAGE_READ|SYNC_SUBPASS_SHADER_HUAWEI_SHADER_STORAGE_WRITE|SYNC_SUBPASS_SHADER_HUAWEI_UNIFORM_READ|SYNC_INVOCATION_MAS
K_HUAWEI_INVOCATION_MASK_READ_HUAWEI|SYNC_CLUSTER_CULLING_SHADER_HUAWEI_ACCELERATION_STRUCTURE_READ|SYNC_CLUSTER_CULLING
_SHADER_HUAWEI_DESCRIPTOR_BUFFER_READ_EXT|SYNC_CLUSTER_CULLING_SHADER_HUAWEI_SHADER_BINDING_TABLE_READ|SYNC_CLUSTER_CULL
ING_SHADER_HUAWEI_SHADER_SAMPLED_READ|SYNC_CLUSTER_CULLING_SHADER_HUAWEI_SHADER_STORAGE_READ|SYNC_CLUSTER_CULLING_SHADER
_HUAWEI_SHADER_STORAGE_WRITE|SYNC_CLUSTER_CULLING_SHADER_HUAWEI_UNIFORM_READ, ).

artem-lunarg commented 9 months ago

So I take it I was correct in assuming that ALL_COMMANDS in the semaphore signal/wait submission should be enough for any layout transitions or queue family release/acquire operations with NONE in their stage masks?

That was my conclusion too. Semaphore creates necessary execution dependency required by release/acquire parts. One reason you might want to use stage masks in release/acquire is to synchronize with other commands in the same command buffer (or the same queue), and that's not needed for this example.

devshgraphicsprogramming commented 6 months ago

Is concurrent sharing of the resource a solution to still getting useful syncval and no false errors then?

artem-lunarg commented 6 months ago

@devshgraphicsprogramming yes, if that's an option, to avoid ownership transfer (potentially can have negative perf impact, but context here is only validation).

Also for the apps that have an option to switch between timeline and binary semaphores, using binary semaphores when syncval is enabled is a workaround until timeline is implemented, but if not, then timeline support work is starting really soon. Please stay tuned.

artem-lunarg commented 1 day ago

@glebov-andrey not directly related to this issue but FYI, since I remember you had pretty advanced synchronization scheme.

The latest VVL code (after https://github.com/KhronosGroup/Vulkan-ValidationLayers/pull/8519 and https://github.com/KhronosGroup/Vulkan-ValidationLayers/pull/8546) has major update of timeline semaphore tracking in the Core validation and handles new use cases. In practice it means that in-use resource tracking is more accurate. If there are any issues with this new implementation please let us now.

Timeline semaphores support for Synchronization validation is not merged yet but there is a branch https://github.com/artem-lunarg/Vulkan-ValidationLayers/tree/artem-timelines-wip12 with all planned functionality implemented, in case you want to have an early preview.

Queue ownership transfer for syncval is still not supported (hopefully after timeline semaphores), so related issues are still there.

glebov-andrey commented 8 hours ago

@artem-lunarg Thanks for the heads up, this is great news! I'll check out the changes soon, and share some feedback if anything comes up.

artem-lunarg commented 5 hours ago

Thanks, for syncval timelines we now have a draft PR which will be updated with new fixes: https://github.com/KhronosGroup/Vulkan-ValidationLayers/pull/8559