Open glebov-andrey opened 9 months ago
@glebov-andrey Synchronization validation does not support timeline semaphores yet. Synchronization validation also does not know about special rules when a barrier also defines queue family transfer operation. This is something I'm planning to work on. Not immediately, but also not a long term.
Thanks for the high quality bug report, that's very helpful.
Added repro case: https://github.com/artem-lunarg/Vulkan-ValidationLayers/commit/449526917a7f8a25d42471a5f3081c47c207c383
It turns out this issue is reproducible with binaries semaphores too. Queue family transition part can be fixed earlier since we don't need timeline semaphores support.
Oh, I see - I had noticed the bit about timeline semaphores not being supported in the docs but assumed that must be out-of-date after the whole saga with #4968. Thanks for looking into this anyway!
So I take it I was correct in assuming that ALL_COMMANDS
in the semaphore signal/wait submission should be enough for any layout transitions or queue family release/acquire operations with NONE
in their stage masks?
I've also checked with a binary semaphore, and the same kind of error occurs when there's no queue family transfer:
EXCLUSIVE
with CONCURRENT
, add queueFamilyIndices
to ImageCreateInfo
VK_QUEUE_FAMILY_IGNORED
in the acquire barrier
Validation Error: [ SYNC-HAZARD-WRITE-AFTER-WRITE ] Object 0: handle = 0x1c1f
f0fff80, name = Graphics queue, type = VK_OBJECT_TYPE_QUEUE; | MessageID = 0x5c0ec5d6 | vkQueueSubmit2(): Hazard WRITE_
AFTER_WRITE for entry 0, VkCommandBuffer 0x1c18d193b10[], Recorded access info (recorded_usage: SYNC_IMAGE_LAYOUT_TRANSI
TION, command: vkCmdPipelineBarrier2, seq_no: 1, reset_no: 1). Access info (prior_usage: SYNC_IMAGE_LAYOUT_TRANSITION, w
rite_barriers: SYNC_DRAW_INDIRECT_INDIRECT_COMMAND_READ|SYNC_DRAW_INDIRECT_TRANSFORM_FEEDBACK_COUNTER_READ_EXT|SYNC_VERT
EX_SHADER_ACCELERATION_STRUCTURE_READ|SYNC_VERTEX_SHADER_DESCRIPTOR_BUFFER_READ_EXT|SYNC_VERTEX_SHADER_SHADER_BINDING_TA
BLE_READ|SYNC_VERTEX_SHADER_SHADER_SAMPLED_READ|SYNC_VERTEX_SHADER_SHADER_STORAGE_READ|SYNC_VERTEX_SHADER_SHADER_STORAGE
_WRITE|SYNC_VERTEX_SHADER_UNIFORM_READ|SYNC_TESSELLATION_CONTROL_SHADER_ACCELERATION_STRUCTURE_READ|SYNC_TESSELLATION_CO
NTROL_SHADER_DESCRIPTOR_BUFFER_READ_EXT|SYNC_TESSELLATION_CONTROL_SHADER_SHADER_BINDING_TABLE_READ|SYNC_TESSELLATION_CON
TROL_SHADER_SHADER_SAMPLED_READ|SYNC_TESSELLATION_CONTROL_SHADER_SHADER_STORAGE_READ|SYNC_TESSELLATION_CONTROL_SHADER_SH
ADER_STORAGE_WRITE|SYNC_TESSELLATION_CONTROL_SHADER_UNIFORM_READ|SYNC_TESSELLATION_EVALUATION_SHADER_ACCELERATION_STRUCT
URE_READ|SYNC_TESSELLATION_EVALUATION_SHADER_DESCRIPTOR_BUFFER_READ_EXT|SYNC_TESSELLATION_EVALUATION_SHADER_SHADER_BINDI
NG_TABLE_READ|SYNC_TESSELLATION_EVALUATION_SHADER_SHADER_SAMPLED_READ|SYNC_TESSELLATION_EVALUATION_SHADER_SHADER_STORAGE
_READ|SYNC_TESSELLATION_EVALUATION_SHADER_SHADER_STORAGE_WRITE|SYNC_TESSELLATION_EVALUATION_SHADER_UNIFORM_READ|SYNC_GEO
METRY_SHADER_ACCELERATION_STRUCTURE_READ|SYNC_GEOMETRY_SHADER_DESCRIPTOR_BUFFER_READ_EXT|SYNC_GEOMETRY_SHADER_SHADER_BIN
DING_TABLE_READ|SYNC_GEOMETRY_SHADER_SHADER_SAMPLED_READ|SYNC_GEOMETRY_SHADER_SHADER_STORAGE_READ|SYNC_GEOMETRY_SHADER_S
HADER_STORAGE_WRITE|SYNC_GEOMETRY_SHADER_UNIFORM_READ|SYNC_FRAGMENT_SHADER_ACCELERATION_STRUCTURE_READ|SYNC_FRAGMENT_SHA
DER_COLOR_ATTACHMENT_READ|SYNC_FRAGMENT_SHADER_DEPTH_STENCIL_ATTACHMENT_READ|SYNC_FRAGMENT_SHADER_DESCRIPTOR_BUFFER_READ
_EXT|SYNC_FRAGMENT_SHADER_INPUT_ATTACHMENT_READ|SYNC_FRAGMENT_SHADER_SHADER_BINDING_TABLE_READ|SYNC_FRAGMENT_SHADER_SHAD
ER_SAMPLED_READ|SYNC_FRAGMENT_SHADER_SHADER_STORAGE_READ|SYNC_FRAGMENT_SHADER_SHADER_STORAGE_WRITE|SYNC_FRAGMENT_SHADER_
UNIFORM_READ|SYNC_EARLY_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_READ|SYNC_EARLY_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_
WRITE|SYNC_LATE_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_READ|SYNC_LATE_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_WRITE|SYN
C_COLOR_ATTACHMENT_OUTPUT_COLOR_ATTACHMENT_READ|SYNC_COLOR_ATTACHMENT_OUTPUT_COLOR_ATTACHMENT_READ_NONCOHERENT_EXT|SYNC_
COLOR_ATTACHMENT_OUTPUT_COLOR_ATTACHMENT_WRITE|SYNC_COMPUTE_SHADER_ACCELERATION_STRUCTURE_READ|SYNC_COMPUTE_SHADER_DESCR
IPTOR_BUFFER_READ_EXT|SYNC_COMPUTE_SHADER_SHADER_BINDING_TABLE_READ|SYNC_COMPUTE_SHADER_SHADER_SAMPLED_READ|SYNC_COMPUTE
_SHADER_SHADER_STORAGE_READ|SYNC_COMPUTE_SHADER_SHADER_STORAGE_WRITE|SYNC_COMPUTE_SHADER_UNIFORM_READ|SYNC_HOST_HOST_REA
D|SYNC_HOST_HOST_WRITE|SYNC_COMMAND_PREPROCESS_NV_COMMAND_PREPROCESS_READ_NV|SYNC_COMMAND_PREPROCESS_NV_COMMAND_PREPROCE
SS_WRITE_NV|SYNC_CONDITIONAL_RENDERING_EXT_CONDITIONAL_RENDERING_READ_EXT|SYNC_TASK_SHADER_EXT_ACCELERATION_STRUCTURE_RE
AD|SYNC_TASK_SHADER_EXT_DESCRIPTOR_BUFFER_READ_EXT|SYNC_TASK_SHADER_EXT_SHADER_BINDING_TABLE_READ|SYNC_TASK_SHADER_EXT_S
HADER_SAMPLED_READ|SYNC_TASK_SHADER_EXT_SHADER_STORAGE_READ|SYNC_TASK_SHADER_EXT_SHADER_STORAGE_WRITE|SYNC_TASK_SHADER_E
XT_UNIFORM_READ|SYNC_MESH_SHADER_EXT_ACCELERATION_STRUCTURE_READ|SYNC_MESH_SHADER_EXT_DESCRIPTOR_BUFFER_READ_EXT|SYNC_ME
SH_SHADER_EXT_SHADER_BINDING_TABLE_READ|SYNC_MESH_SHADER_EXT_SHADER_SAMPLED_READ|SYNC_MESH_SHADER_EXT_SHADER_STORAGE_REA
D|SYNC_MESH_SHADER_EXT_SHADER_STORAGE_WRITE|SYNC_MESH_SHADER_EXT_UNIFORM_READ|SYNC_RAY_TRACING_SHADER_ACCELERATION_STRUC
TURE_READ|SYNC_RAY_TRACING_SHADER_DESCRIPTOR_BUFFER_READ_EXT|SYNC_RAY_TRACING_SHADER_SHADER_BINDING_TABLE_READ|SYNC_RAY_
TRACING_SHADER_SHADER_SAMPLED_READ|SYNC_RAY_TRACING_SHADER_SHADER_STORAGE_READ|SYNC_RAY_TRACING_SHADER_SHADER_STORAGE_WR
ITE|SYNC_RAY_TRACING_SHADER_UNIFORM_READ|SYNC_FRAGMENT_SHADING_RATE_ATTACHMENT_FRAGMENT_SHADING_RATE_ATTACHMENT_READ|SYN
C_FRAGMENT_DENSITY_PROCESS_EXT_FRAGMENT_DENSITY_MAP_READ_EXT|SYNC_TRANSFORM_FEEDBACK_EXT_TRANSFORM_FEEDBACK_COUNTER_READ
_EXT|SYNC_TRANSFORM_FEEDBACK_EXT_TRANSFORM_FEEDBACK_COUNTER_WRITE_EXT|SYNC_TRANSFORM_FEEDBACK_EXT_TRANSFORM_FEEDBACK_WRI
TE_EXT|SYNC_ACCELERATION_STRUCTURE_BUILD_ACCELERATION_STRUCTURE_READ|SYNC_ACCELERATION_STRUCTURE_BUILD_ACCELERATION_STRU
CTURE_WRITE|SYNC_ACCELERATION_STRUCTURE_BUILD_INDIRECT_COMMAND_READ|SYNC_ACCELERATION_STRUCTURE_BUILD_MICROMAP_READ_EXT|
SYNC_ACCELERATION_STRUCTURE_BUILD_SHADER_STORAGE_READ|SYNC_ACCELERATION_STRUCTURE_BUILD_TRANSFER_READ|SYNC_ACCELERATION_
STRUCTURE_BUILD_TRANSFER_WRITE|SYNC_ACCELERATION_STRUCTURE_COPY_ACCELERATION_STRUCTURE_READ|SYNC_ACCELERATION_STRUCTURE_
COPY_ACCELERATION_STRUCTURE_WRITE|SYNC_ACCELERATION_STRUCTURE_COPY_TRANSFER_READ|SYNC_ACCELERATION_STRUCTURE_COPY_TRANSF
ER_WRITE|SYNC_MICROMAP_BUILD_EXT_MICROMAP_READ_EXT|SYNC_MICROMAP_BUILD_EXT_MICROMAP_WRITE_EXT|SYNC_MICROMAP_BUILD_EXT_TR
ANSFER_READ|SYNC_MICROMAP_BUILD_EXT_TRANSFER_WRITE|SYNC_COPY_TRANSFER_READ|SYNC_COPY_TRANSFER_WRITE|SYNC_RESOLVE_TRANSFE
R_READ|SYNC_RESOLVE_TRANSFER_WRITE|SYNC_BLIT_TRANSFER_READ|SYNC_BLIT_TRANSFER_WRITE|SYNC_CLEAR_TRANSFER_WRITE|SYNC_INDEX
_INPUT_INDEX_READ|SYNC_VERTEX_ATTRIBUTE_INPUT_VERTEX_ATTRIBUTE_READ|SYNC_SUBPASS_SHADER_HUAWEI_ACCELERATION_STRUCTURE_RE
AD|SYNC_SUBPASS_SHADER_HUAWEI_DESCRIPTOR_BUFFER_READ_EXT|SYNC_SUBPASS_SHADER_HUAWEI_INPUT_ATTACHMENT_READ|SYNC_SUBPASS_S
HADER_HUAWEI_SHADER_BINDING_TABLE_READ|SYNC_SUBPASS_SHADER_HUAWEI_SHADER_SAMPLED_READ|SYNC_SUBPASS_SHADER_HUAWEI_SHADER_
STORAGE_READ|SYNC_SUBPASS_SHADER_HUAWEI_SHADER_STORAGE_WRITE|SYNC_SUBPASS_SHADER_HUAWEI_UNIFORM_READ|SYNC_INVOCATION_MAS
K_HUAWEI_INVOCATION_MASK_READ_HUAWEI|SYNC_CLUSTER_CULLING_SHADER_HUAWEI_ACCELERATION_STRUCTURE_READ|SYNC_CLUSTER_CULLING
_SHADER_HUAWEI_DESCRIPTOR_BUFFER_READ_EXT|SYNC_CLUSTER_CULLING_SHADER_HUAWEI_SHADER_BINDING_TABLE_READ|SYNC_CLUSTER_CULL
ING_SHADER_HUAWEI_SHADER_SAMPLED_READ|SYNC_CLUSTER_CULLING_SHADER_HUAWEI_SHADER_STORAGE_READ|SYNC_CLUSTER_CULLING_SHADER
_HUAWEI_SHADER_STORAGE_WRITE|SYNC_CLUSTER_CULLING_SHADER_HUAWEI_UNIFORM_READ, ).
So I take it I was correct in assuming that ALL_COMMANDS in the semaphore signal/wait submission should be enough for any layout transitions or queue family release/acquire operations with NONE in their stage masks?
That was my conclusion too. Semaphore creates necessary execution dependency required by release/acquire parts. One reason you might want to use stage masks in release/acquire is to synchronize with other commands in the same command buffer (or the same queue), and that's not needed for this example.
Is concurrent sharing of the resource a solution to still getting useful syncval and no false errors then?
@devshgraphicsprogramming yes, if that's an option, to avoid ownership transfer (potentially can have negative perf impact, but context here is only validation).
Also for the apps that have an option to switch between timeline and binary semaphores, using binary semaphores when syncval is enabled is a workaround until timeline is implemented, but if not, then timeline support work is starting really soon. Please stay tuned.
@glebov-andrey not directly related to this issue but FYI, since I remember you had pretty advanced synchronization scheme.
The latest VVL code (after https://github.com/KhronosGroup/Vulkan-ValidationLayers/pull/8519 and https://github.com/KhronosGroup/Vulkan-ValidationLayers/pull/8546) has major update of timeline semaphore tracking in the Core validation and handles new use cases. In practice it means that in-use resource tracking is more accurate. If there are any issues with this new implementation please let us now.
Timeline semaphores support for Synchronization validation is not merged yet but there is a branch https://github.com/artem-lunarg/Vulkan-ValidationLayers/tree/artem-timelines-wip12 with all planned functionality implemented, in case you want to have an early preview.
Queue ownership transfer for syncval is still not supported (hopefully after timeline semaphores), so related issues are still there.
@artem-lunarg Thanks for the heads up, this is great news! I'll check out the changes soon, and share some feedback if anything comes up.
Thanks, for syncval timelines we now have a draft PR which will be updated with new fixes: https://github.com/KhronosGroup/Vulkan-ValidationLayers/pull/8559
Environment:
d74a5aa16aace35decd6fddd804c201229dcc8dd
(currentmain
)VALIDATION_CHECK_ENABLE_SYNCHRONIZATION_VALIDATION_QUEUE_SUBMIT
Describe the Issue
When
VALIDATION_CHECK_ENABLE_SYNCHRONIZATION_VALIDATION_QUEUE_SUBMIT
is enabled, I'm getting validation errors related to image layout transitions which happen during queue family transfers. The general structure is such (see full code below):vkCmdCopyBufferToImage2
.stageMask = VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT
.stageMask = VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT
This results in the following validation error:
Replacing
VK_PIPELINE_STAGE_2_NONE
withVK_PIPELINE_STAGE_2_COPY_BIT
andVK_PIPELINE_STAGE_2_FRAGMENT_SHADER_BIT
in the release and acquire barriers respectively removes the error, which would make sense if the semaphore signal and wait stages weren'tALL_COMMANDS
(assuming my understanding ofALL_COMMANDS
is correct). Now, from what I understand,ALL_COMMANDS
should include any operations which read/write memory, including operations which are not covered by any other stage bits (such as event unsignal operations). So my assumption was that both the queue family release/acquire and image layout transition operations were included inALL_COMMANDS
. Indirectly, VVL seems to agree, since a fullALL_COMMANDS, MEMORY_READ|WRITE
barrier is enough to synchronize two layout transition barriers withNONE
in their masks.If this is the case then the error is a false positive, perhaps due to the way they are submitted twice during queue family transfers. On the other hand, if my understanding of
ALL_COMMANDS
is incorrect then there seems to actually be a false negative instead - the queue family release and acquire operations are unsynchronized too, but aren't reported as such.In either case there is also the matter of this message making it seem as though there are two image layout transitions (although I can totally see how this weird edge case is tricky to correctly report).
Additional context
Full code
```c++ void vvl_repro(const VkPhysicalDevice physical_device, const VkDevice device, const std::uint32_t graphics_queue_family, const std::uint32_t transfer_queue_family, const VkQueue graphics_queue, const VkQueue transfer_queue) { VkSemaphore semaphore = nullptr; const VkSemaphoreTypeCreateInfo semaphore_type_info = {.sType = VK_STRUCTURE_TYPE_SEMAPHORE_TYPE_CREATE_INFO, .pNext = nullptr, .semaphoreType = VK_SEMAPHORE_TYPE_TIMELINE, .initialValue = 0}; const VkSemaphoreCreateInfo semaphore_create_info = {.sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO, .pNext = &semaphore_type_info, .flags = {}}; vkCreateSemaphore(device, &semaphore_create_info, nullptr, &semaphore); const auto allocate_command_pool_and_buffer = [&](const std::uint32_t queue_family_index) -> std::pair