KhronosGroup / Vulkan-ValidationLayers

Vulkan Validation Layers (VVL)
https://vulkan.lunarg.com/doc/sdk/latest/linux/khronos_validation_layer.html
Other
731 stars 397 forks source link

Buffer content validation disturbs push constants set by application #8120

Closed pc-john closed 1 month ago

pc-john commented 1 month ago

Environment:

Short description

My application does not render anything when I enable "Buffer content validation" and "Indirect draws parameters". It seems that push constants are disturbed by VVL and not restored properly after performing validation.

Long description

My application stops rendering when I enable "Buffer content validation" and "Indirect draws parameters". If I enable just ApiDump, the application works correctly and I see following Vulkan calls:

vkCmdBindPipeline(...)
vkCmdPushConstants(...)
vkCmdDrawIndirect(...)

If I enable validation but not "Indirect draws parameters", I see the following:

vkCmdBindPipeline(...)
vkCmdPushConstants(...)
vkAllocateDescriptorSets(...)   // <- probably inserted by VVL
vkUpdateDescriptorSets(...)   // <- probably inserted by VVL
vkCmdBindDescriptorSets(...)   // <- probably inserted by VVL
vkCmdDrawIndirect(...)

My application still works. But if I enable even "Indirect draws parameters", I see the following:

vkCmdBindPipeline(...)
vkCmdPushConstants(...)
vkCreateDescriptorSetLayout(...)   // <- probably inserted by VVL
vkCreatePipelineLayout(...)   // <- probably inserted by VVL
vkCreateShaderModule(...)   // <- probably inserted by VVL
vkCreateGraphicsPipelines(...)   // <- probably inserted by VVL
vkAllocateDescriptorSets(...)   // <- probably inserted by VVL
vkUpdateDescriptorSets(...)   // <- probably inserted by VVL
vkCmdBindPipeline(...)   // <- probably inserted by VVL
vkCmdPushConstants(...)   // <- probably inserted by VVL
vkCmdBindDescriptorSets(...)   // <- probably inserted by VVL
vkCmdBindDescriptorSets(...)   // <- probably inserted by VVL
vkCmdDraw(...)   // <- probably inserted by VVL
vkAllocateDescriptorSets(...)   // <- probably inserted by VVL
vkUpdateDescriptorSets(...)   // <- probably inserted by VVL
vkCmdBindDescriptorSets(...)   // <- probably inserted by VVL
vkCmdBindPipeline(...)   // <- probably inserted by VVL
vkCmdPushConstants(...)   // <- probably inserted by VVL
vkCmdPushConstants(...)   // <- probably inserted by VVL
vkCmdDrawIndirect(...)

The application leaves black screen. It seems to me that VVL performs some magic that involves creation of its own pipeline, binding it and binding new descriptor sets and setting new push constants. After that, VVL renders single triangle by vkCmdDraw(...). In the following code, I expect, it tries to restore original Vulkan state, e.g. it binds again descriptor sets and restores push constants state. As the last point, it finally makes my vkCmdDrawIndirect() call.

Unfortunately, my vkCmdDrawIndirect() call does not render anything. Difficult to debug it because I cannot use printf together with "Indirect draws parameters" validation. vkconfig does not allow such option. Because my shaders do not use any descriptor sets, only buffer_references and push constants, my bet would be that push constants are not restored correctly.

This is how VVL restores push constants before my vkCmdDrawIndirect():

Thread 0, Frame 0:
vkCmdPushConstants(commandBuffer, layout, stageFlags, offset, size, pValues) returns void:
   commandBuffer:                  VkCommandBuffer = 000002AFCE5B8070
   layout:                         VkPipelineLayout = 0000000000000000
   stageFlags:                     VkShaderStageFlags = 1 (VK_SHADER_STAGE_VERTEX_BIT)
   offset:                         uint32_t = 0
   size:                           uint32_t = 16
   pValues:                        const void* = 000005AA547F2D60

Thread 0, Frame 0:
vkCmdPushConstants(commandBuffer, layout, stageFlags, offset, size, pValues) returns void:
   commandBuffer:                  VkCommandBuffer = 000002AFCE5B8070
   layout:                         VkPipelineLayout = 0000000000000000
   stageFlags:                     VkShaderStageFlags = 16 (VK_SHADER_STAGE_FRAGMENT_BIT)
   offset:                         uint32_t = 8
   size:                           uint32_t = 12
   pValues:                        const void* = 000005AA547F2D60

And this is how I set them myself a little earlier in the application:

vkCmdPushConstants(commandBuffer, layout, stageFlags, offset, size, pValues) returns void:
   commandBuffer:                  VkCommandBuffer = 000002AFCE5B8070
   layout:                         VkPipelineLayout = 000002AFCE6A2430
   stageFlags:                     VkShaderStageFlags = 1 (VK_SHADER_STAGE_VERTEX_BIT)
   offset:                         uint32_t = 0
   size:                           uint32_t = 8
   pValues:                        const void* = 000000D39A4FB168

[...]

vkCmdPushConstants(commandBuffer, layout, stageFlags, offset, size, pValues) returns void:
   commandBuffer:                  VkCommandBuffer = 000002AFCE5B8070
   layout:                         VkPipelineLayout = 000002AFCE6A2430
   stageFlags:                     VkShaderStageFlags = 17 (VK_SHADER_STAGE_VERTEX_BIT | VK_SHADER_STAGE_FRAGMENT_BIT)
   offset:                         uint32_t = 8
   size:                           uint32_t = 8
   pValues:                        const void* = 000000D39A4FB170

[...]

vkCmdPushConstants(commandBuffer, layout, stageFlags, offset, size, pValues) returns void:
   commandBuffer:                  VkCommandBuffer = 000002AFCE5B8070
   layout:                         VkPipelineLayout = 000002AFCE6A2430
   stageFlags:                     VkShaderStageFlags = 1 (VK_SHADER_STAGE_VERTEX_BIT)
   offset:                         uint32_t = 0
   size:                           uint32_t = 8
   pValues:                        const void* = 000000D39A4FAA60

When I moved fragment shader values to not overlap with vertex shader values, the situation considerably improved. At least vertex shader seems to get correct push constants while fragment shader does not.

spencer-lunarg commented 1 month ago

It seems to me that VVL performs some magic that involves creation of its own pipeline, binding it and binding new descriptor sets and setting new push constants.

so yes, when we add our instrumented things for GPU-AV we take a descriptor set and store data there, but we have used the push constants to bring data in that we need to check against... I know push constant can be disturbed and hopefully this is just a bug of not resetting the original push constant values

@arno-lunarg a good test would be to use push constants to write data to a storage buffer and insert the GPU-AV before it and then to a ASSERT_EQ we wrote the values out correctly still (these are the harder tests to write :smile: )

arno-lunarg commented 1 month ago

Hello @pc-john ! I have an attempt fix that I merged in main. It would be really nice if you could test it locally, either by building VVL yourself, or taking the .dll that is built as part of Github CI artefacts (I will update this message with a link to the artefacts when upload is done)

EDIT: Windows artefacts: https://github.com/KhronosGroup/Vulkan-ValidationLayers/actions/runs/9551515108/artifacts/1609547959

pc-john commented 1 month ago

Hello @arno-lunarg ! Thanks for the fix! Unfortunately, I am on holidays until 6th of July. I will try to test it then and will be back with the results.

pc-john commented 1 week ago

Hello @arno-lunarg ! I am back from holidays. I tested VVL with your .dll from CI artefact and I can confirm that the problem completely disappeared! Thanks for the fix!

arno-lunarg commented 1 week ago

Nice, thank you for reporting the issue !