LunarG / VulkanTools

Tools to aid in Vulkan development
Other
662 stars 174 forks source link

`VK_LAYER_LUNARG_api_dump` report parameter values after invoking next layers in stack. #2063

Closed jeroenbakker-atmind closed 3 weeks ago

jeroenbakker-atmind commented 1 month ago

Original issue: https://github.com/KhronosGroup/Vulkan-Tools/issues/1009

For debugging synchronization issues I enable via vkconfig both VK_LAYER_LUNARG_api_dump and VK_LAYER_KHRONOS_validation with the next configuration.

image

Notice I enabled the Output to File->Log Flush After Write and Debug Action->Break options as I want to directly take action and see check stack frames. I don't write to file, but directly to the console.

The issue is that VK_LAYER_LUNARG_api_dump doesn't report the parameter values before the synchronization log even when the layer is "closest to the vulkan application".

vkCmdDrawIndexedIndirect(commandBuffer, buffer, offset, drawCount, stride) returns void:
    commandBuffer:                  VkCommandBuffer = 0x7ff,f76,c23,ac0
    buffer:                         VkBuffer = 0xe,2c7,f10,000,000,76b [DrawCommandBuf_826]
    offset:                         VkDeviceSize = 32
    drawCount:                      uint32_t = 1
    stride:                         uint32_t = 0

vkCmdBeginDebugUtilsLabelEXT(commandBuffer, pLabelInfo) returns void:
    commandBuffer:                  VkCommandBuffer = 0x7ff,f76,c23,ac0
    pLabelInfo:                     const VkDebugUtilsLabelEXT* = 0x7ff,fff,ffd,420:
        sType:                          VkStructureType = VK_STRUCTURE_TYPE_DEBUG_UTILS_LABEL_EXT (1000128002)
        pNext:                          const void* = NULL
        pLabelName:                     const char* = "StencilClassify"
        color:                          float[4] = 0x7ff,fff,ffd,438
            color[0]:                       float = 0
            color[1]:                       float = 0
            color[2]:                       float = 0
            color[3]:                       float = 0

vkCmdEndRenderingKHR(commandBuffer) returns void:
    commandBuffer:                  VkCommandBuffer = 0x7ff,f76,c23,ac0

vkCmdPipelineBarrier(commandBuffer, srcStageMask, dstStageMask, dependencyFlags, memoryBarrierCount, pMemoryBarriers, bufferMemoryBarrierCount, pBufferMemoryBarriers, imageMemoryBarrierCount, pImageMemoryBarriers) returns voidSYNC-HAZARD-WRITE-AFTER-WRITE(ERROR / SPEC): msgNum: 1544472022 - Validation Error: [ SYNC-HAZARD-WRITE-AFTER-WRITE ] Object 0: handle = 0xdaf60800000006c8, name = GBufferNormal_78, type = VK_OBJECT_TYPE_IMAGE; | MessageID = 0x5c0ec5d6 | vkCmdPipelineBarrier():  Hazard WRITE_AFTER_WRITE for image barrier 0 VkImage 0xdaf60800000006c8[GBufferNormal_78]. Access info (usage: SYNC_IMAGE_LAYOUT_TRANSITION, prior_usage: SYNC_FRAGMENT_SHADER_SHADER_STORAGE_WRITE, write_barriers: 0, command: vkCmdDrawIndexedIndirect, seq_no: 656, reset_no: 1, debug_region: EEVEE::negZ_view::Deferred.Opaque::Shading::DoubleSided::MADefault Surface::MADefault Surface).
    Objects: 1
        [0] 0xdaf60800000006c8, type: 10, name: GBufferNormal_78

I expected to see the parameters, before the synchronization issue and break signal so there is more information to check. Currently as the debugger is open I need to go back and forth to the stack frame that contains the information.

Expected log:

vkCmdPipelineBarrier(commandBuffer, srcStageMask, dstStageMask, dependencyFlags, memoryBarrierCount, pMemoryBarriers, bufferMemoryBarrierCount, pBufferMemoryBarriers, imageMemoryBarrierCount, pImageMemoryBarriers) returns void:
    commandBuffer:                  VkCommandBuffer = 0x7ff,f76,c23,ac0
    srcStageMask:                   VkPipelineStageFlags = 8192 (VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT)
    dstStageMask:                   VkPipelineStageFlags = 1 (VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT)
    dependencyFlags:                VkDependencyFlags = 1 (VK_DEPENDENCY_BY_REGION_BIT)
    memoryBarrierCount:             uint32_t = 0
    pMemoryBarriers:                const VkMemoryBarrier* = NULL
    bufferMemoryBarrierCount:       uint32_t = 0
    pBufferMemoryBarriers:          const VkBufferMemoryBarrier* = 0x7ff,f76,ac2,818
    imageMemoryBarrierCount:        uint32_t = 2
    pImageMemoryBarriers:           const VkImageMemoryBarrier* = 0x7ff,f76,d27,a18
        pImageMemoryBarriers[0]:        const VkImageMemoryBarrier = 0x7ff,f76,d27,a18:
            sType:                          VkStructureType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER (45)
            pNext:                          const void* = NULL
            srcAccessMask:                  VkAccessFlags = 480 (VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT | VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT)
            dstAccessMask:                  VkAccessFlags = 480 (VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT | VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT)
            oldLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_GENERAL (1)
            newLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL (2)
            srcQueueFamilyIndex:            uint32_t = 4294967295
            dstQueueFamilyIndex:            uint32_t = 4294967295
            image:                          VkImage = 0xd,af6,080,000,000,6c8 [GBufferNormal_78]
            subresourceRange:               VkImageSubresourceRange = 0x7ff,f76,d27,a48:
                aspectMask:                     VkImageAspectFlags = 1 (VK_IMAGE_ASPECT_COLOR_BIT)
                baseMipLevel:                   uint32_t = 0
                levelCount:                     uint32_t = 4294967295
                baseArrayLayer:                 uint32_t = 1
                layerCount:                     uint32_t = 1
        pImageMemoryBarriers[1]:        const VkImageMemoryBarrier = 0x7ff,f76,d27,a60:
            sType:                          VkStructureType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER (45)
            pNext:                          const void* = NULL
            srcAccessMask:                  VkAccessFlags = 480 (VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT | VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT)
            dstAccessMask:                  VkAccessFlags = 480 (VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT | VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT)
            oldLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_GENERAL (1)
            newLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL (2)
            srcQueueFamilyIndex:            uint32_t = 4294967295
            dstQueueFamilyIndex:            uint32_t = 4294967295
            image:                          VkImage = 0x8,221,6f0,000,000,6c7 [GBufferClosure_77]
            subresourceRange:               VkImageSubresourceRange = 0x7ff,f76,d27,a90:
                aspectMask:                     VkImageAspectFlags = 1 (VK_IMAGE_ASPECT_COLOR_BIT)
                baseMipLevel:                   uint32_t = 0
                levelCount:                     uint32_t = 4294967295
                baseArrayLayer:                 uint32_t = 2
                layerCount:                     uint32_t = 1

SYNC-HAZARD-WRITE-AFTER-WRITE(ERROR / SPEC): msgNum: 1544472022 - Validation Error: [ SYNC-HAZARD-WRITE-AFTER-WRITE ] Object 0: handle = 0xdaf60800000006c8, name = GBufferNormal_78, type = VK_OBJECT_TYPE_IMAGE; | MessageID = 0x5c0ec5d6 | vkCmdPipelineBarrier():  Hazard WRITE_AFTER_WRITE for image barrier 0 VkImage 0xdaf60800000006c8[GBufferNormal_78]. Access info (usage: SYNC_IMAGE_LAYOUT_TRANSITION, prior_usage: SYNC_FRAGMENT_SHADER_SHADER_STORAGE_WRITE, write_barriers: 0, command: vkCmdDrawIndexedIndirect, seq_no: 656, reset_no: 1, debug_region: EEVEE::negZ_view::Deferred.Opaque::Shading::DoubleSided::MADefault Surface::MADefault Surface).
    Objects: 1
        [0] 0xdaf60800000006c8, type: 10, name: GBufferNormal_78
jeroenbakker-atmind commented 1 month ago

Response from @charles-lunarg

A couple of things:

1. The repo Api Dump lives in is `LunarG/VulkanTools`, not `KhronosGroup/Vulkan-Tools`. Extremely confusing and easy to get wrong.

2. Api dump doesn't print the parameters before calling down because it has no effective way to know which parameters are "input" and which are "output". Not to mention that the actual order of the parameters is mixed together. The general logic of API dump is "print function header, call down, print parameter values"

3. A couple of solutions

* Add an option to print parameters before calling down. Will print out parameters wrong (ie before they are filled out). And if its an option, it has to be enabled which isn't very helpful if you don't know about the option

* Add a crash handler that writes out the parameters in case of a crash? Maybe API Dump would store the data of the last called command and could read that upon a crash?

* Actually figure out a way to distinguish input and output parameters from vk.xml. That could make API dump a bit smarter and print commands with only "input". It wouldn't do anything for the return value but that would help in this exact case.
jeroenbakker-atmind commented 1 month ago

Agree option 1 is confusing. A crash handler or extracting from vk.xml can work.

I think the extracting from vk.xml would be the best as a break signal can be continued/ignored. So splicing it into input/output parameters makes most sense. But it would change the print order in the log. I can imagine that some tools use the api dump for unit tests validation or any other reason. We don't as we have our own wrapper for it.

charles-lunarg commented 1 month ago

Wait wait wait. returnedonly="true" is an attribute of vk.xml in the structs & unions. From https://github.com/KhronosGroup/Vulkan-Docs/blob/043260d06e96c8586492be41cc109945e92a69ff/registry.adoc#attributes-of-tagtype-tags

returnedonly - only applicable if "category" is struct or union. Notes that this struct/union is going to be filled in by the API, rather than an application filling it out and passing it to the API.

When was this added ????? TWENTY TWENTY????

Well that makes this a LOT easier to implement.

charles-lunarg commented 1 month ago

Okay in my exitement over the vk.xml having the necessary attributes, it wasn't clear that this isn't as simple as I thought. The spec only declares the structs that are "only ever used as outputs" to be returnedOnly="true", which means structs that have dual purpose (like VkPhysicalDeviceFeatures2) to lack the returnedOnly="true".

And since its per-struct and not per-command, the code generator will need to make sure to recursively check structs for their read-only-ness, rather than just checking the top level members of a command.

charles-lunarg commented 1 month ago

A bit of history: The initial version of api dump always printed the parameters after calling down (and coming back up) the chain. This resulted in the 'last command' being lost when a crash occurs. The decision was made to split the printing such that the function header was printed before calling down the chain, so users could at least see the 'last' command called, rather than be shown a command which did succeed but isn't where the crash occurred. This worked by printing the function name, the names of the input parameters (not their values), and the type of the return value. Then after calling down the chain, the return value would be printed if it existed, along with the parameters passed in.

So the main issue I have is that the current format is that only functions which return void can print their parameters before calling down the chain. In the case of vkCmdPipelineBarrier, that is the case, but I'm sure there are plenty of other functions which could have their parameters printed, just not their return values.

To fix the above issue, I believe the output format would need to be altered to put the return value after the parameters. Stashing the parameters so that the current print order would be maintained is no better since its still printing 'after' calling down the chain, where a crash could occur.

Or just accept that some functions which could print parameters before calling down wont and thats that.