KhronosGroup / Vulkan-Docs

The Vulkan API Specification and related tools
Other
2.83k stars 471 forks source link

[wiki]Extraneous stage and access flags in depth clear\reuse example #2055

Open krOoze opened 1 year ago

krOoze commented 1 year ago

There are extraneous stage and access bits in the following depth clear\reuse example

https://github.com/KhronosGroup/Vulkan-Docs/wiki/Synchronization-Examples#first-render-pass-writes-to-a-depth-attachment-second-render-pass-re-uses-the-same-depth-attachment

VkSubpassDependency dependency = {
   .srcSubpass = VK_SUBPASS_EXTERNAL,
   .dstSubpass = 0,
   .srcStageMask = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT |
                   VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT, // Both stages might have access the depth-buffer, so need both in src/dstStageMask
   .dstStageMask = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT |
                   VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,
   .srcAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
   .dstAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT | VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT
   .dependencyFlags = 0}; 

Load op VK_ATTACHMENT_LOAD_OP_CLEAR for depth are performed in VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT with VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT. Therefore dst of VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT and VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT should be redundant.

HansKristian-Work commented 1 year ago

Seems like the spec changed at some point here, so to synchronize the loadOp = CLEAR it makes sense to only have EARLY_FRAGMENT_TESTS. But fragments later in the render pass might end up running in EARLY or LATE and it may read or write, so I think we technically need to be conservative here.

krOoze commented 1 year ago

@HansKristian-Work Not sure that is healthy to be "conservative" with something that is supposed to be formally described exactly. If that is so, there should be a red triangle with three exclamation marks with a note explaining why it is done in opposition of what the spec implies.

Even so, the CLEAR modifies the memory, therefore there is no producer-consumer relationship between the previous work and the "fragments later in the render pass", therefore the extraneous access mask would not even make much sense semantically. All needs to depend directly on the CLEAR load-op and nothing else.

TomOlson commented 1 year ago

The WG discussed on 2023-02-22. @krOoze's analysis is correct, but we're going to leave the example code as is. We expect that many devs will copy-paste the code into an app and then replace/follow the CLEAR with other commands, which is the case @HansKristian-Work was thinking about. We want the example code to be robust to that.

krOoze commented 1 year ago

@TomOlson Following the CLEAR with other commands does not change the situation at all, as elaborated in the previous comment.

Replacing the CLEAR does not change the pipeline stage. Load op is always (only) VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT, and store op is always (only) VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT. The comment in the code is a lie.

Only time this might make sense is dependencies between subpasses.

Tobski commented 1 year ago

Ok, so this section of the spec got tightened up a lot since the examples were written, and I'm inclined to agree with @krOoze here. The only time being conservative would make sense is if you used the NONE ops (i.e. there is no load/store), but those are primarily intended for cases when you aren't going to access the data, so you wouldn't need a barrier 99% of the time anyway.

So I am going to update the example just so it's clear that yea, you can rely on the subset of pipeline stages.

Tobski commented 1 year ago

Fixed per above. Assuming that satisfies you @krOoze I'm closing this issue, feel free to reopen if there's any further issues...

krOoze commented 1 year ago

@Tobski Well, the access masks are still extraneous. If we really want it to be robust against load\store op change, I would prefer this nuance to be explicitly stated in the code, such as:


VkAccessFlags dstAccessFlags;
if( depthFramebufferAttachment.loadOp == VK_ATTACHMENT_LOAD_OP_LOAD )
    dstAccessFlags = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT;
else if( depthFramebufferAttachment.loadOp == VK_ATTACHMENT_LOAD_OP_CLEAR
      || depthFramebufferAttachment.loadOp == VK_ATTACHMENT_LOAD_OP_DONT_CARE )
    dstAccessFlags = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT;
else panic();
Tobski commented 1 year ago

We're currently discussing whether write flags are required at all in destination access flags because it seems there's been some confusion; until that's resolved I don't want to touch this, because the resolution there will affect the point you're making, and whether it needs addressing at all; but yea it probably should be a little more explicit about handling the different load ops.