vblanco20-1 / vkguide-comments

Storage for the comments of vulkan-guide
0 stars 0 forks source link

Chapter 4 Comments #7

Open utterances-bot opened 3 years ago

utterances-bot commented 3 years ago

Double buffering - Vulkan Guide

Practical guide to vulkan graphics programming

https://www.vkguide.dev/docs/chapter-4/double_buffering/

evyatark2 commented 3 years ago

If I understand this correctly, both frames can read and write to the depth buffer during STAGE_LATE_FRAGMENT_TESTS_BIT and STAGE_EARLY_FRAGMENT_TESTS_BIT. So to make this tutorial correct one of two things needs to happen:

  1. Depth buffer per frame (so FRAME_OVERLAP depth buffers).
  2. Change subpass dependency between VK_SUBPASS_EXTERNAL to subpass #0 to be
    dependency.srcStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    dependency.dstStageMask    = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT | VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    dependency.srcAccessMask   = 0,
    dependency.dstAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,

Please correct me if I am wrong about this.

vblanco20-1 commented 3 years ago

@evyatark2 the frames are synced so it should be fine. The code of the tutorial passes all validation, even syncronization validation, and its how most people do it.

evyatark2 commented 3 years ago

@vblanco20-1 I get this error when turning on synchronization validation in vkconfig:

Validation Error: [ SYNC-HAZARD-WRITE_AFTER_WRITE ] Object 0: handle = 0x5567fef4af60, type = VK_OBJECT_TYPE_RENDER_PASS; | MessageID = 0xfdf9f5e1 | vkCmdBeginRenderPass: Hazard WRITE_AFTER_WRITE vs. layout transition in subpass 0 for attachment 1 aspect depth during load with loadOp VK_ATTACHMENT_LOAD_OP_CLEAR. (Arch Linux, AMD RX 5700X, AMDGPU driver)

The program works fine and I'm sure it will work fine on almost all platforms because as you said, most people do it that way but I still think that it is incorrect to have a single depth buffer for all frames and all swapchain images. I fail to see where is the synchronization of the depth buffer is between the frames. In Khronos' example they use a depth buffer per swapchain image.

David-DiGioia commented 3 years ago

Why do we put the camera/scene uniform and object matrices SSBO in separate descriptor sets since they both have the lifetime of a single frame? Don't you normally group things into descriptor sets by lifetime?

vblanco20-1 commented 3 years ago

@David-DiGioia Its done to show how to use multiple descriptor sets like that. In a more advanced engine, the Camera and Scene uniforms will have extras, such as environment textures or other global things, and you have more than 1 object-data SSBO from multiple mesh passes. This can be seen quite well on the GPU Driven Rendering chapter, where each of the mesh passes has its own ssbos to bind.

warriormaster12 commented 3 years ago

Why doesn't this chapter cover how to cleanup descriptor sets before closing the application ?

3PIV commented 3 years ago

In regards to double buffering when it came to destroying the vkObjects I had created, I was doing something like:

  vkWaitForFences(device_, 1, &getCurrentFrameData().renderFence, true, 1000000000);

I noticed I was getting all sorts of validation errors. When I thought about the process, I noticed I was incrementing the frameCount every draw call. So, when a render would complete the frameCount would increase. This meant the call:

  vkWaitForFences(device_, 1, &getCurrentFrameData().renderFence, true, 1000000000);

Was waiting the wrong fence!

I could fix it by simply doing:

  --frameCount;
  vkWaitForFences(device_, 1, &getCurrentFrameData().renderFence, true, 1000000000);
  ++frameCount;

So, if anyone is hitting validation errors after double buffering - this could be a potential issue. I am currently looking into a more robust solution.

supuo commented 3 years ago

Can't understand the advantage of dynamic uniform buffer compared with the former method in this page. The only difference I notice is the position to specify the "offset", one in VkDescriptorBufferInfo structure, the other in vkCmdBindDescriptorSets function. Is there an example of "constantly reallocate descriptor sets"?

dortamiguel commented 2 years ago

It seems like it renders at 120fps while my monitor it's running at 60hz

If I turn my monitor to use 30hz then vulkan renders at 60fps

I'm doing something wrong?

Lucodivo commented 2 years ago

When creating a pipeline, you have to specify the layouts for each of the descriptor sets that can be bound to the pipeline. This is commonly done automatically, generated from reflection on the shader.

Any resources for this? Unfortunately searching "reflection" plus any graphics terms isn't going to net a lot of positive results for what I'd like to find. 😅

EDIT: One answer for reflection on shaders is using Khronos Group's SPIRV-Reflect

K1ngst0m commented 2 years ago

In order to use gl_BaseInstance, you need to enable shader draw parameters feature in your device code, otherwise, you will get a validation error [ VUID-VkShaderModuleCreateInfo-pCode-01091 ].

braxtonbarger commented 2 years ago

@3PIV You can wait on both fences. Something like

VkFence allFences[FRAME_OVERLAP];
for (int i = 0; i < FRAME_OVERLAP; ++i) { allFences[i] = m_frames[i].m_renderFence; }
vkWaitForFences(m_vkDevice, FRAME_OVERLAP, allFences, true, 1000000000);
Ribosome2 commented 2 years ago

great guides ! I learned a lot from this , Thanks

CarloWood commented 2 years ago
Allocating descriptor sets can be very cheap if you explicitly disallow freeing individual sets by setting the VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT flag. By using that flag, the driver might make the descriptor pool just use a very cheap linear allocator. If you are allocating descriptor sets per frame, you should be using that, and then you reset the entire pool instead of individual descriptor sets.

This paragraph is wrong. Setting VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT does allow freeing individual descriptor sets. Hence the above should read:

Allocating descriptor sets can be very cheap if you explicitly disallow freeing individual sets by NOT setting the VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT flag. By NOT using that flag, the driver might make the descriptor pool just use a very cheap linear allocator. If you are allocating descriptor sets per frame, you should be doing that, and then reset the entire pool instead of individual descriptor sets.
jonathan-hopkins commented 2 years ago

In Dynamic Descriptor sets, "Setting up Scene Data" you specify VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC in your example before you have introduced the concept of dynamic buffers. This should be VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER so that you can run correctly at the end of the "New shaders" stage.

pboechat commented 1 year ago

@Lucodivo A bit late but just wanted to say that there's also SPIRV-Cross.

yelibumeng commented 1 year ago

Validation Error: [ VUID-VkGraphicsPipelineCreateInfo-layout-00756 ] Object 0: handle = 0x7faa6f032a18, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x45717876 | Shader uses descriptor slot 0.1 (expected VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC, VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK) but not declared in pipeline layout The Vulkan spec states: layout must be consistent with all shaders specified in pStages (https://vulkan.lunarg.com/doc/view/1.3.216.0/mac/1.3-extensions/vkspec.html#VUID-VkGraphicsPipelineCreateInfo-layout-00756) hello i get this error Although the program can run normally,help !!!! @vblanco20-1

SahilMadan commented 1 year ago

If, after "Setting up descriptor sets", you notice that the program is crashing when closing, make sure to capture variable i in the lambda when destroying the buffer

for (int i = 0; i < FRAME_OVERLAP; i++) {
    _mainDeletionQueue.push_function([&, i]() {
        vmaDestroyBuffer(_allocator, _frames[i].cameraBuffer._buffer, _frames[i].cameraBuffer._allocation);
    }
}
andewx commented 1 year ago

So I'm assuming that the storage buffer alignment requirements are not the same as the dynamic descriptor requirements. My system uses a 256 byte alignment for example. So if I was for example using a compute shader for positional data (12 bytes/16 bytes std140), then I'm going to assume that I'm not required to align each individual position to 256 for the vulkan uniform specification and that the shader implementation will be able to derive the correct offsets.

This is the result I'm hoping for but if anyone has a concrete answer please let me know (or I'm just assuming I'll have this implemented by then). In any case probably should mention the nature of this distinction on the page.

alecazam commented 1 year ago

How do you detect Mali not supporting SBO in vertex shaders by reporting 0. In GL this is done by detecting GL_MAX_VERTEX_SHADER_STORAGE_BLOCKS. Then you have to fallback to a UBO or to using instancing data instead. Seems like a big flaw in Android Vulkan to allow this to have been left out.

alecazam commented 1 year ago

So Vulkan Mali supports readonly SBO from the VS but not RW SBO. GL can't create readonly SBO, so that BLOCK count above is 0. Here's the flag to detect if RW SBO are supported in Vulkan. This will be false on Mali.

features.vertexPipelineStoresAndAtomics

dogukannn commented 1 year ago

If I understand this correctly, both frames can read and write to the depth buffer during STAGE_LATE_FRAGMENT_TESTS_BIT and STAGE_EARLY_FRAGMENT_TESTS_BIT. So to make this tutorial correct one of two things needs to happen:

  1. Depth buffer per frame (so FRAME_OVERLAP depth buffers).
  2. Change subpass dependency between VK_SUBPASS_EXTERNAL to subpass #0 to be
    dependency.srcStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    dependency.dstStageMask    = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT | VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    dependency.srcAccessMask   = 0,
    dependency.dstAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,

Please correct me if I am wrong about this.

I think we do not have a problem here because GPU only renders one frame at a time, we use the same color attachment and depth buffer to render different frames. The in-flight frames are only there to reduce the wait on the cpu side. GPU still waits the frame before to start the render.

Trider12 commented 1 year ago

For anyone wondering

alignedSize = (alignedSize + minUboAlignment - 1) & ~(minUboAlignment - 1);

is the same as

alignedSize = (alignedSize + minUboAlignment - 1) / minUboAlignment * minUboAlignment;
Tnkln99 commented 1 year ago

Is it makeing the rendering slower the fact that every frame we are writing every renderable to object buffer and then again doing another loop to binding the descriptor and calling the draw call? Thinking about it, the complexity is still O(n) but is this a comman way to send modelmatrices?

matusfedorko commented 5 months ago

In section Descriptor Set Layout there is this sentence: "Descriptor set layouts can be compatible if they are the same even if they are created in two different places.". What does it mean ? That two descriptor set layouts can be compatible and they don't need to be the same VkDescriptorSetLayout handle ?