nvpro-samples / vk_raytracing_tutorial_KHR

Ray tracing examples and tutorials using VK_KHR_ray_tracing
Apache License 2.0

VK_BUFFER_USAGE_STORAGE_BUFFER_BIT for vertices and indices buffer #58

Closed: FuXiii closed this issue 1 year ago

FuXiii commented 1 year ago

In section 6.1, "Additions to the Scene Descriptor Set":

Originally the buffers containing the vertices and indices were only used by the rasterization pipeline. The ray tracing will need to use those buffers as storage buffers, so we add VK_BUFFER_USAGE_STORAGE_BUFFER_BIT.

Why do we need to add VK_BUFFER_USAGE_STORAGE_BUFFER_BIT when creating the vertex and index buffers? In most cases, the vertex and index data will not change. VK_BUFFER_USAGE_STORAGE_BUFFER_BIT is usually used when the device needs to write data into a buffer, so what data would the device write into the vertex and index buffers?

NBickford-NV commented 1 year ago

Hi FuXiii! Is your first question about why we use VK_BUFFER_USAGE_STORAGE_BUFFER_BIT and not e.g. VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT?

One reason we don't use VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT here is that roughly 60% of devices measured by GPUInfo have a maximum uniform buffer range (maxUniformBufferRange) of 65536 bytes, which would significantly limit the size of the meshes we could represent (see https://vulkan.gpuinfo.org/displaydevicelimit.php?name=maxUniformBufferRange&platform=all). In comparison, the maximum storage buffer range (maxStorageBufferRange) is 128 MiB or larger on almost all devices (see https://vulkan.gpuinfo.org/displaydevicelimit.php?name=maxStorageBufferRange&platform=all).
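To make that limit concrete, here is a rough back-of-the-envelope sketch in C++. The 44-byte `Vertex` layout and the helper name `maxElements` are assumptions made up for illustration, not taken from the tutorial's code, and the two limits are simply the values quoted above; real code should read them from `VkPhysicalDeviceLimits` rather than hard-coding them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical vertex layout, assumed for illustration only:
// 3 floats position + 3 floats normal + 3 floats color + 2 floats UV.
struct Vertex {
    float pos[3];
    float nrm[3];
    float color[3];
    float uv[2];
};
static_assert(sizeof(Vertex) == 44, "sketch assumes tightly packed 4-byte floats");

// Limits quoted in the comment above (assumed here, not queried from a device).
const std::uint64_t kUniformRange = 65536;                     // 64 KiB
const std::uint64_t kStorageRange = 128ull * 1024ull * 1024ull; // 128 MiB

// Number of whole elements of `elementSize` bytes that fit in `rangeBytes`.
inline std::uint64_t maxElements(std::uint64_t rangeBytes, std::uint64_t elementSize) {
    assert(elementSize != 0);
    return rangeBytes / elementSize;
}
```

Under these assumptions, a 64 KiB uniform buffer holds 1489 such vertices or 16384 32-bit indices (about 5461 triangles), while a 128 MiB storage buffer holds roughly 3 million vertices.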

Additionally, when a uniform buffer isn't accessed uniformly (i.e. each thread may divergently read a different element, as is the case for vertex and index buffers), it can be slower than a storage buffer, although this depends on the hardware. Sebastian Aaltonen has made some DirectX benchmarks at https://github.com/sebbbi/perftest; you can compare the "ByteAddressBuffer<float4>.Load random" lines against the "cbuffer{float4} load random" lines to get a rough idea.

Since our global uniform data (i.e. camera matrices) is small, fixed-size, and accessed uniformly, a uniform buffer works well for eGlobals.

Note that Vulkan introduced VK_KHR_ray_tracing_position_fetch in March 2023, which allows you to fetch vertex positions directly from a ray hit! However, if you have per-vertex shading attributes such as normals or texture coordinates, you'll still want to store and load those from storage buffers, as those aren't stored in the acceleration structure.

Hopefully this also answers your second question: storage buffers are useful for more than storing data written by the GPU; they're also a good fit for buffers that are greater than 64 KiB in size or that will be divergently read.
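The rule of thumb above could be sketched as a small C++ helper. Everything here (`BufferKind`, `BufferAccess`, `chooseBufferKind`) is a hypothetical name invented for this sketch and is not part of Vulkan or the tutorial; it only encodes the heuristics from this thread, and profiling on the target hardware should have the final say.

```cpp
#include <cassert>
#include <cstdint>

enum class BufferKind { Uniform, Storage };

// Hypothetical description of how shaders use a buffer (illustration only).
struct BufferAccess {
    std::uint64_t sizeBytes;  // total size of the data in bytes
    bool divergentReads;      // threads may each read a different element
    bool shaderWrites;        // shaders write to the buffer
};

// Heuristic from the discussion above, not an official rule.
// `maxUniformBufferRange` would come from VkPhysicalDeviceLimits in real code.
inline BufferKind chooseBufferKind(const BufferAccess& a,
                                   std::uint64_t maxUniformBufferRange) {
    assert(maxUniformBufferRange > 0);
    if (a.shaderWrites) return BufferKind::Storage;   // uniform buffers are read-only in shaders
    if (a.sizeBytes > maxUniformBufferRange) return BufferKind::Storage;  // too large to bind as uniform
    if (a.divergentReads) return BufferKind::Storage; // divergent reads may be slower on uniform buffers
    return BufferKind::Uniform;                       // small and uniformly read
}
```

For example, a 256-byte block of camera matrices read uniformly would come out as `BufferKind::Uniform`, while a read-only vertex buffer of a few megabytes would come out as `BufferKind::Storage` even though no shader ever writes to it.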

FuXiii commented 1 year ago

Ok, thank you very much! (๑•̀ㅂ•́)و✧