KhronosGroup / Vulkan-Samples

One stop solution for all Vulkan samples
Apache License 2.0

Use staging buffers for vertex attributes and indices in [HPP]GLTFLoader #862

Open asuessenbach opened 11 months ago

asuessenbach commented 11 months ago

Currently, the vertex attributes and indices loaded via load_scene are copied into buffers created with the VMA_MEMORY_USAGE_CPU_TO_GPU flag. Uploading that data through dedicated staging buffers into buffers created with the VMA_MEMORY_USAGE_GPU_ONLY flag is supposed to be more efficient, since the GPU then reads it from device-local memory. Images loaded via load_scene, and vertices and indices loaded via load_model, already use that approach.

Note: all the VMA_MEMORY_USAGE_* flags we're using are marked as obsolete! Should we adjust our usage according to https://gpuopen-librariesandsdks.github.io/VulkanMemoryAllocator/html/usage_patterns.html? Or should we continue with our current usage? Or should we introduce a simple DeviceMemoryManager that explicitly uses the VkMemoryPropertyFlagBits (or vk::MemoryPropertyFlags)?

jherico commented 9 months ago

The version of the VMA in use is fairly out of date. It appears to be a development version of 3.0.0, but from 2020, while 3.0.0 itself wasn't released until 2022, and 3.0.1 shortly after that.

I'm working on a PR that will update to the 3.0.1 release tag and default all parameters of type VmaMemoryUsage to VMA_MEMORY_USAGE_AUTO. This should cause the VMA to infer what kind of memory to allocate from the combination of the buffer/image usage flags and the VmaAllocationCreateFlags. Staging buffers should be created with VMA_ALLOCATION_CREATE_MAPPED_BIT | VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT. If the destination buffer is created with VMA_MEMORY_USAGE_AUTO and doesn't include a mapping bit, it should automatically get memory from a device-only pool.
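The flag choices in this comment can be summarized as a small lookup. This is only a sketch of the rule, not code from the repository or from the VMA: `BufferRole`, `AllocationPlan` and `plan_for` are hypothetical names, and the booleans merely stand in for the real VMA and Vulkan flags named in the comments. It assumes both buffers use VMA_MEMORY_USAGE_AUTO and that the staging buffer is the source of a transfer into the destination buffer.

```cpp
// Hypothetical stand-ins for illustration; these are NOT VMA types.
enum class BufferRole { Staging, DeviceLocal };

struct AllocationPlan {
    bool persistently_mapped;    // stands in for VMA_ALLOCATION_CREATE_MAPPED_BIT
    bool host_sequential_write;  // stands in for VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT
    bool transfer_src;           // stands in for VK_BUFFER_USAGE_TRANSFER_SRC_BIT
    bool transfer_dst;           // stands in for VK_BUFFER_USAGE_TRANSFER_DST_BIT
};

// Staging buffers are mapped and written sequentially by the CPU, then used
// as a copy source. Destination buffers request no host access at all, so
// an allocator using VMA_MEMORY_USAGE_AUTO is free to pick device memory.
constexpr AllocationPlan plan_for(BufferRole role) {
    return role == BufferRole::Staging
               ? AllocationPlan{true, true, true, false}
               : AllocationPlan{false, false, false, true};
}
```

In a real implementation these fields would map onto VmaAllocationCreateInfo::flags and VkBufferCreateInfo::usage; the vertex or index usage bit on the destination buffer is omitted here for brevity.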

jherico commented 9 months ago

Actually, I suspect part of the problem is that both HPPBuffer and Buffer automatically default to VMA_ALLOCATION_CREATE_MAPPED_BIT, which typically forces host memory.

Unfortunately, this means the PR will be more complicated than I thought, because it can't be a drop-in replacement if that parameter's default changes from VMA_ALLOCATION_CREATE_MAPPED_BIT to 0.
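To illustrate why changing the default is not a drop-in replacement, here is a minimal sketch. Everything in it is hypothetical: `make_buffer_old`, `make_buffer_new`, `BufferInfo` and the `kMapped` constant are illustrative stand-ins, not the actual Buffer/HPPBuffer constructors or VMA values. The point is only that a call site relying on the default compiles unchanged but silently stops getting a mapped buffer.

```cpp
#include <cstdint>

using AllocFlags = std::uint32_t;

// Hypothetical stand-in for VMA_ALLOCATION_CREATE_MAPPED_BIT (not its real value).
constexpr AllocFlags kMapped = 1u << 0;
constexpr AllocFlags kNone   = 0;

struct BufferInfo {
    std::uint64_t size;
    AllocFlags    flags;
    constexpr bool host_mapped() const { return (flags & kMapped) != 0; }
};

// Old behaviour: callers that omit the flags get a mapped buffer.
constexpr BufferInfo make_buffer_old(std::uint64_t size, AllocFlags flags = kMapped) {
    return BufferInfo{size, flags};
}

// New behaviour: the default is 0, so the same call is no longer mapped.
constexpr BufferInfo make_buffer_new(std::uint64_t size, AllocFlags flags = kNone) {
    return BufferInfo{size, flags};
}
```

Under this sketch, every existing call that writes to the buffer from the CPU would need to pass the mapping flag explicitly, e.g. `make_buffer_new(size, kMapped)`, which is why each call site has to be reviewed.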