DTolm / VkFFT

Vulkan/CUDA/HIP/OpenCL/Level Zero/Metal Fast Fourier Transform library
MIT License
1.48k stars 88 forks source link

Support VK_KHR_push_descriptor to allow multiple invocations with different buffers in a single command buffer #155

Open azonenberg opened 4 months ago

azonenberg commented 4 months ago

It appears that vkFFT tries to update and then bind a single descriptor set each time you call vkfftAppend().

This causes problems if you want to do multiple FFTs in a single command buffer with separate input and output buffers. If VK_KHR_push_descriptor is available, using it would allow arbitrarily changing buffers each time vkfftAppend is called.

DTolm commented 4 months ago

Hm, the VK_KHR_push_descriptor seems not to be the core extension yet and not supported on some platforms. Wouldn't one of the following solutions be better:

-Recreate the descriptor set on every VkFFTAppend call if isInputFormatted (or other type of runtime buffer selection) is selected. -Allocate multiple descriptor sets (number selected by user) and cycle through them at every append call.

azonenberg commented 4 months ago

1) Not possible. Once the descriptor is bound in a command buffer, you're not allowed to modify or destroy it until the command buffer finishes execution. Avoiding this limitation is the whole point of VK_KHR_push_descriptor.

2) Possibly doable but resource intensive. And I'm not sure if a single compute pipeline is allowed to have multiple descriptor sets bound in the same command buffer (never tried).

In my application I'm using this as part of a larger code path that FFT's two input signals, does a bunch of channel emulation and de-embedding operations in the frequency domain, then IFFT's both of them. This entire optimized code path runs in a single command buffer and already relies on VK_KHR_push_descriptor support in order to allow changing buffers for the de-embedding shader within the command buffer. If it's not available I'd need to do a separate command buffer submission and block after each shader invocation anyway.

My recommendation is just to provide a flag you can set in the plan (or automatically enable if you detect VK_KHR_push_descriptor at plan creation time) that uses push descriptors instead of static binding. The user can then check for VK_KHR_push_descriptor support and, if available, elect to call multiple vkfftAppend calls in one command buffer; if not available then it's on them to not use any given plan a second time before the first use has finished.

You can see some example of how I do this in my code here:

https://github.com/ngscopeclient/scopehal/blob/master/scopehal/ComputePipeline.h?ts=4#L65

https://github.com/ngscopeclient/scopehal/blob/master/scopehal/ComputePipeline.h?ts=4#L178

https://github.com/ngscopeclient/scopehal/blob/master/scopehal/ComputePipeline.h?ts=4#L189

https://github.com/ngscopeclient/scopehal/blob/master/scopehal/ComputePipeline.h?ts=4#L230

DTolm commented 4 months ago

I was mostly referring to this answer as to why explore alternative options: https://community.amd.com/t5/opengl-vulkan/vulkan-vk-khr-push-descriptor-extension-on-windows/td-p/509394

I have enabled push descriptors according to your recommendations. There is still a validation warning that "vkCreateDevice: pCreateInfo->pNext chain includes a structure with unexpected VkStructureType VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PUSH_DESCRIPTOR_PROPERTIES_KHR", which I need to look into. Otherwise, the issue with incorrect descriptor sets update now doesn't trigger the validation layer compared to before and results seem to be correct. Thank you for pointing this out, if something doesn't work I will be happy to fix it.