Open kpet opened 2 years ago
Seems reasonable overall. A few points to consider:
Thanks for the feedback!
FYI, NVIDIA originally limited CUDA kernel parameters to 256 bytes. The limit was raised to 4 KB on compute capability 2.x devices by moving the parameters to constant memory, a.k.a. uniform memory/buffers.
I mention this to show there's industry precedent for the exact solution proposed for the push constant size limitations.
Motivation
NULL
kernel arguments).
Proposed design outline
Here's a rough outline of what I prototyped:
VK_KHR_buffer_device_address
via either push constants or uniform buffers. The interface uses a structure of 64-bit integers (or 2-element vectors of 32-bit integers), one for each pointer argument, that is part of the global push constant structure or stored in a dedicated uniform buffer. Pointer arguments to kernel functions are rewritten to load the required integers from this structure and convert them to pointers. A new pass is introduced to declare the interface structure and perform the necessary rewriting.
Proposed staging
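To make the interface structure from the design outline concrete, here is a minimal host-side sketch of how the per-argument addresses might be laid out. It is an illustration, not the prototype's code: the function names and the example addresses are hypothetical, and it assumes little-endian packing with the low word first in the 2-element vector form. In a real application the addresses would come from something like `vkGetBufferDeviceAddress`.

```python
import struct


def pack_as_u64(addresses):
    """Pack one little-endian 64-bit integer per pointer argument."""
    return struct.pack(f"<{len(addresses)}Q", *addresses)


def pack_as_uvec2(addresses):
    """Pack each address as two 32-bit integers (assumed order: low, high)."""
    words = []
    for addr in addresses:
        words.append(addr & 0xFFFFFFFF)  # low 32 bits
        words.append(addr >> 32)         # high 32 bits
    return struct.pack(f"<{len(words)}I", *words)


def load_address(blob, index):
    """Mimic the rewritten kernel argument: load the integer for argument
    `index` from the interface structure. On the device, this value would
    then be converted to a pointer."""
    (addr,) = struct.unpack_from("<Q", blob, index * 8)
    return addr


# Hypothetical device addresses for two pointer arguments.
addrs = [0x0000000123456000, 0x000000ABCDEF0100]
blob = pack_as_u64(addrs)

# Under the stated assumptions, both representations give the same bytes.
print(blob == pack_as_uvec2(addrs))
print(hex(load_address(blob, 1)))
```

Under those assumptions the two representations are byte-for-byte interchangeable, so the same blob could be placed in the push constant range or written to a dedicated uniform buffer.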