With the Hopper architecture, NVIDIA introduced "clusters" of thread blocks, in which the blocks can access each other's shared memory. The cluster dimensions can be set either at compile time, using a __cluster_dims__(1,2,3) qualifier in the kernel's signature, or at run time, as a launch attribute. We need to support the run-time setting within our launch_configuration_t class and in the launch config builder mechanism.
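To make the request more concrete, here is a minimal sketch of how run-time cluster dimensions might be carried by a launch configuration and set through a builder. Every name in it (dims3, launch_config_sketch, launch_config_builder_sketch, cluster_dimensions) is hypothetical and does not reflect the actual launch_configuration_t or builder interfaces. It assumes C++17 for std::optional. The only CUDA-side fact it relies on is that the grid dimensions must be multiples of the cluster dimensions. Actually passing the setting to the driver (for instance via cudaLaunchKernelEx with the cudaLaunchAttributeClusterDimension attribute) is left out.

```cpp
#include <optional>
#include <stdexcept>

// Hypothetical stand-in for a 3D dimensions type; not the library's own.
struct dims3 {
    unsigned x = 1, y = 1, z = 1;
};

// Hypothetical launch configuration with an optional cluster setting.
struct launch_config_sketch {
    dims3 grid;                    // grid dimensions, in blocks
    dims3 block;                   // block dimensions, in threads
    std::optional<dims3> cluster;  // cluster dimensions, in blocks; empty = no clustering
};

// Hypothetical builder: setters return *this so that calls can be chained.
class launch_config_builder_sketch {
public:
    launch_config_builder_sketch& grid_dimensions(dims3 d)    { config_.grid = d;    return *this; }
    launch_config_builder_sketch& block_dimensions(dims3 d)   { config_.block = d;   return *this; }
    launch_config_builder_sketch& cluster_dimensions(dims3 d) { config_.cluster = d; return *this; }

    // Validates the cluster setting (if any) and returns the configuration.
    launch_config_sketch build() const {
        if (config_.cluster) {
            const dims3& c = *config_.cluster;
            if (c.x == 0 || c.y == 0 || c.z == 0) {
                throw std::invalid_argument("cluster dimensions must be non-zero");
            }
            // CUDA requires each grid dimension to be a multiple of the
            // corresponding cluster dimension.
            if (config_.grid.x % c.x != 0 || config_.grid.y % c.y != 0 || config_.grid.z % c.z != 0) {
                throw std::invalid_argument("grid dimensions must be multiples of the cluster dimensions");
            }
        }
        return config_;
    }

private:
    launch_config_sketch config_;
};
```

With this sketch, a call such as launch_config_builder_sketch{}.grid_dimensions({8, 4, 1}).block_dimensions({128, 1, 1}).cluster_dimensions({2, 2, 1}).build() yields a configuration with clustering enabled, while a grid of {7, 1, 1} with a cluster of {2, 1, 1} is rejected at build time. Keeping the cluster setting optional means existing, non-clustered configurations would be unaffected.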