TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstraction for processing tiles.
Ensure consistency in the use of swizzled shared memory layout #38
Closed
haruhi55 closed 3 months ago