TiledTensor/TiledCUDA
TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstraction for processing tiles.
MIT License
157 stars
10 forks
chore: Minor code refinements. #148 (Closed)
haruhi55 closed 1 month ago
haruhi55 commented 1 month ago
Improve thread arrangement within a thread block for better efficiency.
Minor code refinements.