TiledTensor/TiledCUDA
TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstraction for processing tiles.
MIT License
148 stars, 10 forks
feat(examples): Add a python gemm example.
#135
Closed
haruhi55 closed 2 months ago
haruhi55 commented 2 months ago
Add a Python GEMM example that uses `nvcc` to compile GEMM with tiled CUDA primitives.
Enhance the GEMM example to use multiple blocks for processing larger problem sizes.
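To make the second item concrete, here is a minimal pure-Python sketch of how a GEMM can be split across multiple blocks, with one block per output tile. It is an illustration only, not TiledCUDA's API: the names `ceil_div`, `grid_dims`, and `tiled_gemm`, and the tile-size parameters, are hypothetical. The two outer loops stand in for the block grid that a CUDA launch would provide.

```python
# Hypothetical illustration of multi-block GEMM tiling; not TiledCUDA's API.

def ceil_div(a, b):
    """Integer division rounded up, used to size the block grid."""
    return (a + b - 1) // b


def grid_dims(m, n, tile_m, tile_n):
    """Number of blocks along the rows and columns of C (one block per tile)."""
    return ceil_div(m, tile_m), ceil_div(n, tile_n)


def tiled_gemm(a, b, tile_m, tile_n, tile_k):
    """Compute C = A @ B for lists of lists, one output tile at a time."""
    m, k, n = len(a), len(a[0]), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    grid_m, grid_n = grid_dims(m, n, tile_m, tile_n)

    # Each (bm, bn) pair plays the role of one thread block.
    for bm in range(grid_m):
        for bn in range(grid_n):
            row0, col0 = bm * tile_m, bn * tile_n
            # Clamp the tile so partial tiles at the edges stay in bounds.
            row1, col1 = min(row0 + tile_m, m), min(col0 + tile_n, n)

            # Walk the shared K dimension in chunks of tile_k.
            for k0 in range(0, k, tile_k):
                k1 = min(k0 + tile_k, k)
                for i in range(row0, row1):
                    for j in range(col0, col1):
                        acc = c[i][j]
                        for p in range(k0, k1):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c
```

Because the grid size is derived from the problem size, larger matrices simply produce more blocks; the per-block work stays bounded by the tile sizes.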