TiledTensor / TiledCUDA

TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstraction for processing tiles.
MIT License

Pass unit tests for tensor core GEMM. #55

Closed: haruhi55 closed this issue 4 months ago

haruhi55 commented 4 months ago

There are still several cases where the tensor core GEMM does not pass the unit tests. The implementation contains some bugs.

https://github.com/TiledTensor/TiledCUDA/blob/471fd6f37a35cd4d99cb84bb5f81976fdd0960ff/tests/cpp/cell/test_gemm.cu#L357-L359
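
The linked test lines are not reproduced here, so the following is only a generic sketch of how a GEMM result is commonly validated in a unit test: compute a naive reference on the host and count the elements that differ by more than a tolerance. The helper names `reference_gemm` and `count_mismatches`, the row-major layout, and the tolerance value are all assumptions for illustration and are not taken from TiledCUDA.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of TiledCUDA): naive row-major reference
// GEMM, C = A * B, with A of shape [m, k], B of shape [k, n], C of shape [m, n].
void reference_gemm(const std::vector<float>& a, const std::vector<float>& b,
                    std::vector<float>& c, int m, int n, int k) {
    assert(a.size() == static_cast<std::size_t>(m) * k);
    assert(b.size() == static_cast<std::size_t>(k) * n);
    assert(c.size() == static_cast<std::size_t>(m) * n);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            float acc = 0.f;
            for (int p = 0; p < k; ++p) {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
}

// Hypothetical helper: returns the number of elements whose absolute
// difference exceeds tol, or -1 if the two buffers differ in size.
int count_mismatches(const std::vector<float>& got,
                     const std::vector<float>& expected, float tol) {
    if (got.size() != expected.size()) return -1;
    int bad = 0;
    for (std::size_t i = 0; i < got.size(); ++i) {
        if (std::fabs(got[i] - expected[i]) > tol) ++bad;
    }
    return bad;
}
```

In a test of this shape, the kernel output would be copied back to the host and passed as `got`, with the reference result as `expected`. Because tensor core kernels typically take half-precision inputs, the tolerance usually has to be looser than for a pure float32 comparison; the right value depends on the problem size and data range.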