TiledTensor / TiledCUDA

TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstraction for processing tiles.
MIT License

feat(unittest): Implement basic unittest for transferring 2D data tiles between global and shared memory #24

Closed: haruhi55 closed this 7 months ago

haruhi55 commented 7 months ago

This PR does several things:

  1. Resolves https://github.com/TiledTensor/TiledCUDA/issues/23
  2. Incorporates glog into the project. This is my preference, although it introduces a fairly heavy dependency, since glog has to be built. For now, the CMake script only tries to find a locally installed glog; if glog is not found, the build fails.
  3. Adds a basic unit test for data transfer between global and shared memory. The exercise also serves as a way to walk through and examine the structure of the code.
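
For readers unfamiliar with this kind of test, here is a minimal host-only sketch of the property a global-to-shared round-trip test usually checks: a 2D tile copied into a staging buffer and written back should come out unchanged, and nothing outside the tile should be touched. It is not the test added in this PR and uses no TiledCUDA API. The names `load_tile`, `store_tile`, and `round_trip_ok` are hypothetical, and plain `std::vector` buffers stand in for global and shared memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not TiledCUDA API): copy a tile_rows x tile_cols tile
// with top-left corner (row0, col0) out of a row-major matrix with leading
// dimension ld into a densely packed staging buffer. The staging buffer plays
// the role that shared memory would play on the GPU.
void load_tile(const std::vector<float>& global, std::size_t ld,
               std::size_t row0, std::size_t col0,
               std::size_t tile_rows, std::size_t tile_cols,
               std::vector<float>& staging) {
    staging.resize(tile_rows * tile_cols);
    for (std::size_t r = 0; r < tile_rows; ++r)
        for (std::size_t c = 0; c < tile_cols; ++c)
            staging[r * tile_cols + c] = global[(row0 + r) * ld + (col0 + c)];
}

// Hypothetical helper: write the packed staging tile back to the same
// position in a row-major matrix with leading dimension ld.
void store_tile(const std::vector<float>& staging, std::size_t ld,
                std::size_t row0, std::size_t col0,
                std::size_t tile_rows, std::size_t tile_cols,
                std::vector<float>& global) {
    for (std::size_t r = 0; r < tile_rows; ++r)
        for (std::size_t c = 0; c < tile_cols; ++c)
            global[(row0 + r) * ld + (col0 + c)] = staging[r * tile_cols + c];
}

// Round-trip check: the tile region of dst must equal src, and every element
// outside the tile must still hold the sentinel value.
bool round_trip_ok(std::size_t rows, std::size_t cols,
                   std::size_t row0, std::size_t col0,
                   std::size_t tile_rows, std::size_t tile_cols) {
    assert(row0 + tile_rows <= rows && col0 + tile_cols <= cols);

    const float sentinel = -1.0f;
    std::vector<float> src(rows * cols);
    for (std::size_t i = 0; i < src.size(); ++i)
        src[i] = static_cast<float>(i);  // distinct value per element
    std::vector<float> dst(rows * cols, sentinel);

    std::vector<float> staging;
    load_tile(src, cols, row0, col0, tile_rows, tile_cols, staging);
    store_tile(staging, cols, row0, col0, tile_rows, tile_cols, dst);

    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            const bool inside = r >= row0 && r < row0 + tile_rows &&
                                c >= col0 && c < col0 + tile_cols;
            const float expected = inside ? src[r * cols + c] : sentinel;
            if (dst[r * cols + c] != expected) return false;
        }
    }
    return true;
}
```

Calling `round_trip_ok(16, 32, 4, 8, 8, 16)` exercises an 8 x 16 tile in the interior of a 16 x 32 matrix. A device version would presumably replace the two loops with cooperative thread copies and a synchronization barrier between the load and the store, but the pass condition stays the same.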