siboehm / SGEMM_CUDA

Fast CUDA matrix multiplication from scratch
https://siboehm.com/articles/22/CUDA-MMM
MIT License
410 stars, 53 forks