siboehm / SGEMM_CUDA

Fast CUDA matrix multiplication from scratch
https://siboehm.com/articles/22/CUDA-MMM
MIT License
464 stars, 62 forks

advice for getting to this level of expertise at writing CUDA #11

Open j93hahn opened 3 months ago

j93hahn commented 3 months ago

Hi @siboehm, thank you for this blog post. It's an absolute gem to read through, and I gained a tremendous amount of insight. I've been writing CUDA for ~6 months now and wanted to ask how you got to this level. For example, your knowledge of the compiler, warp-level manipulation, and many other things that I thought were NVIDIA trade secrets (e.g., their entire cuBLAS library).

Is it mostly a matter of reading the CUDA documentation and practicing writing CUDA kernels? It seems near-impossible to catch up to the best kernel engineers today. Any insight would be greatly appreciated. Thank you!