Hi @siboehm, thank you for this blog post. It's an absolute gem to read through, and I gained a tremendous amount of insight. I've been writing CUDA for ~6 months now and wanted to ask how you were able to get to this level. For example, your knowledge of the compiler, warp-level manipulation, and many other things that I thought were NVIDIA trade secrets [e.g., their entire cuBLAS library].
Is it a lot of reading the CUDA documentation and simply practicing writing CUDA kernels? It seems near-impossible to catch up to the best kernel engineers today; any insight would be greatly appreciated. Thank you!