carpentries-incubator / lesson-gpu-programming

GPU Programming with Python and CUDA.
https://carpentries-incubator.github.io/lesson-gpu-programming/

Increasing work per thread/block #60

Open isazi opened 2 years ago

isazi commented 2 years ago

We spend some time (rightly so, I believe) expanding our vector_add example into code that can run on vectors of arbitrary size. But what if the vectors are so large that using one thread per element is no longer practical? We need to introduce two concepts: 1) increasing the amount of work per thread, and 2) increasing the amount of work per block.
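
To make the first concept concrete, here is a rough sketch of one common way to give each thread more work: a grid-stride loop, where a thread starts at its global index and then jumps ahead by the total number of threads until it runs past the end of the vector. This is only an illustration and not necessarily how the lesson should present it. The function name `emulate_grid_stride_add`, the `KERNEL_SOURCE` string, and the kernel signature are assumptions made for this example. The Python part emulates the indexing on the CPU so it can be run without a GPU; the CUDA source is included for reference only and is not compiled here.

```python
# Hypothetical CUDA kernel using a grid-stride loop (reference only, not compiled here).
# The signature is an assumption modelled on a typical vector_add kernel.
KERNEL_SOURCE = r'''
extern "C" __global__ void vector_add(const float *A, const float *B, float *C, const int size)
{
    // Total number of threads in the grid
    int stride = gridDim.x * blockDim.x;
    // Each thread processes elements item, item + stride, item + 2 * stride, ...
    for (int item = blockIdx.x * blockDim.x + threadIdx.x; item < size; item += stride)
    {
        C[item] = A[item] + B[item];
    }
}
'''


def emulate_grid_stride_add(a, b, grid_size, block_size):
    """CPU emulation of the indexing used by the kernel above.

    Returns the result vector and a per-element count of how many
    times each element was written (it should be exactly once).
    """
    size = len(a)
    c = [0.0] * size
    writes = [0] * size
    stride = grid_size * block_size  # plays the role of gridDim.x * blockDim.x
    for block_idx in range(grid_size):        # plays the role of blockIdx.x
        for thread_idx in range(block_size):  # plays the role of threadIdx.x
            item = block_idx * block_size + thread_idx
            while item < size:
                c[item] = a[item] + b[item]
                writes[item] += 1
                item += stride
    return c, writes


if __name__ == "__main__":
    # 10 elements handled by only 2 blocks of 2 threads: each thread covers several elements
    a = [float(i) for i in range(10)]
    b = [1.0] * 10
    c, writes = emulate_grid_stride_add(a, b, grid_size=2, block_size=2)
    print(c)
    print(writes)
```

With this pattern the grid size no longer has to match the vector size: a fixed number of blocks can cover a vector of any length, at the cost of each thread looping over several elements.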