eyalroz / cuda-kat

CUDA kernel author's tools
BSD 3-Clause "New" or "Revised" License

Use iterators for at-grid-stride and at-block-stride traversals #88

Open eyalroz opened 3 years ago

eyalroz commented 3 years ago

Currently, we offer the at_grid_stride(), at_block_stride() and at_warp_stride() functions, which take an invokable and ensure the appropriate traversal pattern is used.
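For readers unfamiliar with the pattern, here is a minimal sketch of what a callable-taking stride traversal amounts to. This is not cuda-kat's actual implementation: the `sketch` namespace, the `at_stride` signature and the `covers_each_position_once` helper are all hypothetical, and the thread index and thread count are passed in explicitly so that the code can run on the host rather than read `threadIdx`, `blockIdx` and friends.

```cpp
#include <cstddef>
#include <vector>

namespace sketch {

// Calls f(pos) for pos = first, first + stride, ... while pos < length.
// On a GPU, `first` would be the thread's global index and `stride` the
// total number of threads in the grid; here both are explicit parameters
// (an assumption made so the pattern can be exercised on the host).
template <typename Size, typename F>
void at_stride(Size length, Size first, Size stride, F f)
{
    for (Size pos = first; pos < length; pos += stride) {
        f(pos);
    }
}

// Host-side stand-in for a grid: runs each simulated thread in turn and
// checks that every position in [0, length) is visited exactly once.
// Assumes num_threads > 0.
inline bool covers_each_position_once(std::size_t length, std::size_t num_threads)
{
    std::vector<int> visits(length, 0);
    for (std::size_t tid = 0; tid < num_threads; ++tid) {
        at_stride(length, tid, num_threads,
                  [&](std::size_t pos) { ++visits[pos]; });
    }
    for (int v : visits) {
        if (v != 1) { return false; }
    }
    return true;
}

} // namespace sketch
```

The point of the pattern is that the threads jointly cover the whole range, each position exactly once, whether there are fewer threads than elements or more.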

Would it not be a good idea to offer, instead or in addition, iterators corresponding to these patterns, as in range.hpp in the CUDA C++11 sample program?

eyalroz commented 3 years ago

The range in question is Mark Harris' adaptation of cpp11range. I find the original to be a bit cluttered with stuff I don't need (e.g. an infinite-loop range), and it "conflicts" with C++20 ranges, but it might be adaptable in a more pleasing fashion so that instead of, say:

auto f = [&] (Size pos) {
    foo(pos);
};
kat::linear_grid::collaborative::grid::at_grid_stride(length, f);

we could write:

for(Size pos : kat::ranges::at_grid_stride(length)) {
    foo(pos);
}

as the latter is both shorter (even ignoring namespaces) and simpler, in not requiring a higher-order function.

eyalroz commented 3 years ago

Ok, I won't be going with Mark Harris' code. It's a bit too clunky IMHO; it's not C++17-friendly; and it is saddled with irrelevant baggage from the repository he had modified.

Instead, I'll implement my own integer and strided-integer ranges, which will be constexpr and host+device; then, on top of those, I'll add named constructors for warp-, block- and grid-stride iteration.
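A rough sketch of what such a strided-integer range could look like follows. It is only an illustration under stated assumptions, not the eventual cuda-kat code: the `sketch` namespace, the `SKETCH_HD` macro, `strided_range`, `at_stride` and `sum_positions` are hypothetical names, the stride is assumed to be positive, and C++14 or later is assumed (for the constexpr `operator++` and loop). The grid-stride named constructor is compiled only under nvcc, so the rest can be tried with a plain host compiler.

```cpp
#include <cstddef>

// Expands to __host__ __device__ under nvcc, and to nothing otherwise.
// (Hypothetical macro name, for this sketch only.)
#ifdef __CUDACC__
#define SKETCH_HD __host__ __device__
#else
#define SKETCH_HD
#endif

namespace sketch {

// Half-open integer range [first, last), traversed in steps of `stride`.
// Assumes stride > 0.
template <typename I>
class strided_range {
public:
    class iterator {
    public:
        constexpr SKETCH_HD iterator(I pos, I stride) : pos_(pos), stride_(stride) {}
        constexpr SKETCH_HD I operator*() const { return pos_; }
        constexpr SKETCH_HD iterator& operator++() { pos_ += stride_; return *this; }
        // "Not equal to end" is implemented as "not yet at or past end",
        // so the loop terminates even when the stride steps over `last`.
        constexpr SKETCH_HD bool operator!=(const iterator& end) const { return pos_ < end.pos_; }
    private:
        I pos_;
        I stride_;
    };

    constexpr SKETCH_HD strided_range(I first, I last, I stride)
        : first_(first), last_(last), stride_(stride) {}
    constexpr SKETCH_HD iterator begin() const { return iterator(first_, stride_); }
    constexpr SKETCH_HD iterator end() const { return iterator(last_, stride_); }

private:
    I first_, last_, stride_;
};

// Generic named constructor: positions first, first + stride, ... below length.
template <typename I>
constexpr SKETCH_HD strided_range<I> at_stride(I length, I first, I stride)
{
    return strided_range<I>(first, length, stride);
}

#ifdef __CUDACC__
// Grid-stride named constructor for a one-dimensional grid (device only).
template <typename I>
__device__ strided_range<I> at_grid_stride(I length)
{
    return at_stride<I>(length,
        static_cast<I>(blockIdx.x * blockDim.x + threadIdx.x),
        static_cast<I>(gridDim.x * blockDim.x));
}
#endif

} // namespace sketch

// Host-side usage example: sums the positions one "thread" would visit.
constexpr std::size_t sum_positions(std::size_t length, std::size_t first, std::size_t stride)
{
    std::size_t sum = 0;
    for (std::size_t pos : sketch::at_stride(length, first, stride)) {
        sum += pos;
    }
    return sum;
}

// Positions 1, 5 and 9 are visited for length 10, first 1, stride 4.
static_assert(sum_positions(10, 1, 4) == 15, "unexpected traversal");
```

Having the iterator comparison mean "still below the end" rather than strict inequality is a design shortcut that keeps the range-for loop correct for any positive stride; a stricter iterator would instead need the end position rounded up to a multiple of the stride.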