NVIDIA / cccl

CUDA Core Compute Libraries
https://nvidia.github.io/cccl/

[FEA]: `cuda::span_collection` #2938

Open davebayer opened 4 days ago

davebayer commented 4 days ago

Is this a duplicate?

Area

CUDA Experimental (cudax)

Is your feature request related to a problem? Please describe.

It is common to need to pass views over multiple arrays that all have exactly the same dimensions to a function or kernel. This leads to code duplication and to passing more arguments than necessary, since every view carries its own size even though the sizes are identical.
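
For illustration, a sketch of the status quo with plain `cuda::std::span` (dynamic extent), where each parameter stores its own pointer and size:

```cpp
#include <cuda/std/cstddef>
#include <cuda/std/span>

// Status quo: three separate view parameters. With dynamic-extent spans,
// each argument carries its own size even though all sizes are equal.
void vector_add(cuda::std::span<const int> a,
                cuda::std::span<const int> b,
                cuda::std::span<int>       c)
{
    for (cuda::std::size_t i = 0; i < c.size(); ++i)
    {
        c[i] = a[i] + b[i];
    }
}
```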

Describe the solution you'd like

I propose implementing a storage-optimized container of views, `cuda::span_collection`, that could be used like this:

```cpp
template<cuda::std::size_t extent>
void vector_add(cuda::span_collection<extent, const int, const int, int> params)
{
    for (auto [a, b, c] : params)
    {
        c = a + b;
    }
}

int main()
{
    static constexpr cuda::std::size_t size = 8;

    cuda::std::array<const int, size> a{1, 2, 3, 4, 5, 6, 7, 8};
    cuda::std::array<const int, size> b{1, 2, 3, 4, 5, 6, 7, 8};
    cuda::std::array<int, size>       c{};

    vector_add({size, a, b, c});
}
```

I tried playing with it a bit here: https://godbolt.org/z/9WWhjKxfq
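
To make the "storage optimized" part concrete, here is a minimal host-only sketch (standard library only, hypothetical names, not the proposed cudax API): it stores one pointer per viewed array plus a single static extent, and indexing yields a tuple of references.

```cpp
#include <array>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical sketch, not the proposed API: one pointer per viewed array
// plus a single shared (static) extent, instead of N independent spans.
template <std::size_t Extent, class... Ts>
class span_collection_sketch
{
public:
    template <class... Containers>
    explicit span_collection_sketch(Containers&... cs)
        : ptrs_{cs.data()...}
    {
        static_assert(sizeof...(Containers) == sizeof...(Ts), "one container per element type");
    }

    static constexpr std::size_t size()
    {
        return Extent;
    }

    // Index all viewed arrays at once, yielding a tuple of references.
    std::tuple<Ts&...> operator[](std::size_t i) const
    {
        return index_impl(i, std::index_sequence_for<Ts...>{});
    }

private:
    template <std::size_t... Is>
    std::tuple<Ts&...> index_impl(std::size_t i, std::index_sequence<Is...>) const
    {
        return {std::get<Is>(ptrs_)[i]...};
    }

    std::tuple<Ts*...> ptrs_; // no per-view size is stored
};

int main()
{
    constexpr std::size_t size = 8;

    std::array<int, size> a{1, 2, 3, 4, 5, 6, 7, 8};
    std::array<int, size> b{1, 2, 3, 4, 5, 6, 7, 8};
    std::array<int, size> c{};

    span_collection_sketch<size, const int, const int, int> params{a, b, c};

    for (std::size_t i = 0; i < params.size(); ++i)
    {
        auto [x, y, z] = params[i];
        z = x + y; // c[i] = a[i] + b[i]
    }
}
```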

What are your thoughts on this idea?

Describe alternatives you've considered

No response

Additional context

No response

bernhardmgruber commented 4 days ago

I believe we have something similar with thrust::zip_iterator, although we don't have a corresponding range type for it. C++23 has std::views::zip which should serve this purpose, and @miscco has somewhat of a ranges implementation here: #198.

Therefore, I don't see the need to introduce this feature, unless I have misunderstood it.
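
For reference, a rough host-side sketch of the `thrust::zip_iterator` approach mentioned above (using `std::vector` and the `thrust::host` policy for brevity):

```cpp
#include <thrust/execution_policy.h>
#include <thrust/for_each.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/tuple.h>
#include <vector>

int main()
{
    std::vector<int> a{1, 2, 3, 4, 5, 6, 7, 8};
    std::vector<int> b{1, 2, 3, 4, 5, 6, 7, 8};
    std::vector<int> c(a.size());

    // Zip the three ranges so a single iterator walks them in lockstep.
    auto first = thrust::make_zip_iterator(thrust::make_tuple(a.begin(), b.begin(), c.begin()));
    auto last  = thrust::make_zip_iterator(thrust::make_tuple(a.end(), b.end(), c.end()));

    thrust::for_each(thrust::host, first, last, [](auto t) {
        thrust::get<2>(t) = thrust::get<0>(t) + thrust::get<1>(t); // c[i] = a[i] + b[i]
    });
}
```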