NVIDIA / cccl

CUDA Core Compute Libraries
https://nvidia.github.io/cccl/

[FEA]: Add tests that verify Thrust/CUB algorithms satisfy necessary requirements about how many times they access inputs/outputs #1662

Open jrhemstad opened 6 months ago

jrhemstad commented 6 months ago

Area

General CCCL

Is your feature request related to a problem? Please describe.

As a user of Thrust/CUB algorithms, I want to be sure that they satisfy any explicit/implicit requirements about how many times elements in the input/output range are dereferenced.

For example, for_each must guarantee that each input element is dereferenced exactly once so that the operator can safely have side effects. We don't actually test that today.

Describe the solution you'd like

We should add testing utilities that verify input/output iterators are dereferenced only once and use those in the appropriate algorithms (e.g., for_each).
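To make the idea concrete, here is a minimal host-only sketch of such a utility. It uses std::for_each as a stand-in for the Thrust/CUB algorithm under test, and the names deref_counting_iterator and for_each_dereferences_once are hypothetical, not existing CCCL helpers. The wrapper records how often each position is dereferenced, and the check runs after the algorithm finishes.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical test helper (not part of Thrust/CUB): wraps a raw pointer and
// counts how many times each position in the range is dereferenced.
template <typename T>
class deref_counting_iterator {
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type        = T;
    using difference_type   = std::ptrdiff_t;
    using pointer           = T*;
    using reference         = T&;

    deref_counting_iterator() = default;
    deref_counting_iterator(T* base, int* counts, std::size_t index)
        : base_(base), counts_(counts), index_(index) {}

    // Every dereference bumps the counter for the current position.
    reference operator*() const {
        ++counts_[index_];
        return base_[index_];
    }

    deref_counting_iterator& operator++() { ++index_; return *this; }
    deref_counting_iterator operator++(int) { auto tmp = *this; ++index_; return tmp; }

    friend bool operator==(const deref_counting_iterator& a, const deref_counting_iterator& b) {
        return a.index_ == b.index_;
    }
    friend bool operator!=(const deref_counting_iterator& a, const deref_counting_iterator& b) {
        return !(a == b);
    }

private:
    T* base_            = nullptr;
    int* counts_        = nullptr;  // shared by all copies of the iterator
    std::size_t index_  = 0;
};

// Returns true if std::for_each dereferenced every element exactly once.
// The operator has a side effect (it increments each element in place).
inline bool for_each_dereferences_once(std::vector<int>& data) {
    std::vector<int> counts(data.size(), 0);
    deref_counting_iterator<int> first(data.data(), counts.data(), 0);
    deref_counting_iterator<int> last(data.data(), counts.data(), data.size());
    std::for_each(first, last, [](int& x) { x += 1; });
    return std::all_of(counts.begin(), counts.end(), [](int c) { return c == 1; });
}
```

A device-side version would need the counters in device memory and atomic increments, since many threads may dereference concurrently; this sketch is single-threaded and sidesteps that.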

Describe alternatives you've considered

No response

Additional context

Today, for_each is the only algorithm where we fundamentally require that each input element be dereferenced only once. However, there is interest in expanding this requirement to other algorithms as needed or where possible.

jrhemstad commented 6 months ago

Related to https://github.com/NVIDIA/cccl/issues/1661

jrhemstad commented 6 months ago

I'm thinking we could create a "SingleAccessIterator" adaptor type something like this. We could kill two birds with one stone and also make it do bounds checking.

template<typename Iterator>
class SingleAccessIterator {
public:
    using iterator_category = std::random_access_iterator_tag;
    using value_type        = typename std::iterator_traits<Iterator>::value_type;
    using difference_type   = typename std::iterator_traits<Iterator>::difference_type;
    using pointer           = typename std::iterator_traits<Iterator>::pointer;
    using reference         = typename std::iterator_traits<Iterator>::reference;

private:
    Iterator base_iterator;
    thrust::device_vector<cuda::std::atomic_flag> accessed;
    size_t index{};

public:
    reference operator*() {
        if (index >= accessed.size() || accessed[index].test_and_set()) {
            CCCL_FAIL("Element accessed more than once or out of bounds access");
        }
        return *base_iterator;
    }
...
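
For readers who want something compilable to experiment with, below is a host-only sketch of the same idea. It is not the proposed implementation: the name single_access_iterator_sketch is hypothetical, it assumes a random-access base iterator, it throws exceptions where the snippet above calls CCCL_FAIL, and it keeps the flags behind a std::shared_ptr (an assumption made here) so that copies of the iterator observe the same access state. It is not thread-safe.

```cpp
#include <cstddef>
#include <iterator>
#include <memory>
#include <stdexcept>
#include <vector>

// Host-only illustration; name and layout are hypothetical, not a CCCL API.
template <typename Iterator>
class single_access_iterator_sketch {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type        = typename std::iterator_traits<Iterator>::value_type;
    using difference_type   = typename std::iterator_traits<Iterator>::difference_type;
    using pointer           = typename std::iterator_traits<Iterator>::pointer;
    using reference         = typename std::iterator_traits<Iterator>::reference;

    // `first` is the start of the wrapped range, `size` its length.
    single_access_iterator_sketch(Iterator first, std::size_t size)
        : first_(first), accessed_(std::make_shared<std::vector<char>>(size, 0)) {}

    // Fails on an out-of-bounds position or on a second access to the same one.
    reference operator*() const {
        if (index_ >= accessed_->size()) {
            throw std::out_of_range("out-of-bounds access");
        }
        char& flag = (*accessed_)[index_];
        if (flag) {
            throw std::logic_error("element accessed more than once");
        }
        flag = 1;
        return first_[static_cast<difference_type>(index_)];
    }

    single_access_iterator_sketch& operator++() { ++index_; return *this; }
    single_access_iterator_sketch operator++(int) { auto tmp = *this; ++index_; return tmp; }

    // Past-the-end iterator that shares the same access flags.
    single_access_iterator_sketch end() const {
        auto copy   = *this;
        copy.index_ = accessed_->size();
        return copy;
    }

    friend bool operator==(const single_access_iterator_sketch& a,
                           const single_access_iterator_sketch& b) {
        return a.index_ == b.index_;
    }
    friend bool operator!=(const single_access_iterator_sketch& a,
                           const single_access_iterator_sketch& b) {
        return !(a == b);
    }

private:
    Iterator first_;
    std::shared_ptr<std::vector<char>> accessed_;  // one flag per element, shared by copies
    std::size_t index_ = 0;
};
```

Sharing the flags matters because algorithms copy iterators freely; if each copy owned its own flags, a second dereference through a copy would go unnoticed. A device version would additionally need atomic flags in device memory, as the snippet above suggests.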

jrhemstad commented 6 months ago

@gevtushenko it occurs to me that we could make this an intrinsic part of the c2h custom type input generator? Something like:

using type = c2h::custom_type_t<c2h::accumulateable_t,
                                c2h::equal_comparable_t, 
                                c2h::single_access_t,
                                c2h::bounds_check_t>;