Atomic Refs Bound to Memory Orderings & Atomic Accessors

The mdspan paper P0009 listed atomic accessors as a reason for having accessors in the first place.

One of the use cases is for parallel algorithms updating data in a way which has data races.

Consider the histogram computation:

template<class ExecT>
void compute_histogram(ExecT exec, float bin_size,
               std::mdspan<int, std::dextents<size_t,1>> output,
               std::mdspan<float, std::dextents<size_t,1>> data) {
  static_assert(std::is_execution_policy_v<ExecT>);

  std::for_each(exec, data.data_handle(), data.data_handle()+data.extent(0), [=](float val) {
    int bin = std::abs(val)/bin_size;
    bin = std::clamp(bin, size_t(0), output.extent(0));
    output[bin]++;
  });

Depending on whether ExecT is sequenced_policy or not, the update needs to happen atomically.

With just atomic_ref one could do:

  std::for_each(exec, data.data_handle(), data.data_handle()+data.extent(0), [=](float val) {
    int bin = std::abs(val)/bin_size;
    bin = std::clamp(bin, size_t(0), output.extent(0));
    atomic_ref(output[bin])++;
  });

This paper proposes a way for doing that for general mdspans without writing out atomic_ref everywhere inside a complex algorithm and also fixes the fact that you can't do the simple ++ with relaxed memory order - enough for these kind of accumulation cases.

What this paper proposes

"Bounded" versions of atomic_ref with a fixed (bounded) memory order
- atomic_ref_relaxed, atomic_ref_seq_cst, atomic_ref_acq_rel
- Conceptually just like atomic_ref with a fixed memory order - all the same members
Corresponding atomic accessors which use the various atomic_ref types as reference type
- atomic_accessor => atomic_ref
- atomic_accessor_relaxed => atomic_ref_relaxed
- atomic_accessor_acq_rel => atomic_ref_acq_rel
- atomic_accessor_seq_cst => atomic_ref_seq_cst

Histogram with atomic accessor

// sequenced_policy does not attach atomic accessor
template<class T, class Extents, class LayoutPolicy>
auto add_atomic_accessor_if_needed(
    std::execution::sequenced_policy, mdspan<T, Extents, LayoutPolicy> m) {
        return m;
 }

// parallel policies attach atomic accessor:
template<class ExecutionPolicy, class T, class Extents, class LayoutPolicy>
auto add_atomic_accessor_if_needed(
    ExecutionPolicy, mdspan<T, Extents, LayoutPolicy> m) {
        return mdspan(m.data_handle(), m.mapping(), atomic_accessor<T>());
}

template<class ExecT>
void compute_histogram(ExecT exec, float bin_size,
               std::mdspan<int, std::dextents<size_t,1>> output,
               std::mdspan<float, std::dextents<size_t,1>> data) {
  static_assert(std::is_execution_policy_v<ExecT>);

  auto accumulator = add_atomic_accessor_if_needed(exec, output);

  std::for_each(exec, data.data_handle(), data.data_handle()+data.extent(0), [=](float val) {
    int bin = std::abs(val)/bin_size;
    bin = std::clamp(bin, size_t(0), output.extent(0));
    // this is now atomic for parallel policies
    accumulator[bin]++;
  });

Previous Question: Why not expose the templated type?

There is really no use-case where one would want to write something generic over the memory order!
The algorithm determines the memory order you need - the only generic question we identified is whether you need atomics at all (i.e. what execution policy is used)
SG1 also felt that its easier to see the memory order with the explicit names
- We don't feel there is any real risk here and mdspan somewhat hides its accessor anyway

Wording Considerations

Used exposition only atomic-ref-bound<T,MemoryOrder> for wording
- atomic_ref_MEM_ORDER are template aliases - see open question whether this is ok
atomic_ref has 4 specializations: general, integral (not bool), floating point and pointer
- atomic-ref-bound needs only 2 (general and arithmetic/pointer{not-bool} versions)
- integral, floating point and pointer type versions can be defined in terms of atomic_ref with constraints on member functions
- the only reason we need general is difference_type member typedef -> doesn't exist in atomic_ref general

Open Questions

Is reasoning for having atomic-ref-bound exposition only good enough, or does LEWG want it as a public thing?
- Please Poll: atomic-ref-bound should be exposition only.
- Author position: in favor
Converting between atomic-ref-bound and basic-atomic-accessors of different memory orders
- We don't see many good reasons to have them:
- atomic_ref is meant to be almost used in-place for individual operations e.g. atomic_ref(a[i])++
- atomic accessors are something you are supposed to add in a fairly local scope
- we don't expect folks to pass mdspan with atomic accessors around
- There is a reason not to permit it:
- Writing functions which take atomic_ref_MEM_ORDER imply ordering behavior: allowing conversions would change that behavior
- Please Poll: atomic-ref-bound and basic-atomic-accessor should have converting constructors for different memory-orders
- Author position: against

ORNL / cpp-proposals-pub