mattkretz / wg21-papers

my papers to WG21 — the C++ committee

Constructing from a contiguous range - constraints and preconditions on sizes? #97

Open danieltowner opened 1 year ago

danieltowner commented 1 year ago

In https://isocpp.org/files/papers/P2876R0.html#contiguous_ranges we propose to add a constructor from a contiguous range:

basic_simd(std::ranges::contiguous_range auto x);

We already have a constructor for contiguous iterator, where the precondition is that the iterator is for a valid range. What constraints or preconditions should we have for contiguous_range, where we have more information about how big the range is?

Firstly, if the range is a known size (e.g., an array) then I propose that we constrain the simd to be exactly the same size. Constructing a simd from a range which is too small introduces the issue of what to put in unused elements, and constructing from a range which is too big silently truncates. I think that exactly the right size seems a reasonable constraint and no precondition would be needed. If users want different behaviour then they can use the proposed resize function to explicitly communicate their intent.

If the range has an unknown size then no constraint on size can be applied, but a precondition could be added which matches that of contiguous_iterator constructor (i.e., the range must be valid for all elements of the simd). This doesn't make use of the runtime size information that is available, but that could be used in a few ways:

I don't like any of these options much, so I think I would word it so that the precondition is that the range is valid (as for the contiguous_iterator constructor) and, when the extent is known, the constructor is constrained to ranges of the same size as the simd. An implementation would be free to provide an assertion or similar that checks that the range is the same size as the simd. Maybe a note can be added stating this intention.

Thoughts?

mattkretz commented 1 year ago

unordered thoughts:

mattkretz commented 1 year ago

Here's something my prototype can do:

void f(std::vector<float> const& data, int offset)
{
  basic_simd x = data[offset, simd<float>::size];
}

The magic happens via an operator[](std::contiguous_range const& rng, size_t offset, auto size) overload. Since size is passed as an integral_constant, the returned span extent is static and basic_simd can be deduced. Deduction isn't very important here, but slightly more interesting for

  basic_simd x = data[offset, std::cc<4>];

(or whatever we'll call std::cc (P2781)).

This solution is interesting because (in some way) it provides size-safe loads (to complement type-safe :wink:) from contiguous ranges. It's size-safe only on the simd side. The subscript operation on the contiguous range would probably still only have a precondition violation on out-of-bounds.

The complementary size-safe stores are harder because

  data[offset, x.size] = x;

requires a new span::operator=. It could possibly be implemented generically as a std::copy from the rhs range into the span's range. I'd expect compilers to turn these into a proper (aligned) SIMD store.

danieltowner commented 1 year ago
  • The deduction guides better be defined on basic_simd rather than the alias template simd.

That was something I wasn't sure about. I believe that C++20 allows deduction through an alias, so I wasn't clear on whether the proposal should have the aliased version, the basic_simd version, or both. In an experiment, GCC seemed to be happy with just the deduction on the alias, but Clang only accepted the deduction on the basic_simd.

mattkretz commented 1 year ago

This is what I raised on the LEWG Chat:

CTAD questions:

  1. Clang doesn't implement alias template CTAD yet, right?
  2. GCC fails to do CTAD for std::simd a = b; with misleading error messages if b doesn't have the native width. IIUC that's because the second template parameter is non-deducible but defaulted to the native width.

Reduced test-case: https://godbolt.org/z/3a6sEaPqn

Is there anything we can do? Does anyone want to re-discuss the names of basic_simd and simd because of CTAD?

TL;DR: If we define the deduction guide on simd then only the default-width simd can be deduced.
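For contrast, a deduction guide declared on the class template itself can deduce any width. The sketch below uses toy names (basic_vec, vec, native_width) and an assumed native width of 4; it is not the reduced test case linked above and deliberately avoids CTAD through the alias, since compiler support for that differs.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Toy stand-in for basic_simd (an assumption, not the proposed class).
template <class T, std::size_t N>
struct basic_vec {
    std::array<T, N> lanes{};

    // T cannot be deduced from this constructor alone, so class template
    // argument deduction relies on the guide below.
    template <class U>
    explicit basic_vec(const std::array<U, N>& a)
    {
        for (std::size_t i = 0; i < N; ++i)
            lanes[i] = static_cast<T>(a[i]);
    }
};

// Guide on the class template: both the element type and the width are
// taken from the argument, so non-default widths deduce as well.
template <class U, std::size_t N>
basic_vec(const std::array<U, N>&) -> basic_vec<U, N>;

// Hypothetical native width, standing in for the ABI-dependent default.
inline constexpr std::size_t native_width = 4;

// Toy stand-in for the simd alias with a defaulted width.
template <class T, std::size_t N = native_width>
using vec = basic_vec<T, N>;
```

Here basic_vec v(arr) with a std::array<float, 8> deduces basic_vec<float, 8>, a width other than the alias's default.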