Open danieltowner opened 1 year ago
unordered thoughts:
- The deduction guides better be defined on basic_simd rather than the alias template simd. My span deduction guide:
template <class _Tp, size_t _Extent>
  basic_simd(std::span<_Tp, _Extent>) -> basic_simd<_Tp, __detail::__deduce_t<_Tp, _Extent>>;
template <std::ranges::contiguous_range _Rg>
  basic_simd(const _Rg& x)
  -> basic_simd<std::ranges::range_value_t<_Rg>,
                __detail::__deduce_t<std::ranges::range_value_t<_Rg>,
                                     __detail::__static_range_size<_Rg>>>;
with
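template <std::ranges::sized_range _Tp>
  constexpr inline size_t
  __static_range_size = []() {
    // A compound requirement tests decltype((expr)), which is a reference type for
    // lvalues such as _Tp::extent, so test convertibility instead of std::integral.
    if constexpr (requires { {_Tp::size()} -> std::convertible_to<size_t>; })
      return _Tp::size();
    else if constexpr (requires { {_Tp::extent} -> std::convertible_to<size_t>; })
      return _Tp::extent;
    // Name tuple_size<_Tp>::value directly so that non-tuple-like types fail softly.
    else if constexpr (requires { {std::tuple_size<_Tp>::value} -> std::convertible_to<size_t>; })
      return std::tuple_size<_Tp>::value;
    else
      return std::dynamic_extent;
  }();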
This would allow user-defined ranges to be deduced correctly, if they implement the necessary duck-typing.
[...] simd by calling std::ranges::begin on it. So I'm unconvinced that we want a simd constructor from contiguous_range where the size isn't known at compile-time. And it feels wrong to have a throwing simd constructor. If anything it's a precondition, which may become diagnosable in a standard way if we ever get Contracts into the language.
Here's something my prototype can do:
void f(std::vector<float> const& data, int offset)
{
  basic_simd x = data[offset, simd<float>::size];
}
The magic happens via an operator[](std::contiguous_range const& rng, size_t offset, auto size) overload. Since size is passed as an integral_constant, the returned span extent is static and basic_simd can be deduced. Deduction isn't very important here, but slightly more interesting for
basic_simd x = data[offset, std::cc<4>];
(or whatever we'll call std::cc (P2781)).
This solution is interesting because (in some way) it provides size-safe loads (to complement type-safe :wink:) from contiguous ranges. It's size-safe only on the simd side. The subscript operation on the contiguous range would probably still only have a precondition violation on out-of-bounds.
The complementary size-safe stores are harder because
data[offset, x.size] = x;
requires a new span::operator=. It could possibly be implemented generically as a std::copy from the rhs range into the span's range. I'd expect compilers to turn these into a proper (aligned) SIMD store.
- The deduction guides better be defined on basic_simd rather than the alias template simd.
That was something I wasn't sure about. I believe that C++23 allows deduction through an alias, so I wasn't clear on whether the proposal should have the aliased version, the basic_simd version, or both. In an experiment, GCC seemed to be happy with just the deduction on the alias, but Clang only accepted the deduction on basic_simd.
This is what I raised on the LEWG Chat:
CTAD questions:
- Clang doesn't implement alias template CTAD yet, right?
- GCC fails to do CTAD for std::simd a = b; with misleading error messages if b doesn't have the native width. IIUC that's because the second template parameter is non-deducible but defaulted to the native width. Reduced test-case: https://godbolt.org/z/3a6sEaPqn
Is there anything we can do? Does anyone want to re-discuss the names of basic_simd and simd because of CTAD?
TL;DR: If we define the deduction guide on simd then only default-width simd can be deduced.
In https://isocpp.org/files/papers/P2876R0.html#contiguous_ranges we propose to add a constructor from a contiguous range:
basic_simd(std::ranges::contiguous_range auto x);
We already have a constructor for contiguous iterator, where the precondition is that the iterator is for a valid range. What constraints or preconditions should we have for contiguous_range, where we have more information about how big the range is?
Firstly, if the range is a known size (e.g., an array) then I propose that we constrain the simd to be exactly the same size. Constructing a simd from a range which is too small introduces the issue of what to put in unused elements, and constructing from a range which is too big silently truncates. I think that exactly the right size seems a reasonable constraint and no precondition would be needed. If users want different behaviour then they can use the proposed resize function to explicitly communicate their intent.
If the range has an unknown size then no constraint on size can be applied, but a precondition could be added which matches that of the contiguous_iterator constructor (i.e., the range must be valid for all elements of the simd). This doesn't make use of the runtime size information that is available, but that could be used in a few ways:
[...] std::max(range.size(), simd.size()) values. The runtime size is used to read just enough data without causing memory read issues, but then the user would be unaware of any truncations or inserts to match the sizes.
I don't like any of these options much, so I think I would make the wording have a precondition that the range is valid (like contiguous_iterator) and, when the extent is known, a constraint that it is the same size as the simd. An implementation would be free to provide an assertion or similar that checks that the range is the same size as the simd. Maybe a note can be added stating this intention.
Thoughts?