Closed johan-overbye closed 2 years ago
@johan-overbye wrote:
First of all, thank you so much for this amazing proposal. Really hope it makes it into the standard!
Thanks for the kind words! WG21 voted it into C++ on Monday, so it is officially in the Standard (at least the current Standard draft).
... is that
submdspan
yields a dynamic extent even if a static extent is possible.
PR https://github.com/kokkos/mdspan/pull/159 , which merged yesterday, fixes this. You can see examples in the tests here. To ensure a compile-time extent corresponding to a given slice specifier:
std::integral_constant<Integral, k>
instead of an integer k
std::tuple<std::integral_constant<Integral, b>, std::integral_constant<Integral, e>>
instead of a pair or tuple of integersNote that submdspan
was removed from the proposal at LWG review time, before the plenary vote. Thus, we still have the opportunity to improve submdspan
. This new feature will be added to the separate submdspan
proposal in progress.
Thank you for your feedback and for trying out mdspan! I'll close the issue as a duplicate of the issue resolved by PR #159, but please feel free to reopen if this new feature does not meet your needs.
@mhoemmen Wow, congrats on the vote! That's great news. 🥳
Thanks for the heads-up on this new feature, however it doesn't quite seem to solve the use case I described unless I'm missing something.
In the use case above, the idea was that one or more extents of the subset are compile-time determined, but the corresponding offsets are run-time determined. This doesn't seem to be covered by the addition you describe, as it seems like the second element of the tuple/pair represents the past-the-end index rather than the extent of the subset, based on testing an earlier version of submdspan.
I've implemented and used two different libraries that resemble limited versions of mdspan
/mdarray
, and I've found that this use case crops up a lot, e.g. in linear algebra.
Wait a second, I see now that this new submdspan does support the particular example I highlighted above (grabbing a matrix row at a run-time-determined row index), because the row dimension is "collapsed". Very cool!
However I'd still argue the that enabling a run-time offset and a compile-time extent would be very useful. @mhoemmen
Just to clarify for other readers: your example using current submdspan
would look like this.
mdspan<float, extents<size_t, 4, 4>> matrix{ /* ... */ };
// row is layout_right and has extents<size_t, 4>
auto row = submdspan(matrix, 1, full_extent);
// row2 is layout_right and has extents<size_t, 4>
auto row2 = submdspan(matrix, integral_constant<size_t, 1>{}, full_extent);
// rows is layout_right and has extents<size_t, dynamic_extent, 4>
auto rows = submdspan(matrix, tuple{1, 3}, full_extent);
// rows2 is layout_right and has extents<size_t, 2, 4>
auto rows2 = submdspan(matrix, tuple{integral_constant<size_t, 1>{}, integral_constant<size_t, 3>{}}, full_extent);
// col is layout_stride and has extents<size_t, 4>.
auto col = submdspan(matrix, full_extent, 1);
// col2 is layout_stride and has extents<size_t, 4>.
// The strides can all be stored as compile-time constants.
auto col2 = submdspan(matrix, full_extent, integral_constant<size_t, 1>{});
// cols is layout_stride and has extents<size_t, 4, dynamic_extent>.
auto cols = submdspan(matrix, full_extent, tuple{1, 3});
// cols2 is layout_stride and has extents<size_t, 4, 2>.
// The strides can all be stored as compile-time constants.
auto cols2 = submdspan(matrix, full_extent, tuple{integral_constant<size_t, 1>{}, integral_constant<size_t, 3>{}});
@johan-overbye wrote:
However I'd still argue the that enabling a run-time offset and a compile-time extent would be very useful.
I'm not quite sure I understand what you mean. Could you give an example please?
If you pass in a single integer as a slice specifier, the corresponding extent of the input goes away; it does not appear in the extents
of the return type of submdspan
. It doesn't matter whether the integer is a run-time value or integral_constant
.
The only way to change the extents
of the return type of submdspan
is to pass in a pair or tuple of two integers as a slice specifier. In that case, if you pass in a pair or tuple of integral_constant
, the corresponding extent of the returned mdspan will be a compile-time extent (not dynamic_extent
). If you pass in a pair or tuple of integers (like int
or size_t
, you'll get dynamic_extent
. C++ doesn't have a way for a function to change its return type based on the value of a function parameter. This is because you can't declare a function parameter constexpr
. (That's why we added this integral_constant
feature.)
If you have a pair of constexpr
integer values, you can put them into the extents type by hand, via assignment. Here is an example.
mdspan<float, extents<size_t, 4, 4>> matrix{ /* ... */ };
constexpr int b = 1;
constexpr int e = 3;
// rows is layout_right and has extents<size_t, dynamic_extent, 4>
auto rows = submdspan(matrix, tuple{b, e}, full_extent);
// Assigning run-time to compile-time extents is legal
// as long as the extents are equal. Doing this generically
// requires knowing the correct layout and accessor.
mdspan<float, extents<size_t, e - b>> rows3 = rows;
@mhoemmen, thanks again for your time. These are super helpful examples.
Let's say I want a constexpr-sized 2x2 submatrix of mdspan<float, extents<size_t, 4, 4>> matrix{ /* ... */ };
, starting at an arbitrary non-constexpr offset int i, j;
. Meaning that matrix[i, j]
corresponds to submatrix[0, 0]
.
Ideally submatrix
's extents should be extents<size_t, 2, 2>
, but I can't achieve that (directly) with the current submdspan
as far as I can tell.
If you have a pair of
constexpr
integer values, you can put them into the extents type by hand, via assignment. Here is an example.
Great tip! This workaround definitely does the trick, but now that I've had my dream array/span system fall into my lap as if by magic, I've gotten greedy. 😄
Maybe it'd be easier to cater for what I'm suggesting if a slice were specified as start and extent rather than start and past-end.
@johan-overbye wrote:
Let's say I want a constexpr-sized 2x2 submatrix of
mdspan<float, extents<size_t, 4, 4>> matrix{ /* ... */ };
, starting at an arbitrary non-constexpr offsetint i, j;
. Meaning thatmatrix[i, j]
corresponds tosubmatrix[0, 0]
.
Thank you for explaining! You're right that this is not a feature of the current proposal. If you had constexpr i
and j
, you could do the following,
template<size_t Value>
using IC = std::integral_constant<size_T, Value>; // abbreviation
mdspan<float, extents<size_t, 4, 4>> matrix{ /* ... */ };
constexpr auto rows = std::tuple{IC<i>{}, IC<i+2>{}};
constexpr auto cols = std::tuple{IC<j>{}, IC<j+2>{}};
submatrix = submdspan(matrix, rows, cols);
static_assert(std::is_same_v<decltype(submatrix)::extents_type, extents<size_t, 2, 2>>);
but you don't have constexpr i
and j
, so this wouldn't work.
Maybe it'd be easier to cater for what I'm suggesting if a slice were specified as start and extent rather than start and past-end.
This is good thinking, but the current design is consistent with Fortran and Python, and is unlikely to change at this point. We could, however, extend submdspan
in the future to accept different slice specifier types.
For very small matrices, if you want best performance, you might not want to create subviews of the matrix at all (which is what submdspan
does). This is because mdspan always stores a pointer, and creating an mdspan with non-constexpr offsets requires pointer arithmetic. That's 64 bits of storage, and 64-bit arithmetic, compared with whatever tiny value type (e.g., int8_t
) you were storing in the mdarray's std::array
container. I wrote a bit about this in P1674.
Another option would be to do batched computations, so that you can amortize the pointer storage and arithmetic overhead over all the matrices in the batch.
Thanks again for your suggestions and explanations!
This is good thinking, but the current design is consistent with Fortran and Python, and is unlikely to change at this point. We could, however, extend
submdspan
in the future to accept different slice specifier types.
Gotcha. I'd love such a feature personally, but the current submdspan
does already cover about 90% of my use cases I think.
For very small matrices, if you want best performance, you might not want to create subviews of the matrix at all
Thanks for the heads-up! But it's not a big concern in my case as most of my uses for mdspan
and mdarray
are about optimising for conciseness and malleability rather than performance. If performance is crucial, SIMD or GPGPU will often be the way to go anyway.
Thanks a lot Mark, already super excited about this proposal. Swapped out my homebrew multi-dimensional array class for mdspan
and mdarray
in my maths library for a hobby project, and I'm loving it.
Just a couple of observations:
std::vector
over std::array
as mdarray
's container class when its extents are all static. Was a little bit surprised that it doesn't default to std::array
in that case.mdarray
s with all static extents seem to require an extra 4 bytes on top of the container. (Latest MSVC for x86-64 with /std:c++latest
.)mdarray
lacks a copy assignment operator. Intentional?@johan-overbye wrote:
Thanks a lot Mark, already super excited about this proposal. Swapped out my homebrew multi-dimensional array class for
mdspan
andmdarray
in my maths library for a hobby project, and I'm loving it.
Thanks so much! : - ) We really appreciate your interest and early adoption!
Please note that the reference implementations of both mdspan
and mdarray
live in the https://github.com/kokkos/mdspan
repository now. The old kokkos/mdarray
repository is no longer supported. Also, both implementations have been changed recently to match the latest revisions of the proposals. I would recommend updating your versions if you haven't already. Thanks!
I can't think of a situation where I'd want
std::vector
overstd::array
asmdarray
's container class when its extents are all static.
Thank you for your feedback! We've had a few LEWG discussions about the default container type. It's possible that this will change. There are a couple reasons why users might want to use vector
even with all compile-time extents.
vector
move is O(1); array
move is O(size()
).)
mdarray
s with all static extents seem to require an extra 4 bytes on top of the container. (Latest MSVC for x86-64 with/std:c++latest
.)
Implementations might be able to fix that with [[no_unique_address]]
. The reference implementation of mdarray
doesn't currently use that optimization:
but the reference implementation of mdspan
does:
I'll file an implementation issue; thank you for reporting this!
mdarray
lacks a copy assignment operator. Intentional?
The latest revision (R3) of the proposal has one. You can find the latest released revision using the "wg21.link" link service: https://wg21.link/p1684.
I would recommend updating your versions if you haven't already. Thanks!
Yep, all updated, thanks!
There are a couple reasons why users might want to use vector even with all compile-time extents.
Those make sense, thanks.
The reference implementation of mdarray doesn't currently use that optimization
OK, no problem! Just thought I'd flag it.
The latest revision (R3) of the proposal has one.
Brilliant, thank you.
Can't wait to see what's next. Thanks again for your time, and good luck!
First of all, thank you so much for this amazing proposal. Really hope it makes it into the standard!
One thing that's putting a slight damper on its usefulness, is that
submdspan
yields a dynamic extent even if a static extent is possible. In fact it looks like it can't be implemented to yield static extents, as the extents are determined by function arguments. Also,mdarray
support is sorely missed.Here's a suggestion for an alternate subset function:
The idea is that the desired extents of the subset are passed as template parameters, while the offsets are passed as function arguments.
collapse_dimension
can be of a dummy type, and indicate that that extent should be "collapsed" or removed from the resulting subset, similarly to passing a number instead of a pair to the currentsubmdspan
.