Closed mhoemmen closed 1 year ago
@crtrott wrote:
You are storing two extents objects here, even though only a single dimension is padded. I think we can simplify this.
An excellent point -- I had originally thought that we wanted to pad all but a single dimension; that would have justified the current design more in terms of code reuse. Padding just a single dimension suggests a different way to compute strides.
Also, I don't particularly like the storage order part. For example: why not simply take another layout policy as an argument, if we store an inner layout anyway? ... That is assuming that simply introducing `layout_left_padded` and `layout_right_padded` is not the right thing to do in the first place.
We had a nice chat about this offline -- thanks!
Here is a summary of suggested revisions from this morning's offline discussion with Christian.
`layout_padded` and the "BLAS general layouts" in P1673 have the same mathematical properties.
- Both are like `layout_stride`, but with one of the extents fixed to unit stride at compile time.
- The difference from `layout_stride` is that the padding elements are "valid," that is, accessible. This means we can use them in optimizations -- e.g., to make copies contiguous. (This implies, in turn, that the only valid "nested layouts" for `layout_padded` are `layout_left` or `layout_right`. It doesn't make sense to template on a general "inner layout.")
Here are the only differences between the current `layout_padded` design and P1673's BLAS general layouts.
- `layout_padded` expresses padding via "overalignment factor." This is really just an interface convenience for use with `aligned_accessor` to make an aligned mdspan. Once `layout_padded` computes the padded extents, it no longer needs the overalignment factor. (We could create an alias and/or function to express a mapping from overalignment factor to padded layout.)
This suggests that we only need two new layouts to express both `layout_padded` and P1673's BLAS general layouts: `layout_left_padded` and `layout_right_padded`. We don't use an enum to distinguish the two cases, for the same reason that `layout_left` and `layout_right` are separate types and not one type with an enum non-type template parameter.
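To make the relationship between overalignment factor and padded extent concrete, here is a minimal rank-2 sketch. It is not the code from this PR: `padded_extent`, `left_padded_mapping_2d`, `padded_storage_size`, and `make_left_padded` are hypothetical names, and the sketch assumes that the padding of the last column counts as part of the storage.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the PR): round `extent` up to the next
// multiple of `overalignment`. Assumes overalignment > 0.
constexpr std::size_t padded_extent(std::size_t extent, std::size_t overalignment) {
  assert(overalignment > 0);
  return ((extent + overalignment - 1) / overalignment) * overalignment;
}

// Hypothetical rank-2 column-major mapping whose leftmost extent is padded.
// Note that the overalignment factor itself is not stored.
struct left_padded_mapping_2d {
  std::size_t extent0;  // logical number of rows
  std::size_t extent1;  // logical number of columns
  std::size_t padded0;  // padded number of rows, i.e. the stride between columns

  // The leftmost index always has stride 1.
  constexpr std::size_t operator()(std::size_t i, std::size_t j) const {
    return i + padded0 * j;
  }

  // Assumption: padding elements are accessible, so they count as storage.
  constexpr std::size_t padded_storage_size() const {
    return padded0 * extent1;
  }
};

// Hypothetical factory: the overalignment factor is only needed here.
constexpr left_padded_mapping_2d make_left_padded(std::size_t rows,
                                                  std::size_t cols,
                                                  std::size_t overalignment) {
  return {rows, cols, padded_extent(rows, overalignment)};
}
```

For a 5 x 3 matrix with overalignment factor 4, the padded leftmost extent is 8, so consecutive columns start 8 elements apart. A `layout_right_padded` analogue would pad the rightmost extent instead and give the rightmost index stride 1.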
As a size optimization, we only need to store the padded extents and the one input extent that was padded (and thus differs from the corresponding padded extent). We could use `extents<index_type, InputExtent>` (or its underlying "partially static array" representation) to represent the one input extent as either a compile-time or a run-time value.
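The "compile-time or run-time value" idea can be sketched without mdspan at all. The following is an illustration under assumptions, not the representation used in the implementation: `maybe_static_extent` and the sentinel `dyn` (which plays the role of `std::dynamic_extent`) are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <type_traits>

// Hypothetical sentinel meaning "this extent is only known at run time".
inline constexpr std::size_t dyn = std::numeric_limits<std::size_t>::max();

// Compile-time case: the value lives in the type, so nothing is stored.
template <std::size_t StaticValue>
struct maybe_static_extent {
  constexpr maybe_static_extent() = default;
  constexpr explicit maybe_static_extent(std::size_t x) {
    assert(x == StaticValue);  // a run-time value must agree with the static one
    (void)x;
  }
  constexpr std::size_t value() const { return StaticValue; }
};

// Run-time case: exactly one index is stored.
template <>
struct maybe_static_extent<dyn> {
  std::size_t v = 0;
  constexpr maybe_static_extent() = default;
  constexpr explicit maybe_static_extent(std::size_t x) : v(x) {}
  constexpr std::size_t value() const { return v; }
};

// The static case is an empty class; the dynamic case costs one size_t.
static_assert(std::is_empty_v<maybe_static_extent<5>>);
static_assert(sizeof(maybe_static_extent<dyn>) == sizeof(std::size_t));
```

A mapping could hold the static case as an empty base class or a `[[no_unique_address]]` member (C++20) so that a compile-time input extent adds no bytes to the mapping.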
It would make sense for `submdspan` of a `layout_(left,right)` mdspan, with the appropriate slices, to produce a `layout_(left,right)_padded` mdspan.
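The index arithmetic behind that observation can be sketched for the rank-2 column-major case. This is plain arithmetic rather than a call to `submdspan`; `left_padded_view_2d` and `submatrix_of_layout_left` are hypothetical names, and the sketch folds the offset into the mapping for brevity, whereas a real `submdspan` would apply it to the data handle.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type: a column-major view whose column stride
// (padded0) may exceed its number of rows (extent0).
struct left_padded_view_2d {
  std::size_t offset;   // position of element (0, 0) in the parent storage
  std::size_t extent0;  // rows of the submatrix
  std::size_t extent1;  // columns of the submatrix
  std::size_t padded0;  // stride between columns

  constexpr std::size_t operator()(std::size_t i, std::size_t j) const {
    return offset + i + padded0 * j;
  }
};

// Take rows [r0, r1) and columns [c0, c1) of a rows x cols column-major
// (layout_left) matrix. The leftmost index keeps stride 1, and the column
// stride stays equal to the parent's row count: a "left padded" layout.
constexpr left_padded_view_2d submatrix_of_layout_left(
    std::size_t rows, std::size_t cols,
    std::size_t r0, std::size_t r1,
    std::size_t c0, std::size_t c1) {
  assert(r0 <= r1 && r1 <= rows);
  assert(c0 <= c1 && c1 <= cols);
  return {r0 + rows * c0, r1 - r0, c1 - c0, rows};
}
```

For a 6 x 4 parent, rows [1, 4) and columns [2, 4) give a 3 x 2 view with column stride 6. The logical leftmost extent is 3 while the "padded" extent is 6, which is the same shape of information a padded layout carries.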
I've converted this to a draft, so I can finish implementing the new `layout_{left,right}_padded` design that Christian and I discussed a few days ago. This will make it possible to unify the P1673 "BLAS general layouts" with padded layouts.
@crtrott @dalg24 @nliber This PR is ready for re-review; thanks!
I've handed off this work to a collaborator for now.
This PR is superseded by PR #237. Thanks all!
I added a `layout_padded` example. This is a strided layout that ensures a given overalignment of the stride-1 extent (either the leftmost, for column-major mode, or the rightmost, for row-major mode).
I've tested this on a few different compilers, as one can see here: https://godbolt.org/z/3Kn833dj8
- `-std=c++2b`
- `-std=c++20`
- `-std=c++17`
- `size_t`!) MSVC 19.32 `/std:c++latest`
- `-std=c++14`