Define submdspan_mapping for layout_transpose
Problem
Calling transposed on an mdspan with a custom layout CustomLayout (currently, any layout that is not layout_left, layout_right, or layout_stride) results in an mdspan whose layout is layout_transpose<CustomLayout>. layout_transpose<CustomLayout> does not define submdspan_mapping. As a result, users can't take a submdspan of the result of transposed for their custom mapping, even when it would make sense to do so.
Relevance
Users may want to treat transposed as a general facility in their own algorithms.
P1673 implementers may find it useful to implement some algorithms generically. For example, it should be possible to write a "cache-oblivious" recursive version of matrix_product for custom layouts, as long as they are unique layouts. However, users would find it surprising for (say) the transposed(A) case to fall back to a slower implementation, just because submdspan doesn't work on transposed(A).
Suggested fix
Add a submdspan_mapping overload for A with layout_transpose<CustomLayout>, constrained on submdspan_mapping(mdspan{A.data_handle(), A.mapping().nested_mapping(), A.accessor()}) being well-formed. Define the overload so that submdspan_mapping(transposed(A), slice_r, slice_c) is submdspan_mapping(A, slice_c, slice_r).
We don't need to handle any number of slice specifiers other than 2, because layout_transpose currently requires rank() == 2. LWG was reluctant to let P1673 keep analogous "generality hooks" for later adoption of the batched proposal.
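The slice-swapping rule in the suggested fix can be illustrated with a small standalone sketch. It deliberately avoids <mdspan> so that it compiles with older toolchains. All names in it (CustomMapping, TransposeMapping, Slice, sub_mapping, transposed_slice_matches) are hypothetical stand-ins, not part of P1673 or the standard library. As a simplification, the sketch folds the sub-view's offset into the returned mapping, whereas the real submdspan_mapping returns the mapping and the offset separately. It also assumes the result is wrapped in the transpose mapping again, and it only handles range slices that keep both dimensions.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a user's custom rank-2 layout mapping:
// column-major storage with leading dimension ld and a base offset.
struct CustomMapping {
    std::size_t rows, cols, ld, offset;
    constexpr std::size_t extent(int r) const { return r == 0 ? rows : cols; }
    constexpr std::size_t operator()(std::size_t i, std::size_t j) const {
        return offset + i + j * ld;
    }
};

// Half-open index range [first, second), standing in for a pair-like slice specifier.
using Slice = std::pair<std::size_t, std::size_t>;

// Hypothetical analogue of submdspan_mapping for CustomMapping.
// Simplification: the offset is folded into the returned mapping.
constexpr CustomMapping sub_mapping(const CustomMapping& m, Slice r, Slice c) {
    return {r.second - r.first, c.second - c.first, m.ld, m(r.first, c.first)};
}

// Analogue of layout_transpose<Layout>::mapping: swaps the two indices.
template <class Nested>
struct TransposeMapping {
    Nested nested;
    constexpr const Nested& nested_mapping() const { return nested; }
    constexpr std::size_t extent(int r) const { return nested.extent(1 - r); }
    constexpr std::size_t operator()(std::size_t i, std::size_t j) const {
        return nested(j, i);
    }
};

// Sketch of the suggested overload: slice the nested mapping with the row and
// column slices swapped, then wrap the result in the transpose mapping again.
// The trailing return type removes the overload from consideration when the
// nested mapping has no sub_mapping of its own.
template <class Nested>
constexpr auto sub_mapping(const TransposeMapping<Nested>& m, Slice r, Slice c)
    -> decltype(TransposeMapping<Nested>{sub_mapping(m.nested_mapping(), c, r)}) {
    return {sub_mapping(m.nested_mapping(), c, r)};
}

// Checks element by element that the sliced transposed view addresses the
// same storage as indexing the transposed view at the shifted positions.
inline bool transposed_slice_matches() {
    const CustomMapping a{4, 5, 6, 0};            // 4 x 5 matrix, leading dimension 6
    const TransposeMapping<CustomMapping> at{a};  // 5 x 4 transposed view
    const Slice r{1, 4}, c{0, 2};                 // rows 1..3, columns 0..1 of at
    const auto sub = sub_mapping(at, r, c);
    for (std::size_t i = 0; i < sub.extent(0); ++i)
        for (std::size_t j = 0; j < sub.extent(1); ++j)
            if (sub(i, j) != at(r.first + i, c.first + j)) return false;
    return sub.extent(0) == 3 && sub.extent(1) == 2;
}
```

Constraining through the return type mirrors the "being well-formed" wording: a custom layout that provides no slicing support of its own gets no slicing support for its transpose either.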