Define `submdspan_mapping` for `layout_transpose`

Problem

Calling transposed on an mdspan with a custom layout CustomLayout (currently, any layout that is not layout_left, layout_right, or layout_stride) results in layout_transpose<CustomLayout>. layout_transpose<CustomLayout> does not define submdspan_mapping. As a result, users cannot take a submdspan of the result of transposed for their custom mapping, even when it would make sense to do so.

Relevance

Users may want to treat transposed as a general facility in their own algorithms.
P1673 implementers may find it useful to implement some algorithms generically. For example, it should be possible to write a "cache-oblivious" recursive version of matrix_product for custom layouts, as long as they are unique layouts. However, users would find it surprising for (say) the transposed(A) case to fall back to a slower implementation, just because submdspan doesn't work on transposed(A).

Suggested fix

Add a submdspan_mapping overload for A with layout_transpose<CustomLayout>, constrained on submdspan_mapping(mdspan{A.data_handle(), A.mapping().nested_mapping(), A.accessor()}) being well-formed. Define the overload so that submdspan_mapping(transposed(A), slice_r, slice_c) is submdspan_mapping(A, slice_c, slice_r).

We don't need to account for numbers of slice specifiers other than 2, because layout_transpose currently requires rank() == 2. LWG was not in favor of P1673 leaving analogous "generality hooks" in place for later adoption of the batched proposal.
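
To make the slice-swapping idea concrete, here is a minimal toy model of the suggested fix. It deliberately avoids the real mdspan and submdspan headers, so every name in it (custom_mapping, range, sub_result, sub_mapping, transpose_mapping) is a hypothetical stand-in rather than proposed wording. It also assumes that the sliced result is wrapped in the transposing mapping again so that index order stays consistent; the actual specification may express this differently.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical custom 2-D layout: column-major with a padded leading dimension.
struct custom_mapping {
    std::size_t rows, cols, ld;  // ld >= rows
    constexpr std::size_t extent(int r) const { return r == 0 ? rows : cols; }
    constexpr std::size_t operator()(std::size_t i, std::size_t j) const {
        return i + j * ld;
    }
};

// Half-open index range [first, last); stands in for a slice specifier.
struct range { std::size_t first, last; };

// Stand-in for a submdspan_mapping result: sub-mapping plus data offset.
template <class Mapping>
struct sub_result { Mapping mapping; std::size_t offset; };

// The custom layout's own slicing hook (the analogue of submdspan_mapping).
constexpr sub_result<custom_mapping>
sub_mapping(const custom_mapping& m, range r, range c) {
    return { custom_mapping{r.last - r.first, c.last - c.first, m.ld},
             m(r.first, c.first) };
}

// Toy analogue of layout_transpose<Layout>::mapping (rank 2 only).
template <class Nested>
struct transpose_mapping {
    Nested nested;
    constexpr const Nested& nested_mapping() const { return nested; }
    constexpr std::size_t extent(int r) const { return nested.extent(1 - r); }
    constexpr std::size_t operator()(std::size_t i, std::size_t j) const {
        return nested(j, i);
    }
};

// Sketch of the suggested overload. The default template argument removes it
// from overload resolution unless the nested mapping can itself be sliced.
template <class Nested,
          class Inner = decltype(sub_mapping(std::declval<const Nested&>(),
                                             range{}, range{}))>
constexpr auto sub_mapping(const transpose_mapping<Nested>& m, range r, range c) {
    // Swap the slices: rows of the transposed view are columns of the nested one.
    Inner inner = sub_mapping(m.nested_mapping(), c, r);
    return sub_result<transpose_mapping<decltype(inner.mapping)>>{
        {inner.mapping}, inner.offset};
}

// Checks that slicing the transposed view addresses the same elements as
// indexing the transposed view directly at shifted indices.
constexpr bool transposed_slice_matches() {
    custom_mapping a{4, 6, 8};                // 4 x 6 matrix, leading dimension 8
    transpose_mapping<custom_mapping> at{a};  // 6 x 4 transposed view
    range slice_r{1, 5};                      // rows of the transposed view
    range slice_c{0, 3};                      // columns of the transposed view
    auto sub = sub_mapping(at, slice_r, slice_c);
    for (std::size_t i = 0; i < sub.mapping.extent(0); ++i)
        for (std::size_t j = 0; j < sub.mapping.extent(1); ++j)
            if (sub.offset + sub.mapping(i, j) !=
                at(slice_r.first + i, slice_c.first + j))
                return false;
    return sub.mapping.extent(0) == 4 && sub.mapping.extent(1) == 3;
}
```

In this sketch, a nested mapping without its own sub_mapping hook simply has no transposed overload, which mirrors the constraint described above. Only two slice arguments are handled, matching the rank() == 2 requirement.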

ORNL / cpp-proposals-pub