ORNL / cpp-proposals-pub

Collaborating on papers for the ISO C++ committee - public repo

Status of 'preferred array extents mechanism' for mdspan #83

Open willwray opened 5 years ago

willwray commented 5 years ago

What is the status of C-style array syntax for extents T[M][N] for mdspan? Is it still 'preferred'?

At some point around r5 of P0009 this 'preferred' natural syntax for extents was moved to an appendix and to P0332, which proposes relaxing the language constraints on incomplete bounds. In the last few revisions I no longer see any reference to it.

Thanks for any pointers - forum discussions or elsewhere in github issues perhaps.

crtrott commented 5 years ago

This met intense dislike in the committee and we dropped trying to get that through for now. But mdspan will most likely only make a TS so there may still be a chance to change this.

willwray commented 5 years ago

Was the dislike principled? Were the reasons documented anywhere? The r4 & r5 appendix implied that LEWG liked it.

The preferred mechanism is compact and intuitive; LEWG straw-polled a strong preference for it, and users have expressed a strong preference too.

Difficulty of handling extents? Extent manipulation can mostly be done with one-liners in C++20, e.g. extracting extents to an array: https://godbolt.org/z/JZDVcm

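#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>
// Variable template: the extents of array type A, collected into a std::array.
template <typename A>
inline constexpr auto extents =
[]<std::size_t... i>(std::index_sequence<i...>){
    return std::array{std::extent_v<A,i>...};
}(std::make_index_sequence<std::rank_v<A>>{});

crtrott commented 5 years ago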

Yeah the library folks liked it or at least were ok with the change. The push back came in the language group. I am not sure how deep the dislike sits. It might be that it is mostly a "we need more time to get convinced that the required change doesn't have weird side effects". Everything else in mdspan can be done in the library, but this change requires a C++ language change in the type system.

brycelelbach commented 5 years ago

The relaxed array type syntax is a dead end at this point in time.

On Sat, Nov 3, 2018, 6:44 PM Christian Trott <notifications@github.com> wrote:

> Yeah the library folks liked it or at least were ok with the change. The push back came in the language group. I am not sure how deep the dislike sits. It might be that it is mostly a "we need more time to get convinced that the required change doesn't have weird side effects". Everything else in mdspan can be done in the library, but this change requires a C++ language change in the type system.

hcedwar commented 5 years ago

There was trepidation in EWG about side effects, such as introducing bugs into compilers. We did a pretty thorough analysis in P0332 to show that the feature is, from a language specification perspective, well-defined. In P0332 we even tightened up the language for incomplete types.

Some committee member would have to advocate for the current comprehensive P0332 in EWG. When I was leading the Sandia Labs contingent I was this advocate on behalf of Sandia's HPC developers who had asked for this feature for many years. However, I am no longer with Sandia and as such no longer the appropriate advocate.

willwray commented 5 years ago

Thanks,

I hadn't heard about P0332's lack of progress.

I wonder whether the alternative of using pointers * in place of incomplete bounds [] was considered:

Types with all extents static remain as: T[a], T[a][b], T[a][b][c] ...

Types with 1st extent dynamic remain: T[], T[][b], T[][b][c] ...

Types with all extents dynamic become T[], T(*)[], T(**)[] ... (or possibly T[], T[]*, T[]** ... with proposed alternative syntax below).

Mixed cases with intermediate incomplete bounds are more awkward to write:

for T[3][][2][1] use T(*[3])[2][1]
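As a rough check of what these pointer spellings mean in today's C++, the sketch below takes the mixed case apart with standard type traits. The alias names (`mixed`, `element`, `pointee`) are made up for illustration; only the declarators come from the comment above.

```cpp
#include <cstddef>
#include <type_traits>

// Today's spelling of the mixed case: an array of 3 pointers,
// each pointing to an int[2][1].
using mixed = int (*[3])[2][1];

static_assert(std::is_array_v<mixed>);
static_assert(std::rank_v<mixed> == 1);     // only the outer bound is an array extent
static_assert(std::extent_v<mixed> == 3);

using element = std::remove_extent_t<mixed>;     // int (*)[2][1]
static_assert(std::is_pointer_v<element>);

using pointee = std::remove_pointer_t<element>;  // int[2][1]
static_assert(std::is_same_v<pointee, int[2][1]>);

// The all-dynamic spellings are also valid type-ids today:
// pointer (and pointer to pointer) to an array of unknown bound.
static_assert(std::is_same_v<std::remove_pointer_t<int (*)[]>, int[]>);
static_assert(std::is_same_v<
    std::remove_pointer_t<std::remove_pointer_t<int (**)[]>>, int[]>);
```

Note that the pointer "hides" the inner bounds from `std::rank` and `std::extent`, so a library decoding such a type would have to walk through the pointer itself.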

Types with dynamic bounds can be written with a template alias when needed; perhaps mdarray_t<T,3,0,2,1> (or should that be mdarray_t<T,1,2,0,3>) with 0's decoded as *'s (or as incomplete bound [] if in 1st position).
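One possible reading of that alias, as a minimal sketch: `mdarray_t` is only the hypothetical name floated above, the `detail` helpers are invented here, and the extents are assumed to be listed outermost first (the `<T,3,0,2,1>` ordering). A 0 becomes a pointer, or an incomplete bound [] in first position.

```cpp
#include <cstddef>
#include <type_traits>

namespace detail {
// Extents after the first: 0 decodes to a pointer, N to an array bound N.
template <typename T, std::size_t... Es>
struct inner { using type = T; };

template <typename T, std::size_t E, std::size_t... Rest>
struct inner<T, E, Rest...> {
    using tail = typename inner<T, Rest...>::type;
    // (E ? E : 1) keeps the unselected array branch well-formed when E == 0.
    using as_array = tail[E ? E : 1];
    using type = std::conditional_t<E == 0, tail*, as_array>;
};

// First extent: 0 decodes to an incomplete bound [] instead of a pointer.
template <typename T, std::size_t E, std::size_t... Rest>
struct outer {
    using tail = typename inner<T, Rest...>::type;
    using as_array = tail[E ? E : 1];
    using unbounded = tail[];
    using type = std::conditional_t<E == 0, unbounded, as_array>;
};
}  // namespace detail

// Hypothetical alias: extents listed outermost first, 0 meaning "dynamic".
template <typename T, std::size_t... Es>
using mdarray_t = typename detail::outer<T, Es...>::type;

static_assert(std::is_same_v<mdarray_t<int, 3, 0, 2, 1>, int (*[3])[2][1]>);
static_assert(std::is_same_v<mdarray_t<int, 0, 2>, int[][2]>);
```

Under this reading `mdarray_t<int, 2, 3>` is just `int[2][3]`, so fully static types are unchanged.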

Or, nicer, we propose int[3]*[2][1] as equivalent syntax for int(*[3])[2][1]

Moving the * pointer declaration to the RHS works nicely as 'wildcard' syntax!

Proposing this as an alternative declaration syntax may be easier than P0332 (there's no change to the type system itself, just an aliased declaration syntax).

I believe that Bjarne spent time considering alternative declaration forms so he'd be able to judge quickly if there are parsing issues with this syntax. If anyone's at the ISO meet then perhaps you could float the idea. There's precedent with cv qualifiers that can go on the left or the right...

willwray commented 5 years ago

Found it! With a link. Bjarne discusses this 'postfix pointer' declaration syntax in D&E:

"Together with Doug McIlroy, Andrew Koenig, Jonathan Shopiro, and others I considered introducing postfix “pointer to” operator -> as an alternative to the prefix *:"

Design & Evolution section 2.8.1 The C Declaration Syntax (This section is currently available as sample content in the Google Play edition)

int v[10]->; // array of pointers to ints
int p->[10]; // pointer to array of ints

The reason for using -> rather than * is not explained. The types themselves would be int[10]-> and int->[10]. There is ambiguity with * in the 'first position' so int->[10] is not int*[10]. However, in array context, it is better written as an incomplete bound, int[][10]. Beyond the first position is there ambiguity? If there is I don't see it. If not then we can use * in postfix except in first position where [] is needed.

int v[10]*; // array of pointers to ints
int p[][10]; // pointer to array of 10 ints

Bjarne's conclusion, quoted from the same section: "My eventual rationale for leaving things as they were was that any new syntax would (temporarily at least) add complexity to a known mess. Also, even though the old style is a boon to teachers of trivia and to people who want to ridicule C, it is not a significant problem for C programmers. In this case, I’m not sure if I did the right thing, though."

willwray commented 5 years ago

FYI I started a thread on the ISO C++ Standard - Future Proposals mailing list about this postfix pointer notation, titled "East pointers-to-array". (Already regretting that title and wishing I'd gone with 'Postfix pointer'.)

willwray commented 5 years ago

There are lots of errors in my Future Proposals OP and thread, so I'm closing it. There is also an error above:

int p[][10]; // pointer to array of 10 ints
//  ^^^^^^^  WRONG: incomplete array type, not a pointer
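A small sketch of why that line is wrong, using only standard type traits. The names `q`, `f` and `ptr` are placeholders for illustration. `int[][10]` names an array of unknown bound, which is an incomplete type; only a function parameter written that way is adjusted to a pointer.

```cpp
#include <type_traits>

// int[][10] is an array of unknown bound of int[10], not a pointer.
static_assert(std::is_array_v<int[][10]>);
static_assert(!std::is_pointer_v<int[][10]>);
static_assert(std::extent_v<int[][10]> == 0);      // unknown bound reports 0
static_assert(std::extent_v<int[][10], 1> == 10);

// int p[][10];       // ill-formed as a definition: the size is unknown
extern int q[][10];   // fine as a declaration of an array defined elsewhere

// Only in a parameter list is the declarator adjusted to a pointer.
void f(int p[][10]);
static_assert(std::is_same_v<decltype(f), void(int (*)[10])>);

// The actual "pointer to array of 10 ints" spelling.
int (*ptr)[10] = nullptr;
static_assert(std::is_same_v<decltype(ptr), int (*)[10]>);
```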

If postfix looks good with more research then I'll open another thread on "Reviving postfix pointer notation".

willwray commented 5 years ago

Reluctantly coming to the conclusion that built-in C-array types are not suitable in general for specifying the extents of multi-dim array(-reference) classes.

  1. An implementation barrier; GCC and Clang currently error out for simply naming large array types (https://godbolt.org/z/H-_GqM):

    constexpr int N = 0x10000;    // 2^16 = 65536
    using a64 = int[N][N][N][N];  // GCC/Clang error: type too large

    Clang errors at ~256 petabytes, GCC at ~1 exabyte. MSVC is able to process large types.

  2. A C-array type advertises more than just base type and extents; the type implies a hierarchical memory layout of recursive subarray subobjects while mdspan expects a contiguous layout of value_type objects.
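Both points can be illustrated with a few compile-time checks. This is a sketch under stated assumptions: a 4-byte `int` and a `std::size_t` of at most 64 bits; the names `log2_extent`, `array_rank`, `log2_bytes` and `grid` are invented for the example.

```cpp
#include <cstddef>
#include <limits>
#include <type_traits>
#include <utility>

// Point 1: size arithmetic for int[N][N][N][N] with N = 0x10000 = 2^16.
constexpr int log2_extent = 16;
constexpr int array_rank = 4;
constexpr int log2_sizeof_int = 2;  // assumption: sizeof(int) == 4
constexpr int log2_bytes = array_rank * log2_extent + log2_sizeof_int;

static_assert(log2_bytes == 66);    // the type would need 2^66 bytes
// More bits than std::size_t has (assuming size_t is at most 64 bits),
// so sizeof could not even represent the size of such a type.
static_assert(log2_bytes > std::numeric_limits<std::size_t>::digits);

// Point 2: a C-array type is an array of subarray subobjects.
using grid = int[2][3];
static_assert(std::is_same_v<std::remove_extent_t<grid>, int[3]>);
// Indexing once yields an lvalue of the subarray type, not an element.
static_assert(std::is_same_v<decltype(std::declval<grid&>()[0]), int (&)[3]>);
static_assert(sizeof(grid) == 2 * sizeof(int[3]));
```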

A C-array type makes sense for a reference-wrapper that actually wraps a C-array of that type (a compatibility wrapper to extend std features to multidim C-arrays, with CTAD deduction). One would expect such a class to support C-like hierarchical indexing with operator[] etc. This is not an mdspan. It is something more constrained and more specialized.

Of course, an mdspan specialization could be defined to specify extents via built-in array type. This could form the easy-to-use / non-expert interface. I'm no longer sure it's worth advocating array extents.
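To make the last idea concrete, here is a minimal sketch of decoding a built-in array type into an element type plus a list of extents. Everything named here (`dyn`, `extent_list`, `decode`) is hypothetical and merely stands in for whatever extents type an mdspan implementation uses. Without the P0332 language change only the leading bound may be left incomplete, so at most the first extent can be dynamic.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical marker for a run-time (dynamic) extent.
inline constexpr std::size_t dyn = static_cast<std::size_t>(-1);

// Hypothetical stand-in for a library extents type.
template <std::size_t... Es>
struct extent_list {};

// Peel array bounds off A from the outside in, appending each to Es.
template <typename A, std::size_t... Es>
struct decode {
    using element_type = A;
    using extents_type = extent_list<Es...>;
};

template <typename A, std::size_t N, std::size_t... Es>
struct decode<A[N], Es...> : decode<A, Es..., N> {};

// An incomplete leading bound maps to the dynamic marker.
template <typename A, std::size_t... Es>
struct decode<A[], Es...> : decode<A, Es..., dyn> {};

static_assert(std::is_same_v<decode<int[][3][4]>::element_type, int>);
static_assert(std::is_same_v<decode<int[][3][4]>::extents_type,
                             extent_list<dyn, 3, 4>>);
```

A wrapper could then accept `T[][3][4]` as a convenience spelling and forward the decoded element type and extents to the general interface.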