kokkos / array_ref

Polymorphic multidimensional array view
36 stars 9 forks source link

Summary of ABQ comments on `span` #46

Open dhollman opened 6 years ago

dhollman commented 6 years ago

I'm going through the comments on span from Albuquerque to both understand where span stands relative to incorporation into the IS (and as it relates to P0546, P0856, and P0860) and to make sure we haven't made the same mistakes in mdspan. I'll enumerate summaries of the issues here (some with bulleted responses about how we've treated that in mdspan)

Comments from a thread titled "Questions raised while trying to implement " on the lib-ext reflector:

Other Pre-meeting comments

Small Group discussion comments

Full group discussion

Other notes (not from span discussion, but encountered while working on it):

template< typename UType , typename ... UProperties >
constexpr mdspan( mdspan< UType , UProperties ... > const & ) noexcept;
template< typename UType , typename ... UProperties >
mdspan & operator = ( mdspan< UType , UProperties ... > const & ) noexcept;

One of the requirements is that "V::static_extent(r) == U::static_extent(r) or V::static_extent(r) == std::dynamic_extent for 0 < r < V::rank()". This should probably also include the possibility that U::static_extent(r) == std::dynamic_extent.

dhollman commented 6 years ago

Next question: where does this leave us on P0546

There are some issues that still need to be addressed here, but it seems reasonable to leave this as a separate paper. There are a lot of rules you might want to discuss for AccessProperties in general. For instance, can you unconditionally convert between span<int[], Property1> and span<int[], Property2>? What about span<int[]> and span<int[], Property> and back? I don't think we want to say "only one property", but this implies that the presence or absence of all (exponentially many) combinations of properties needs to be addressed. Moreover, every time someone introduces a new property, there is a precedent that implies that you'll have to implement (or prohibit) exponentially many types (though templates help with this some). We're dealing with this right now in the Executors paper, and it's not an easy problem. I think this discussion alone justifies a "generic" AccessProperties paper that sets out some ground rules.

Some of the major insights from looking at the analogous discussion with executors:

More on this later, but the amount of text here seems to me to justify the existence of a "generic" properties paper to explore the customization point design itself.

We do still need P0856 and P0860 to be presented along with this, though, keeping in mind the following feedback from the end of the notes at the ABQ meeting:

we have precedent for poor design of customization points when they are not used (allocators).

dhollman commented 6 years ago

Oh, one more "executors lesson" I forgot: it pays to be very careful and deliberate early on about the difference between properties and their values. For instance, you shouldn't have a property called gpu_memory and one called cpu_memory because these are mutually exclusive (well, UVM aside...). Instead, there should be one property called something like memory_space with different values. Then you don't have to worry about all of this orthogonality stuff with respect to gpu_memory and cpu_memory; you just specify that a property can't have multiple values. (Note that "values" can still be types; they just have a specific internal mutual exclusion that properties don't have with other properties). This is probably obvious to those of us who have worked on Kokkos, but it still seems to be a point of confusion from time to time (even for me) on the executors proposal.

mhoemmen commented 6 years ago

mdspan is not iterable (?!?)

I'm OK with that. Iteration is a 1-D thing. Multidimensional iteration usually wants to know the indices.

Lots of discussion about constructors taking shared_ptr/unique_ptr (presumably the array partial specializations)

I've been prototyping stuff with {shared, weak, unique}_ptr lately. In particular, I wanted to model a memory pool: single owner, with multiple viewers. It's not really right to use unique_ptr for the owner, because of the non-unique viewers. I ended up using shared_ptr for the owner and weak_ptr for the viewers, so that viewers would have an option to test whether the owner still exists.

This is relevant because one important use case for an mdspan would be viewing or slicing some independently managed memory allocation. If we want constructors that take shared_ptr or unique_ptr, we beg the question about the ownership relationship between the input *_ptr and the mdspan. (Contrast that with the weak_ptr constructor that takes a shared_ptr. There, the relationship is clear.) If the mdspan just takes a raw pointer, we don't have to worry about defining that relationship.

mhoemmen commented 6 years ago

for each method or free function overload that a property cares about, it should be implementable as some combination of a pre-method hook and a post-method hook.

Hm, how do we define the order in which the hooks get applied? What if they don't commute? I guess they are supposed to be orthogonal; does that really just mean that properties always need to behave as if these hooks commute?

(Also, wouldn't you want to apply the post hooks in reverse order of the pre hooks?)

mhoemmen commented 6 years ago

for each method or free function overload that a property cares about, it should be implementable as some combination of a pre-method hook and a post-method hook.

Wouldn't that mean that atomic_access has to let non-atomic operator[] do its thing? I don't see how this would work.

mhoemmen commented 6 years ago

Oh wait, you could use C++17 constexpr if for this, right? Just protect the inner non-atomic operator[] with the constexpr if thing based on some combination of the properties.

mhoemmen commented 6 years ago

That constexpr if thing would work like this:

// class ... Props
if constexpr (... && invoke_default_implementation<Props>::value) {
  span<int[42]>::operator[] (whatever_arguments_go_here);
}
dhollman commented 6 years ago

Hm, how do we define the order in which the hooks get applied? What if they don't commute? I guess they are supposed to be orthogonal; does that really just mean that properties always need to behave as if these hooks commute?

I think the easy answer is that you would specify that the hooks are invoked in the order that the properties are given, and if they don't commute then that's the property implementer's fault (but it at least means that the user can figure out what's going on). It would mean that the properties are order dependent, but if the property implementers are being orthogonal enough this order should almost never matter (a lot of these operations should be compiler-reorderable anyway). Another option is to say that the invocation order is implementation-defined, meaning that any code that relies on an invocation order for these hooks would be disallowed (or something like that).

dhollman commented 6 years ago

Wouldn't that mean that atomic_access has to let non-atomic operator[] do its thing? I don't see how this would work

At least as I understand it, the atomic_access implementation we have uses a hash table of sharded locks that is dependent on the address of the mdspan itself (and maybe on the offset?). It's not actually using std::atomic objects. This sharded lock version could easily be implemented in terms of hooks. The constexpr if version isn't ideal because it lacks the extensibility of the hook version.

dhollman commented 6 years ago

The implementation of restrict_access in terms of hooks is a lot trickier, though. One option is to expose the type of the underlying pointer via a customization point. For instance (using the old, monolithic attributes class paradigm):

template <>
struct mdspan_property_attributes<restrict_access> {
  template <typename MDSpanType>
  struct pointer_type {
    using type = typename MDSpanType::pointer_type __restrict;
  };
  // ...
};

It's a bit difficult to see how this composes with other properties, but you could specify that it works as a chain just like the other hooks, for instance.

dhollman commented 6 years ago

Example of the hook I suggested above for restrict_access (seems to work with gcc, but nothing else)

https://godbolt.org/g/tHNBtj

mhoemmen commented 6 years ago

@dhollman wrote:

I think the easy answer is that you would specify that the hooks are invoked in the order that the properties are given, and if they don't commute then that's the property implementer's fault (but it at least means that the user can figure out what's going on).

I like this better.

mhoemmen commented 6 years ago

(seems to work with gcc, but nothing else)

Huh, works for recent enough Clang and Intel too, for me at least

dhollman commented 6 years ago

Huh, works for recent enough Clang and Intel too, for me at least

By "works", I don't mean that it just compiles. If you look at the instruction output, gcc "correctly" ends the two restrict overloads with movl $11, %eax (that is, return 11 as a constant), while clang and intel both do the actual addition in every case except for with the raw pointers (which they both get "correct" as well). I say "correctly" in quotes because they're allowed to completely ignore restrict, but the "performance correct" thing to do here is to return a constant.

mhoemmen commented 6 years ago

@dhollman Ah, I see, thanks!

dhollman commented 6 years ago

At least as I understand it, the atomic_access implementation we have uses a hash table of sharded locks that is dependent on the address of the mdspan itself (and maybe on the offset?).

Looking at the Kokkos code, I think I might be wrong about this — I must have been thinking of the atomics for arbitrary types thing from elsewhere. @crtrott and @hcedwar can correct me on this, but it looks like Kokkos implements the Kokkos::Atomic memory trait for, e.g., operator() and operator[] by modifying the return values of these methods to return something that essentially acts like (an extended version of) std::atomic. I'm still trying to figure out how this fits in to the property ontology. I have serious concerns about the usefulness of these properties in generic code if the the return type of operator() can be changed to something that doesn't support all of the same operations as the return type in the absence of that property, which would be the case if std::atomic were used. On the other hand, if we added something that wraps std::atomic and adds operator overloads, etc., I highly doubt LEWG and EWG would be okay with this kind of thing (i.e., if they thought we should have those operators on std::atomic, they would have put them there). @dsunder how are you addressing this in P0860?

hcedwar commented 6 years ago

span<T[],atomic_access>::operator[] returns atomic_ref<T> That is all that is required. The intent is that construction / setup of the span more efficient than construction of each individual atomic_ref.