Open dhollman opened 6 years ago
There are some issues that still need to be addressed here, but it seems reasonable to leave this as a separate paper. There are a lot of rules you might want to discuss for AccessProperties
in general. For instance, can you unconditionally convert between span<int[], Property1>
and span<int[], Property2>
? What about span<int[]>
and span<int[], Property>
and back? I don't think we want to say "only one property", but this implies that the presence or absence of all (exponentially many) combinations of properties needs to be addressed. Moreover, every time someone introduces a new property, there is a precedent that implies that you'll have to implement (or prohibit) exponentially many types (though templates help with this some). We're dealing with this right now in the Executors paper, and it's not an easy problem. I think this discussion alone justifies a "generic" AccessProperties
paper that sets out some ground rules.
Some of the major insights from looking at the analogous discussion with executors:
span<double, atomic_access, bound_checking>
. The lesson from executors so far has been that it is a huge mess to allow some combinations and prohibit others. Properties should be orthogonal. If they aren't naturally so, your interface should essentially require them to "fake it" (more on this below)span
s with a combination of properties should be expressible in terms of attributes of the span
s with each of the properties individually. For instance, the noexcept
clause of span<T, Props...>
should be expressible as noexcept(noexcept(span<T,Props>{}[index_type{}]) && ...)
(if fold expressions worked on template parameters like this, which — side note — they should but don't). Places where this sort of expression was possible in the executors proposal have been way easier to deal with than places where it wasn't. I might even go so far as to say that parts of the general class template that can't be expressed this way shouldn't be allowed to be dependent on the AccessProperties...
. We should at least explore this option and weigh the costs of potentially overconstraining some properties with the benefits of simplicity.
atomic_access
and something like bounds_checking
. The single specialization versions of both would want to override operator[]
, so which one do you call? Does this require every property to know about every other property, and for you to implement all of the combinatorial possibilities? I suggest that we consider a different route on this. We should explore the implications of the following design principle: for each method or free function overload that a property cares about, it should be implementable as some combination of a pre-method hook and a post-method hook. For instance, span<int[42], atomic_access, bounds_checking>::operator[]
could be implemented as access_property_traits<atomic_access>::pre_subscript_hook(...)
, then access_property_traits<bounds_checking>::pre_subscript_hook(...)
, then span<int[42]>::operator[]
, then access_property_traits<atomic_access>::post_subscript_hook(...)
, then access_property_traits<bounds_checking>::post_subscript_hook(...)
, some of which could be no-ops. (This could also be implemented using the ADL-like traits idiom that seems to be favored now over the monolithic traits-class approach). Importantly, if this sort of thing gets people most of what they need, I don't think we should stall the proposal waiting for all of what we need.span
s with any set of properties is always convertible to another span
with different properties, as long as their property-less versions are interconvertible. Again, this may be too constraining, but if it gets everyone most of what they need, we should probably run with it.span<int[73],UProp,VProp,WProp>
to a const
reference to span<int[73],VProp,WProp,UProp>
causes the invocation of the copy constructor. We mostly can get away with this because we're working with a value type, but it's something to keep in mind.More on this later, but the amount of text here seems to me to justify the existence of a "generic" properties paper to explore the customization point design itself.
We do still need P0856 and P0860 to be presented along with this, though, keeping in mind the following feedback from the end of the notes at the ABQ meeting:
we have precedent for poor design of customization points when they are not used (allocators).
Oh, one more "executors lesson" I forgot: it pays to be very careful and deliberate early on about the difference between properties and their values. For instance, you shouldn't have a property called gpu_memory
and one called cpu_memory
because these are mutually exclusive (well, UVM aside...). Instead, there should be one property called something like memory_space
with different values. Then you don't have to worry about all of this orthogonality stuff with respect to gpu_memory
and cpu_memory
; you just specify that a property can't have multiple values. (Note that "values" can still be types; they just have a specific internal mutual exclusion that properties don't have with other properties). This is probably obvious to those of us who have worked on Kokkos, but it still seems to be a point of confusion from time to time (even for me) on the executors proposal.
mdspan is not iterable (?!?)
I'm OK with that. Iteration is a 1-D thing. Multidimensional iteration usually wants to know the indices.
Lots of discussion about constructors taking shared_ptr/unique_ptr (presumably the array partial specializations)
I've been prototyping stuff with {shared, weak, unique}_ptr
lately. In particular, I wanted to model a memory pool: single owner, with multiple viewers. It's not really right to use unique_ptr
for the owner, because of the non-unique viewers. I ended up using shared_ptr
for the owner and weak_ptr
for the viewers, so that viewers would have an option to test whether the owner still exists.
This is relevant because one important use case for an mdspan would be viewing or slicing some independently managed memory allocation. If we want constructors that take shared_ptr
or unique_ptr
, we beg the question about the ownership relationship between the input *_ptr
and the mdspan. (Contrast that with the weak_ptr
constructor that takes a shared_ptr
. There, the relationship is clear.) If the mdspan just takes a raw pointer, we don't have to worry about defining that relationship.
for each method or free function overload that a property cares about, it should be implementable as some combination of a pre-method hook and a post-method hook.
Hm, how do we define the order in which the hooks get applied? What if they don't commute? I guess they are supposed to be orthogonal; does that really just mean that properties always need to behave as if these hooks commute?
(Also, wouldn't you want to apply the post hooks in reverse order of the pre hooks?)
for each method or free function overload that a property cares about, it should be implementable as some combination of a pre-method hook and a post-method hook.
Wouldn't that mean that atomic_access
has to let non-atomic operator[]
do its thing? I don't see how this would work.
Oh wait, you could use C++17 constexpr if
for this, right? Just protect the inner non-atomic operator[] with the constexpr if thing based on some combination of the properties.
That constexpr if
thing would work like this:
// class ... Props
if constexpr (... && invoke_default_implementation<Props>::value) {
span<int[42]>::operator[] (whatever_arguments_go_here);
}
Hm, how do we define the order in which the hooks get applied? What if they don't commute? I guess they are supposed to be orthogonal; does that really just mean that properties always need to behave as if these hooks commute?
I think the easy answer is that you would specify that the hooks are invoked in the order that the properties are given, and if they don't commute then that's the property implementer's fault (but it at least means that the user can figure out what's going on). It would mean that the properties are order dependent, but if the property implementers are being orthogonal enough this order should almost never matter (a lot of these operations should be compiler-reorderable anyway). Another option is to say that the invocation order is implementation-defined, meaning that any code that relies on an invocation order for these hooks would be disallowed (or something like that).
Wouldn't that mean that atomic_access has to let non-atomic operator[] do its thing? I don't see how this would work
At least as I understand it, the atomic_access
implementation we have uses a hash table of sharded locks that is dependent on the address of the mdspan
itself (and maybe on the offset?). It's not actually using std::atomic
objects. This sharded lock version could easily be implemented in terms of hooks. The constexpr if
version isn't ideal because it lacks the extensibility of the hook version.
The implementation of restrict_access
in terms of hooks is a lot trickier, though. One option is to expose the type of the underlying pointer via a customization point. For instance (using the old, monolithic attributes class paradigm):
template <>
struct mdspan_property_attributes<restrict_access> {
template <typename MDSpanType>
struct pointer_type {
using type = typename MDSpanType::pointer_type __restrict;
};
// ...
};
It's a bit difficult to see how this composes with other properties, but you could specify that it works as a chain just like the other hooks, for instance.
Example of the hook I suggested above for restrict_access
(seems to work with gcc, but nothing else)
@dhollman wrote:
I think the easy answer is that you would specify that the hooks are invoked in the order that the properties are given, and if they don't commute then that's the property implementer's fault (but it at least means that the user can figure out what's going on).
I like this better.
(seems to work with gcc, but nothing else)
Huh, works for recent enough Clang and Intel too, for me at least
Huh, works for recent enough Clang and Intel too, for me at least
By "works", I don't mean that it just compiles. If you look at the instruction output, gcc "correctly" ends the two restrict overloads with movl $11, %eax
(that is, return 11 as a constant), while clang and intel both do the actual addition in every case except for with the raw pointers (which they both get "correct" as well). I say "correctly" in quotes because they're allowed to completely ignore restrict, but the "performance correct" thing to do here is to return a constant.
@dhollman Ah, I see, thanks!
At least as I understand it, the atomic_access implementation we have uses a hash table of sharded locks that is dependent on the address of the mdspan itself (and maybe on the offset?).
Looking at the Kokkos code, I think I might be wrong about this — I must have been thinking of the atomics for arbitrary types thing from elsewhere. @crtrott and @hcedwar can correct me on this, but it looks like Kokkos implements the Kokkos::Atomic
memory trait for, e.g., operator()
and operator[]
by modifying the return values of these methods to return something that essentially acts like (an extended version of) std::atomic
. I'm still trying to figure out how this fits in to the property ontology. I have serious concerns about the usefulness of these properties in generic code if the the return type of operator()
can be changed to something that doesn't support all of the same operations as the return type in the absence of that property, which would be the case if std::atomic
were used. On the other hand, if we added something that wraps std::atomic
and adds operator overloads, etc., I highly doubt LEWG and EWG would be okay with this kind of thing (i.e., if they thought we should have those operators on std::atomic
, they would have put them there). @dsunder how are you addressing this in P0860?
span<T[],atomic_access>::operator[]
returns atomic_ref<T>
That is all that is required.
The intent is that construction / setup of the span more efficient than construction of each individual atomic_ref.
I'm going through the comments on
span
from Albuquerque to both understand wherespan
stands relative to incorporation into the IS (and as it relates to P0546, P0856, and P0860) and to make sure we haven't made the same mistakes inmdspan
. I'll enumerate summaries of the issues here (some with bulleted responses about how we've treated that inmdspan
)Comments from a thread titled "Questions raised while trying to implement " on the lib-ext reflector:
mdspan
doesn't define comparison operators, not evenoperator==
(could this be a problem/inconsistency for some people? Probably it's okay to just tell people to convert to span first, especially with the potential for deduction guides on the conversion operators)span
s of unequal extents, sincestd::equal
andstd::lexigraphical_compare
do thisspan<int>
andspan<long>
should be comparableconst
reference rather than by value, even though the paper itself recommends passingspan
by valuesubspan
to take anmdspan
by value for the same reasonspan
currently has (or had?) a constructor that takes aContiguousContainer
(i.e., aContainer
whose iterators meet the requirements of contiguous iterators [27.2.1]). This isn't implementable via SFINAE (or any other compile-time technique) since contiguous iterator is a runtime property.mdspan
doesn't have this constructor. It doesn't look like we're using the term "contiguous" in any contexts that would imply the need for SFINAE, but we should double-check thisnumeric_limits<ptrdiff_t>::max()
(e.g., as could be the case ifsizeof(value_type)
is much larger than 1), thenas_bytes()
andas_writable_bytes()
could introduce undefined behavior.size_t
here instead. (Also, seems like a good motivation for theT[][][]
syntax, so that we don't have to use-1
for a magic extent value)ptrdiff_t
, so maybe we shouldn't change anything.nullptr_t
constructor, since it's the same as the default constructornullptr_t
is a reasonable trait to have. Either way, we omitted discussion/description of this constructor in the paper, even though it's in the wording.Other Pre-meeting comments
span<int, 42>.subspan<0>()
shouldn't return aspan<dynamic_extent>
dynamic_extent
magic number. It doesn't look like we have that problem because oursubspan
extents are always given as normal function arguments rather than template arguments (though this seems like a bit of a shortcoming since it means we can't have a subspan with static extents other than the entirety of the parent's extent for a given rank)constexpr
mdspan
is not iterable (?!?) so this doesn't apply.length
andsize
Small Group discussion comments
shared_ptr
/unique_ptr
(presumably the array partial specializations)noexcept
(presumably relating to comparison operators?)std::get
, since it is likestd::array
(though then it would have to participate in structured binding?)mdspan
. If you really need this, just convert to aspan
Full group discussion
array_view
, now that the termview
does not imply immutable (presumably with some change tostring_view
?)unique_ptr
andshared_ptr
constructors: 6-8-0-0-0nullptr_t
constructor: 4-5-3-0-0span(pointer, pointer)
constructor: 0-0-2-10-0Other notes (not from
span
discussion, but encountered while working on it):One of the requirements is that "
V::static_extent(r) == U::static_extent(r)
orV::static_extent(r) == std::dynamic_extent
for0 < r < V::rank()
". This should probably also include the possibility thatU::static_extent(r) == std::dynamic_extent
.mdspan
defines the constraints here in terms ofstd::is_assignable<U::pointer, V::pointer>
, butspan
defines it in terms of accessing aU::value_type
through aV::pointer
meeting the rules for well-defined object access defined in [basic.lval]/8. Are these definitions exactly equivalent?