kokkos / mdarray

Other
10 stars 9 forks source link

P1684: Consider adding a method to move the container out of the mdarray #22

Open mhoemmen opened 2 years ago

mhoemmen commented 2 years ago

This comes from 1684R2 LEWG review on 2022/04/19.

One reviewer suggested adding a method to move the container out of an mdarray. This would permit use cases like reclaiming and reusing an allocation for a sequence of mdarray.

Another reviewer pointed out that this might leave the container in an invalid state, since the extents and mapping would no longer correctly describe the amount of available storage. This would violate the policy that a moved-from object be left in a valid but possibly unspecified state.

@AnabelSMRuggiero has some ideas for an interface that would "sever" the mdarray from its storage. It would return both the container, and an mdspan viewing the container's storage. The result would ensure that the mdarray was in a safely usable moved-from state. For example, it could move the layout to the resulting mdspan, and assign zero to all dynamic extents. The only issue would be if the mdarray has all static extents, but the container has a move behavior more like vector than array. This would make it impossible for the "severed" mdarray to have a state usable for anything other than destruction or copy assignment. That's probably OK, but we would need to think about whether we would want mdarray ever to be in that state.

crtrott commented 2 years ago

A "move_container" function would not in general be possible, for example an all static extent layout_left can not be put into a state where its required_span_size is zero. So I don't think we should do this.

AnabelSMRuggiero commented 2 years ago

I think it's worth pointing out that you'd still have the same issue with move construction. I do not have a good solution to that issue beyond extending compile time extent types with a null-like value, which I'm not actually suggesting. The only way to never have to worry about having a container size of zero is to make it so move construction is equivalent to copy construction, but I think that is an issue severable (hah) from the sever idea I shared with Mark.

The issue can be dealt with separately because I'm suggesting composing move construction (inside of sever) with some encapsulation violation on the object constructed inside of the function, and then have the function return the components as a pair of prvalues. If move construction of an mdarray is equivalent to a copy construction to begin with (for example, an mdarray on top of an array), then composing any other operations on top of the move would be with internals disjoint from the mdarray passed in.

Additionally, the default contract for moved from objects is to not mess with them, other than setting it to a well defined state (eg. vector::clear). Therefore, I think it is fair to expect users to treat mdarray in the same way in this regard.

The last point I want to make: the idea I suggested to Mark is meant to be more all-in than std::move; regardless of if the rvalue is used for move construction, the original mdarray would be in a "moved-from" state. In other words, it's not just access to the held container that could lead to interfering with the invariants of mdarray, it's severing the relationship between the multidimensionality of the underlying range, and ownership of the range.

mhoemmen commented 2 years ago

@AnabelSMRuggiero makes a good point that the move constructor mdarray(mdarray&&), as currently specified, has the same issue. It would leave the moved-from object with a moved-from container, but an extents that continues to describe its old storage. In fact, the current spec would leave the the exposition-only array in the moved-from extents with all the run-time extents unchanged (not set to zero). Thus, mdarray does currently have a moved-from state that would only be valid for destruction or reassignment.

Being able to move the container out of the mdarray could be useful, but it was just a suggestion, not a poll result.

nliber commented 2 years ago

A state where one can only assign-to/destroy is not the same as a valid but unspecified state. In a valid but unspecified state, one can call any function that doesn't have a precondition, as well as calling any function that has a precondition as long as the precondition has been satisfied.

For instance, can you call operator[] on a moved-from mdarray? Or even the copy or move constructor or assignment operator if the source is in the moved-from state?

We can, of course, explicitly add preconditions everywhere...

The question is: how important are the move operations?

AnabelSMRuggiero commented 2 years ago

The question is: how important are the move operations?

The motivation for the poll to have the default container be vector even in the case of all static-extents was to avoid move being equivalent to a deep copy (in the case of array) by default. Arguably, LWEG could be convinced that it doesn't make sense to allow non-copying moves with all static extents, but my point is that move(like) operations was a significant consideration during the past two meetings on mdarray.

mhoemmen commented 2 years ago

@nliber wrote:

A state where one can only assign-to/destroy is not the same as a valid but unspecified state.

Excellent point -- thank you : - )

For instance, can you call operator[] on a moved-from mdarray? Or even the copy or move constructor or assignment operator if the source is in the moved-from state?

We definitely need to say what operations are allowed in the moved-from state.

Are there any container requirements that specify what operations the container can support after move? I couldn't find any. If there are none, then we shouldn't allow users to do anything with a moved-from mdarray, other than destruction. (For example, users might implement a contiguous container that doesn't reset its pointer or size after move. That would make assignment after move invalid.)

nliber commented 2 years ago

Do we agree with: The moved-from state has to maintain the class invariants (otherwise they are not invariants). If we say the only ops allowed are assign-to and destroy, then the invariants are weak and we have to add preconditions to all the other operations.

There is the related issue on what state mdarray is left in when an operation on the internal container fails (throws an exception). For instance: the vector move assignment operator can throw (PMR allocators sigh).

More generally, if the internal container is put into valid but unspecified state (moved-from, mutating operation threw an exception, etc.), I don't believe that is sufficient to assert that mdarray is also in a valid but unspecified state.

(Right now flat_map is also struggling with these issues, as it wraps two containers. Their solution for failing ops is to allow the implementation to call clear() [a function with no preconditions that cannot fail] on its internal containers, which essentially puts flat_map into a well-known state. Can we possibly do something similar? They have no solution to the moved-from problem yet, as I just brought it up.)

nliber commented 2 years ago

Valid but unspecified means you can call any function w/o a precondition. In practice, for containers the ones that are usually called are destruction, copy/move construction, assign-to, assign-from and clear(). (In theory, there are a lot more, such as empty(), size(), begin(), end(), etc., but they aren't all that useful. You can also do things like if (!c.empty()) return c.front(); because you checked that the precondition for calling front() held.)

variant wrestled with this, and is littered with preconditions and behavior changes when it is in the doesn't hold a value / valueless by exception state.

nliber commented 2 years ago

Another question about the moved-from state: if it is different than all the other valid states, will it be possible to test if it is in that state?

AnabelSMRuggiero commented 2 years ago

I'm actually surprised vector's move assignment doesn't offer strong exception safety when allocators aren't equal; I had assumed it did at least, but yeah there's no guarantee that I can find for it.

As for mdarray, strong exception safety for move assignment can be done as follows: If the container's move assignment is noexcept, simply move assign the other vector. If not, check if the allocators are equal, if so, move assignment vector should be fine (I can't find a hard guarantee for this in the standard. I know that MSVC STL and libstdc++ dynamically dispatches to the noexcept path for move assignment of vector; not doing that would be silly for an implementation to do). The saddest path: copy construct a temporary container passing in mdarray's allocator (eg vector temp(other.container, this->container.get_allocator()) ), and then move assign from the temporary.

The nuclear option if we can't trust move assignment from a container with an equal allocator is to construct the temporary in the same way, and then destroy the mdarray's container subobject, followed by move construction with placement new over the old container's storage. This janky fall back should be fine unless I butchered my reading of 8 and 8.5 in basic.life, or if move construction of the container is not noexcept.

nliber commented 2 years ago

But it isn’t just vector, even if that gives us some better guarantees. We are specifying something that can adapt any contiguous sequence container.

To me, an exception being thrown during move is a very rare case, so other than making sure that our invariants are maintained, we shouldn’t be optimizing that at the expense of pessimizing the non-throwing case. Put another way, I’m only concerned about maintaining the basic exception safety guarantee, and not concerned with further reducing the likelihood of move throwing if it pessimizes the non-throwing case.

So, can we do that? If std::array is the container and moving an element throws, the mdarray invariants are not affected. The basic exception safety guarantee holds.

If vector throws during a move, can we just .clear() it (like flat_map is going to do)? Is that valid if all the extents are static? I ask because mdarray has no default constructor in that case, so that becomes a different state than is otherwise possible.

.clear() also works for all other sequence containers, as I believe having a .clear() is a requirement.

I don't think that there are any other repercussions in doing the above, are there?