hsutter / cppfront

A personal experimental C++ Syntax 2 -> Syntax 1 compiler
Other
5.24k stars 224 forks source link

[SUGGESTION] Only apply move from last-use behaviour to `@basic_value` and well-known STL types (or appropriate types detected at compile-time) #1030

Closed bluetarpmedia closed 3 months ago

bluetarpmedia commented 3 months ago

Background:

The current move from last-use behaviour causes problems with RAII types, and types that have non-const (inout this) member functions with side effects, where the programmer does not want to inspect the object's value after calling the non-const member function.

For example, maybe that function notifies an observer, or calls an OS/graphics API, or a different thread will read the object's new value.

See: #999 and #1002

Suggestion:

Move from last-use seems perfectly suited to value types, as opposed to RAII, guards, managers, and non-value types that model the problem domain. So the move from last-use behaviour could be applied only to Cpp2 types which implement the @basic_value metafunction, as well as well-known STL types (or, if possible to be generic, types that meet some suitable compile-time criteria).

Alternatives considered:

The above suggestion is effectively an "opt-in" approach for the move from last-use feature.

An alternative would be to make move from last-use "opt-out", using something like:

However, since one of Cpp2's stated goals is to have "full backward compatibility" with ISO C++, and Cppfront enables a mixed-mode for side-by-side easy migration, I'd expect to be able to use my Cpp1 types seamlessly in Cpp2 code without changing their semantics. E.g. a RAII type from C++ should work in Cpp2 like it already does, just like std::string works in Cpp2 like it already does in Cpp1. That's why I think opting-in to the Cpp2 move from last-use behaviour is the correct choice.

DyXel commented 3 months ago

I am on the opposite side of the coin: I would prefer the automatic moves to be opt-out. I really like Cpp2's embracement of value semantics, and to be honest it works most of the time so far (perhaps biased view on my side?).

Let me ask directly: What do you (and others) think about using &* as an opt-out mechanism as demonstrated in my comment https://github.com/hsutter/cppfront/issues/1002#issuecomment-2003294955, which seems to be working mostly as intended already. If you embrace value semantics like Cpp2 is trying, then its clear why RAII types fall under a special category (as outlined by gregmarr's https://github.com/hsutter/cppfront/issues/1002#issuecomment-2004896324), and in that case I would expect some manual intervention to mark these calls as "yes, I do not intend to look at this object's value myself any longer, I want the RAII wrapper to do things for me without 'stealing' the lifetime from me so I need to opt out of the move".

Finally, I'd like to add that AFAIK, we can only rely on local scope analysis for this, so adding metafunctions or attributes outside the function where the problem occur is probably not gonna work. And I wouldn't like having to mark/annotate these things anyway; It should be as automatic and efficient as possible by default.

hsutter commented 3 months ago

Thanks @bluetarpmedia ! Your "value type" is a great observation, I think you hit the nail on the head. Move from last use is primarily for value types.

I just did a test implementation to do last-use moves only from copyable types, and found that this path:

However, there is one consequence: Move-only types now need to say (move x) on definite last use, which they didn't before. (This required updating two places in reflect.h2. That was pretty much it, except lots of places pure2-last-use.cpp2 test cases which uses move-only unique_ptr heavily and is intentionally full of last-use cases.)

Consider unique_ptr and unique_lock have similar designs:

take_over: (move _: std::unique_ptr_or_lock<int>) = { /*...*/ }

f: () = {
    x: std::unique_ptr_or_lock = /*...*/ ;
    take_over( move x );   // 'move' is now required
}

Is requiring this 'move' good or bad? On the "good" side, it's more explicit; move is the only value-like thing you can do to a move-only type, but one perspective is that this makes it a more impactful operation that should be called out when it happens. On the potentially "bad" side, will the programmer feel surprised because they know they get move from definite last use for copyable types when they're no longer using the object (the same code with shared_ptr would get the efficient move automatically)... would they expect that for move-only types too, especially since move is all you can do with a move-only type?

So on the one hand, it seems to be that move-only types are designed to be moved around, so should be move candidates. While there will be some APIs like cv.wait that will have a side effect and not consume them, it seems to me those are likely to be outliers? On the other hand, move-only types are rare, and special... and for me today that tipped the balance in favor of requiring move, especially since doing that resolved the other four open issues.

But I think the updated story is still consistent, and I hope easy to teach: "Copyable objects get the optimization of getting automatically treated as a move-candidate from their definite last use." For move-only types it wouldn't be an optimization, and you should still write move to be self-documenting.

WDYT?

DyXel commented 3 months ago

I think this is for sure better than what I proposed, requires less annotation overall. Also, while I think having to write the move for move-only types is redundant, I'm also aware that this was already what we were doing for things like unique_ptrs and locks, so less delta with C++.

bluetarpmedia commented 3 months ago

Nice work Herb, this is a great solution!

I think it's a positive sign that this change resolved a bunch of issues but had minimal impact on the test cases (in a wider sense; obviously one in particular required lots of changes.)

For copyable objects which also provide move semantics, move is an optimisation for copy, so it makes sense to me that this happens implicitly where possible.

But for move-only objects, move cannot be an optimisation for copy, so the explicit syntax makes sense to me here.

I feel like this is actually an improvement over C++, because in C++ we write std::move for two different cases:

MaxSagebaum commented 3 months ago

Sound nice and having the move for move-only types there is probably more readable.

iprtel commented 3 months ago

If we differentiate move-only types could we have a warning if they are used after they got moved out?

bluetarpmedia commented 3 months ago

If we differentiate move-only types could we have a warning if they are used after they got moved out?

That may be difficult because (at least for STL types), at a minimum a move leaves the object in a "valid but unspecified" state. So it's then valid to call a member function on a moved-from-object if that function has no preconditions. You can then put the object back into a well-known state. E.g.

main: () = {
    v1: std::vector = (11, 22, 33);
    v2:= (move v1);

    // This is fine; `clear` has no preconditions and
    // puts the vector back into a well-known state.
    v1.clear();
    v1.push_back(44);

    std::print("v1: {} v2: {}", v1, v2);
}

(See C++ standard "[lib.types.movedfrom]".)

Some types have stronger guarantees. For example, a move from unique_ptr actually leaves the object in a valid and well-known state containing nullptr. (See C++ standard "[unique.ptr.single.ctor]").

iprtel commented 3 months ago

Yes, I understand that move from objects are 'valid but unspecified' but so are uninitialized int variables but we still consider them a problem. So maybe we could treat moved from objects like uninitialized. You need to bring them into a know state and since they are still valid that is possible. In my pull request #914 I tried to create an 'owning reference' that is like a unique pointer only it cannot be null. You can hand the ownership to another reference but than the initial reference must not be used. I can enforce that with a runtime check but would like to have some compiler support. If it was treated like uninitialized it could trigger an error.

SebastianTroy commented 3 months ago

Are there any performance issues with considering moved from objects as uninitialised? If not, it sounds like an excellent idea IMO, and I know that certain actions on moved from objects are legal, but are any of them useful?

On 22 March 2024 22:33:14 Ingo Prötel @.***> wrote:

Yes, I understand that move from objects are 'valid but unspecified' but so are uninitialized int variables but we still consider them a problem. So maybe we could treat moved from objects like uninitialized. You need to bring them into a know state and since they are still valid that is possible. In my pull request #914https://github.com/hsutter/cppfront/pull/914 I tried to create an 'owning reference' that is like a unique pointer only it cannot be null. You can hand the ownership to another reference but than the initial reference must not be used. I can enforce that with a runtime check but would like to have some compiler support. If it was treated like uninitialized it could trigger an error.

— Reply to this email directly, view it on GitHubhttps://github.com/hsutter/cppfront/issues/1030#issuecomment-2016032370, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AALUZQJ5SQA33VE52GWQAGLYZSWSPAVCNFSM6AAAAABE4UCVXGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMJWGAZTEMZXGA. You are receiving this because you are subscribed to this thread.Message ID: @.***>