Open zygoloid opened 1 year ago
Another consideration: should the expression category of a function call always be "initializing", or should it follow the "guaranteed in-place" nature of the call? We may be able to avoid materializing some temporaries if we say that the expression category depends on the type:
fn F() -> i32;
fn G(n: i32);
// Does this materialize an i32 temporary, or does it just pass an i32
// value returned from F() directly into G()?
fn H() { G(F()); }
Also of note: there are definitely types for which it's reasonable for the value representation to be const T
but for which the initializing representation should not be a simple value copy. For a type with a (non-trivial) destructor (eg, UniquePtr(U)
), it's fine to form a const T
value binding, pass it around, and use it while the original object is still alive. But it's not OK in general to return it by value from a function and have it outlive the original object, which will likely have been destroyed.
This suggests a refinement of the above rule: a type defaults to being returned in-place if either its value representation is not const T
or it has a non-trivial destructor.
Isn't the relevant criteria whether the move constructor is a bit-copy?
UniquePtr(U)
has a non-trivial destructor, but moving it is a bit-copy, so it is fine to return in a register. Since you can move by copying + destroying, types with trivial copy constructors and destructors also have trivial move, but it doesn't go the other direction.
Yes, I think instead of checking for a trivial destructor, we could check for a trivial destructive move. Either check seems OK to me, and it's plausible that we could encounter a type that has a trivial destructor but not trivial destructive move, so we could consider checking both properties.
Why check both though? UniquePtr(U)
has trivial destructive move, non-trivial destructor, and should be allowed to be returned in a register.
Because there can exist types that have trivial destruction but not a trivial destructive move. Eg, if a type logs every time an instance is created, but has a trivial destructor, we can still return it in a register, because we're returning a value, not an object.
Edit: To be clear, we can return in registers if either condition holds.
I don't think so: consider this C++ class:
class CircularBuffer {
public:
CircularBuffer() : first(&data[0]), last(&data[0]) { }
~CircularBuffer() { }
// ...
private:
char data[16];
// points into data
char* first;
char* last;
}
Trivial destructor, but can't be returned in registers (since non-trivial destructive move / "not trivially relocatable").
I think the key issue is when the address of the object is captured, which isn't really related to whether the destructor is trivial.
Hm, right. The relevant constraint in C++ is "trivial copy or move + trivial destructor", and I was hoping we could get away with dropping the first half since we're not actually constructing the new object as part of the return (that could happen in the caller depending on how they use the resulting value), but yeah, there's still a lifetime issue if the object holds pointers into itself.
So perhaps "trivial destructive move" is the best we can do, and it's certainly logical and a good match to the C++ rule.
I think there are two different non-in-place optimizations here, and they have different constraints:
1) If, for a type T
, a value created by a value binding can outlive the object that it is bound to, then a function can return a value representation for type T
in registers. This seems like it should generally be possible if (1) the destructor for T
is trivial (and so cannot invalidate any invariants of the information that the value representation describes), and (2) the value representation doesn't hold any handles to state whose lifetime is ending when the function returns (for example, pointers to the same object or to other objects that might be on the stack). The move operation is not relevant for this optimization.
2) If, for a type T
, an object can be moved to a different address in memory, then a function can return an object representation for type T
in registers, by pulling it out of memory in the callee and putting it back in memory in the caller. This should generally possible if the type supports a "move from one location in memory to another" operation that is either just moving the bits around, or if for example it supports some "prepare to be relocated" and "clean up after relocation" operations.
I suspect optimization (1) is more interesting, because it avoids going through memory unnecessarily, but it probably happens a lot less often because, for most interesting user-defined types, the value representation will be a pointer, which means it always contains a handle to state whose lifetime is ending when the function returns. In the case where the type's object and value representation are the same, these two optimizations seem like they might collapse to the same thing, but I think they actually don't, because optimization (2) is materializing an object in the caller (including, for example, a requirement to call a destructor) and optimization (1) is not.
- If, for a type
T
, a value created by a value binding can outlive the object that it is bound to, then a function can return a value representation for typeT
in registers. This seems like it should generally be possible if (1) the destructor forT
is trivial (and so cannot invalidate any invariants of the information that the value representation describes), and (2) the value representation doesn't hold any handles to state whose lifetime is ending when the function returns (for example, pointers to the same object or to other objects that might be on the stack). The move operation is not relevant for this optimization.
@zygoloid could you explain (1), maybe with an example? It also isn't obvious that you can construct an initializing expression from a value representation. It would be particularly useful if you give an example that can be returned due to (1) but not (2). I feel like criteria for (2) captures "nothing cares about the address of this object", which is what we really care about, but maybe there is another reason a type won't have a trivial/bit-copy move? My assumption was that "having trivial/bit-copy move" did not mean that we would materialized a representation in memory in order to then copy the bits -- I thought that criteria was sufficient to be able to do everything in registers.
It's a little hard to get to an example where optimization (1) is possible but (2) is not, because I think it requires the type to have a custom value representation, which isn't something I have a lot of experience with and examples of.
Here's a contrived example: suppose we have a type NumberAsText
whose object representation is an (i32, String)
, where the string holds a cache of the formatted string representation of the number, and the value representation is just an i32
.
Optimization (2) isn't possible for this type. It's not trivially relocatable, because the object representation contains a String
, which is not trivially relocatable (assuming an SSO string representation).
But optimization (1) is possible. A value binding for this type can outlive the object that it binds to, because an i32
value doesn't refer to the storage from which it was created, so we can return this type in registers as an i32
. (It's not completely clear whether the optimization is a good thing here; we're losing the cached string in the process, but that's something the type author opted into by excluding it from the value representation.)
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive
label. The long term
label can also be added for issues which are expected to take time.
This issue is labeled inactive
because the last activity was over 90 days ago.
We currently unconditionally permit
returned var
s in functions, and the implication seems to be that the assert in this testcase holds for any typeT
:But that's problematic: it means that a function like
F
that uses areturned var
(or, transitively, initializes its return object from a function that uses areturned var
) cannot return in registers, which means that our calling convention, at least for functions whose bodies we can't see, can't return in registers, even for types likei32
or()
.I think:
1) A type author should be able to choose whether their type is guaranteed to be returned in-place or not (and in the latter case might be returned in registers). 2) A function author should be able to choose whether they guarantee to return in-place or not.
As a starting point, how about this:
T
whose value representation isconst T*
defaults to being returned in-place; any type whose value representation isconst T
defaults to being returned by copy.-> T
return uses the default for the type; a function that uses a-> var T
return guarantees to return in-place.returned var
notation is only permitted in a function that is guaranteed to return in-place, either because the function uses-> var
or (optionally) because the type requires an in-place return.