carbon-language / carbon-lang

Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)
http://docs.carbon-lang.dev/
Other
32.31k stars 1.48k forks source link

representation for initializing expressions and putting function return values in registers #3133

Open zygoloid opened 1 year ago

zygoloid commented 1 year ago

We currently unconditionally permit returned vars in functions, and the implication seems to be that the assert in this testcase holds for any type T:

fn F[T:! type](v: T, p: T**) -> T {
  returned var ret: T = v;
  *p = &ret;
  return var;
}

fn G[T:! type](x: T) {
  var p: T*;
  var v: T = F(x, &p);
  Assert(&v == p);
}

But that's problematic: it means that a function like F that uses a returned var (or, transitively, initializes its return object from a function that uses a returned var) cannot return in registers, which means that our calling convention, at least for functions whose bodies we can't see, can't return in registers, even for types like i32 or ().

I think:

1) A type author should be able to choose whether their type is guaranteed to be returned in-place or not (and in the latter case might be returned in registers). 2) A function author should be able to choose whether they guarantee to return in-place or not.

As a starting point, how about this:

zygoloid commented 1 year ago

Another consideration: should the expression category of a function call always be "initializing", or should it follow the "guaranteed in-place" nature of the call? We may be able to avoid materializing some temporaries if we say that the expression category depends on the type:

fn F() -> i32;
fn G(n: i32);
// Does this materialize an i32 temporary, or does it just pass an i32
// value returned from F() directly into G()?
fn H() { G(F()); }

Also of note: there are definitely types for which it's reasonable for the value representation to be const T but for which the initializing representation should not be a simple value copy. For a type with a (non-trivial) destructor (eg, UniquePtr(U)), it's fine to form a const T value binding, pass it around, and use it while the original object is still alive. But it's not OK in general to return it by value from a function and have it outlive the original object, which will likely have been destroyed.

This suggests a refinement of the above rule: a type defaults to being returned in-place if either its value representation is not const T or it has a non-trivial destructor.

josh11b commented 1 year ago

Isn't the relevant criteria whether the move constructor is a bit-copy?

UniquePtr(U) has a non-trivial destructor, but moving it is a bit-copy, so it is fine to return in a register. Since you can move by copying + destroying, types with trivial copy constructors and destructors also have trivial move, but it doesn't go the other direction.

zygoloid commented 1 year ago

Yes, I think instead of checking for a trivial destructor, we could check for a trivial destructive move. Either check seems OK to me, and it's plausible that we could encounter a type that has a trivial destructor but not trivial destructive move, so we could consider checking both properties.

josh11b commented 1 year ago

Why check both though? UniquePtr(U) has trivial destructive move, non-trivial destructor, and should be allowed to be returned in a register.

zygoloid commented 1 year ago

Because there can exist types that have trivial destruction but not a trivial destructive move. Eg, if a type logs every time an instance is created, but has a trivial destructor, we can still return it in a register, because we're returning a value, not an object.

Edit: To be clear, we can return in registers if either condition holds.

josh11b commented 1 year ago

I don't think so: consider this C++ class:

class CircularBuffer {
 public:
  CircularBuffer() : first(&data[0]), last(&data[0]) { }
  ~CircularBuffer() { }
  // ...
 private:
  char data[16];
  // points into data
  char* first;
  char* last;
}

Trivial destructor, but can't be returned in registers (since non-trivial destructive move / "not trivially relocatable").

I think the key issue is when the address of the object is captured, which isn't really related to whether the destructor is trivial.

zygoloid commented 1 year ago

Hm, right. The relevant constraint in C++ is "trivial copy or move + trivial destructor", and I was hoping we could get away with dropping the first half since we're not actually constructing the new object as part of the return (that could happen in the caller depending on how they use the resulting value), but yeah, there's still a lifetime issue if the object holds pointers into itself.

So perhaps "trivial destructive move" is the best we can do, and it's certainly logical and a good match to the C++ rule.

zygoloid commented 1 year ago

I think there are two different non-in-place optimizations here, and they have different constraints:

1) If, for a type T, a value created by a value binding can outlive the object that it is bound to, then a function can return a value representation for type T in registers. This seems like it should generally be possible if (1) the destructor for T is trivial (and so cannot invalidate any invariants of the information that the value representation describes), and (2) the value representation doesn't hold any handles to state whose lifetime is ending when the function returns (for example, pointers to the same object or to other objects that might be on the stack). The move operation is not relevant for this optimization.

2) If, for a type T, an object can be moved to a different address in memory, then a function can return an object representation for type T in registers, by pulling it out of memory in the callee and putting it back in memory in the caller. This should generally possible if the type supports a "move from one location in memory to another" operation that is either just moving the bits around, or if for example it supports some "prepare to be relocated" and "clean up after relocation" operations.

I suspect optimization (1) is more interesting, because it avoids going through memory unnecessarily, but it probably happens a lot less often because, for most interesting user-defined types, the value representation will be a pointer, which means it always contains a handle to state whose lifetime is ending when the function returns. In the case where the type's object and value representation are the same, these two optimizations seem like they might collapse to the same thing, but I think they actually don't, because optimization (2) is materializing an object in the caller (including, for example, a requirement to call a destructor) and optimization (1) is not.

josh11b commented 1 year ago
  • If, for a type T, a value created by a value binding can outlive the object that it is bound to, then a function can return a value representation for type T in registers. This seems like it should generally be possible if (1) the destructor for T is trivial (and so cannot invalidate any invariants of the information that the value representation describes), and (2) the value representation doesn't hold any handles to state whose lifetime is ending when the function returns (for example, pointers to the same object or to other objects that might be on the stack). The move operation is not relevant for this optimization.

@zygoloid could you explain (1), maybe with an example? It also isn't obvious that you can construct an initializing expression from a value representation. It would be particularly useful if you give an example that can be returned due to (1) but not (2). I feel like criteria for (2) captures "nothing cares about the address of this object", which is what we really care about, but maybe there is another reason a type won't have a trivial/bit-copy move? My assumption was that "having trivial/bit-copy move" did not mean that we would materialized a representation in memory in order to then copy the bits -- I thought that criteria was sufficient to be able to do everything in registers.

zygoloid commented 1 year ago

It's a little hard to get to an example where optimization (1) is possible but (2) is not, because I think it requires the type to have a custom value representation, which isn't something I have a lot of experience with and examples of.

Here's a contrived example: suppose we have a type NumberAsText whose object representation is an (i32, String), where the string holds a cache of the formatted string representation of the number, and the value representation is just an i32.

Optimization (2) isn't possible for this type. It's not trivially relocatable, because the object representation contains a String, which is not trivially relocatable (assuming an SSO string representation).

But optimization (1) is possible. A value binding for this type can outlive the object that it binds to, because an i32 value doesn't refer to the storage from which it was created, so we can return this type in registers as an i32. (It's not completely clear whether the optimization is a good thing here; we're losing the cached string in the process, but that's something the type author opted into by excluding it from the value representation.)

github-actions[bot] commented 10 months ago

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive label. The long term label can also be added for issues which are expected to take time.

This issue is labeled inactive because the last activity was over 90 days ago.