mppf closed this 5 years ago
Providing a way to say 'infinite' might be useful, but I don't think Rust-style lifetime annotations make sense for Chapel.
I think "infinite" is somewhat the same as saying "it's a raw pointer" so perhaps the way we'd say that is by using a "raw" pointer type (whatever that is).
We could call it Unsafe(T).
The case "copyObjectTree" returns an Owned thing. My understanding is that we need to concern ourselves with the lifetime only of borrowed things. (Correct?)
proc copyObjectTree(...) : owned MyTree {...}
To indicate the lifetime, I propose the syntax borrowed(variable or field reference(s)), which indicates the shortest of those variables'/fields' lifetimes. This gives us an excuse to introduce the borrowed keyword. :) For example:
proc getGlobalHashtableElement(key) borrowed(globalHashTable) const ref {...}
proc MyType.getElement(key) borrowed(this) { ... }
It is meaningful to specify the lifetime at the variable granularity, rather than just "global" vs. "local scope". That is because even the global variables are destructed in a particular order. So, for example:
var Table1 ...;
var Table2 ...;
proc get1(key) borrowed(Table1) return Table1.get(key);
proc get2(key) borrowed(Table2) return Table2.get(key);
// illegal: adds a value that may be destructed while Table1 is still live
Table1.put(get2(some key));
// OK
Table2.put(get1(some key));
We can write the same code inside a function so all scopes are local.
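The ordering argument can be modeled outside Chapel. The following is a minimal Python sketch, not Chapel and not the compiler's actual analysis. It assumes that variables in a scope are destroyed in reverse declaration order, and the helper names (`destruction_rank`, `may_store_borrow`) are hypothetical.

```python
# Declaration order of the two tables from the example above.
DECLARED = ["Table1", "Table2"]

def destruction_rank(name, declared=DECLARED):
    """Position in the destruction sequence (0 = destroyed first).

    Assumption: variables are destroyed in reverse declaration order.
    """
    return len(declared) - 1 - declared.index(name)

def may_store_borrow(container, source, declared=DECLARED):
    """A borrow from `source` may be stored in `container` only if
    `source` is destroyed no earlier than `container`."""
    return destruction_rank(source, declared) >= destruction_rank(container, declared)

print(may_store_borrow("Table1", "Table2"))  # models Table1.put(get2(...))
print(may_store_borrow("Table2", "Table1"))  # models Table2.put(get1(...))
```

Under that assumption the first call models the illegal case and the second the legal one. A check that only distinguishes "global" from "local" would accept both.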
BTW "infinite" is imprecise. For example, if a global variable "owns" a class pointer, it needs to delete it upon program shutdown. I assume the deletion is implicit, like for any owner.
By contrast, a global variable of an "unchecked" policy would not do that. The user needs to delete explicitly.
> The case "copyObjectTree" returns an Owned thing. My understanding is that we need to concern ourselves with the lifetime only of borrowed things. (Correct?)
> proc copyObjectTree(...) : owned MyTree {...}
I think this is a reasonable way to describe that pattern, yes.
If Owned is the way to say an owned pointer, it would become
proc copyObjectTree(...) : Owned(MyTree) {...}
> To indicate the lifetime, I propose the syntax borrowed(variable or field reference(s)). Which indicates the shortest of those variable(s)/field(s)' lifetimes. Gives us an excuse to introduce the borrowed keyword. :) For example:
I think the "excuse to introduce the keyword" is a joke that I don't get.
A reasonable alternative name would be scope or return scope (these are closer to D's terms). Or it could be lifetime.
The other element you are describing here is putting the description of the return value's lifetime near the return type. We could instead annotate each argument to say whether it is available for lifetime inference of the return.
proc return-scope-borrowed-lifetime MyType.getElement(key) { ... }
Also, another way to write your first example might be this:
proc getGlobalHashtableElement(key) borrowed() const ref {...}
since, for the purposes of lifetime inference, the minimum lifetime of the empty set is the "infinite" lifetime.
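The empty-list convention is the usual identity for a minimum. As a small Python illustration (not Chapel), assuming lifetimes are encoded as numbers where larger means longer-lived, with the function name `inferred_return_lifetime` being hypothetical:

```python
import math

def inferred_return_lifetime(listed_lifetimes):
    """Shortest lifetime among the listed variables.

    An empty list yields infinity, which models borrowed() with no
    arguments meaning "infinite" lifetime.
    """
    return min(listed_lifetimes, default=math.inf)

print(inferred_return_lifetime([3, 7]))  # shortest of the listed lifetimes
print(inferred_return_lifetime([]))      # nothing listed: infinite
```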
Instead of marking the variables that are borrowed, we could also mark the ones that aren't:
proc getGlobalHashtableElement(cannot-return-borrow-of key) const ref {...}
(where obviously we'd come up with a better keyword name).
> It is meaningful to specify the lifetime at the variable granularity, rather than just "global" vs. "local scope". That is because even the global variables are destructed in a particular order.
We could do that, but the current analysis does not. Instead it merely concerns itself with what is allocated in what block. I don't know if it would impact the analysis. I'm fairly happy at the moment with this more "coarse grain" approach, though.
> BTW "infinite" is imprecise. For example, if a global variable "owns" a class pointer, it needs to delete it upon program shutdown. I assume the deletion is implicit, like for any owner.
> By contrast, a global variable of an "unchecked" policy would not do that. The user needs to delete explicitly.
If a global variable "owns" a class pointer, it would only trivially participate in lifetime inference. Since it "owns" a class pointer, it will be assumed that a borrow from that global variable has at least as short a lifetime as the global variable. It's a fair point though that the global variables are destroyed in a particular order - it's similar to adjusting the checking to pay attention to the order of deletion within a block.
Good alternative options.
I am not excited about using borrow() to indicate "infinite" lifetime. Even though logically that's what it means, some users may get confused by seeing "borrow" where the returned value is "owned".
If we need to also annotate when a function does / does not "invalidate" references to a particular thing (such as an array's domain resize would invalidate references to the array - or Owned.clear() would invalidate borrows of that Owned), that might also require some sort of annotation. (This would support cases we don't currently check for as described in #8382).
> proc getGlobalHashtableElement(key) borrowed(globalHashTable) const ref {...}
Two reasons this might be the right general idea:
I think I'd prefer to use borrowing, but if we end up with a borrow keyword, it might work in that environment too. Anyway, if such a keyword introduces a list of variables/arguments, the rule is that the function can only return a borrow of one of those arguments (and not some other variable / argument). That allows appropriate checking. On the other side, the compiler will assume the minimum lifetime of the variables/arguments mentioned in that list when inferring the return lifetime.
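The callee-side half of that rule can be sketched as a set difference. This is an illustrative Python model, not the Chapel checker. The function name `unlisted_borrow_sources` and the string encoding of variable names are assumptions made for the example.

```python
def unlisted_borrow_sources(annotation, returned_from):
    """Names the function body returns a borrow of without listing them.

    `annotation` is the list given after the keyword; `returned_from`
    is every variable/argument the body may return a borrow of.
    An empty result means the body respects its annotation.
    """
    return sorted(set(returned_from) - set(annotation))

# Body returns only a borrow of the listed global: accepted.
print(unlisted_borrow_sources(["globalHashTable"], ["globalHashTable"]))
# Body may also return a borrow of `key`, which is not listed: rejected.
print(unlisted_borrow_sources(["globalHashTable"], ["globalHashTable", "key"]))
```

Callers would then rely only on the annotation, taking the minimum lifetime of the listed names, without looking at the function body.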
While we can quibble about the syntax, I agree that this is the ideal place to put the return lifetime. I'm still a fan of & over borrow, etc. but I'm not going to die on that hill.
There will be cases when the return value is not strictly a borrow of any variables. Array indexing is one example; returning a field of a record could be another one.
So maybe we want to present this annotation as "the returned thing has the lifetime of one of these:". In which case I would prefer the keyword to be "lifetime", not "borrow". (Even though I suggested "borrow" above.)
Or we can take the position that borrowing will be a common case. Users will not need to know about lifetimes except in weird corner cases. Even if borrowing is indirect, as in array indexing, we could still present it as borrowing-like.
In this case, a "borrow" variant makes good sense.
Considering the three proposed variants:
> There will be cases when the return value is not strictly a borrow of any variables. Array indexing is one example; returning a field of a record could be another one.
As far as the checker is concerned, these are "borrows" from the array/record containing the fields.
Since we've talked about this, several things have changed that might change the design direction.
We now have a borrowed keyword. It might be reasonable to hang the lifetime specification syntax off of it.
If we use borrowed for the lifetime annotation, that might make it harder to write a generic function that accepts/returns both borrows and also owned etc.
I know of the following interesting patterns:
What syntax can be used to specify these, in case the default behavior is not sufficient?
Here are a few examples where we'd like to be able to specify the lifetime of the return:
Function returning an element of a global hashtable
Here, lifetime inference is likely to assume that the function could return the key argument. But, once key is used to find the hashtable element, the key is no longer used.
Method returning an array element
Function returning an element of an outer local hashtable
The annotation for innerFunction might need to indicate lifetime matches an outer variable rather than an argument.