Open yorickpeterse opened 4 months ago
In theory something like this would work: when type-checking a method body, we set some sort of "borrowed" flag for type parameters whenever they're borrowed, either directly or by passing it to an argument of another method that borrows it. When we then assign a uni T
to such a type parameter, we disallow this at the call site.
The problem with this approach is that it requires at least two passes over all method bodies: one to set the flag for direct borrows (e.g. through ref x
expressions), and one to set the flag for indirect borrows. I'm also not sure this analysis would always be complete.
Here's another way this can be triggered: if you have some uni T
and do that_value.some_field.iter
, the returned Iter
produces values of type ref Something
, allowing you to violate the uniqueness constraints by storing those ref
values somewhere.
Another similar case: we allow pattern matching against e.g. Option[uni ref Something]
, which then lets you introduce a borrow by binding the uni ref Something
to a match variable:
match something {
case Some(v) -> {
move_unique_value_to_another_process(unique_value)
do_something_with_the_uni_ref_we_still_have(v)
}
...
}
To prevent this from happening we have to disallow binding uni ref T
and uni mut T
values to match variables, similar to how you can't assign them to locals.
To see how many places there are where we explicitly use ref x
/ mut x
with x
being a type parameter, I modified the compiler as follows:
diff --git a/compiler/src/type_check/expressions.rs b/compiler/src/type_check/expressions.rs
index 6573bb59..31567c34 100644
--- a/compiler/src/type_check/expressions.rs
+++ b/compiler/src/type_check/expressions.rs
@@ -3069,6 +3069,16 @@ impl<'a> CheckMethodBody<'a> {
) -> TypeRef {
let expr = self.expression(&mut node.value, scope);
+ // TODO: remove
+ if expr.is_type_parameter(self.db()) {
+ self.state.diagnostics.warn(
+ DiagnosticId::InvalidType,
+ "ref with type parameter that might violate uni T",
+ self.file(),
+ node.location.clone(),
+ );
+ }
+
if !expr.allow_as_ref(self.db()) {
self.state.diagnostics.error(
DiagnosticId::InvalidType,
@@ -3112,6 +3122,16 @@ impl<'a> CheckMethodBody<'a> {
let expr = self.expression(&mut node.value, scope);
+ // TODO: remove
+ if expr.is_type_parameter(self.db()) {
+ self.state.diagnostics.warn(
+ DiagnosticId::InvalidType,
+ "mut with type parameter that might violate uni T",
+ self.file(),
+ node.location.clone(),
+ );
+ }
+
if !expr.allow_as_mut(self.db()) {
self.state.diagnostics.error(
DiagnosticId::InvalidType,
Running the test suite then produces the following warnings:
src/std/array.inko:135:16 warning(invalid-type): ref with type parameter that might violate uni T
src/std/array.inko:635:15 warning(invalid-type): ref with type parameter that might violate uni T
src/std/array.inko:726:15 warning(invalid-type): mut with type parameter that might violate uni T
src/std/iter.inko:275:20 warning(invalid-type): ref with type parameter that might violate uni T
I was expected a lot more warnings, so perhaps requiring the equivalent of T: ref
to allow borrowing might not be that bad after all.
A challenge here is when we deal with closures and captures. Take this code for example:
import std.stdio (STDOUT)
import std.string (ToString)
class Person {
let @name: String
}
impl ToString for Person {
fn pub to_string -> String {
@name
}
}
fn example[T: ToString](value: T) {
fn {
let out = STDOUT.new
out.print(value.to_string)
nil
}
.call
}
class async Main {
fn async main {
let alice = recover Person(name: 'Alice')
example(alice)
}
}
Here the closure captures value
by borrowing, violating the uniqueness constraints of alice
in the process. Requiring T: ref
just to be able to capture it is likely going to make working with closures and generics an absolute pain in the ass, as pretty much every such case will need T: ref
and through that forbid the use of uni T
values.
Pony's approach to this is to introduce a variety of reference capabilities and ways to convert between them. However, the resulting complexity of this makes the language difficult to use, which is likely why it never really caught on.
We could potentially remove the need for explicit T: mut
, T: move
, and T: ref
through inference. The rules would (roughly) be as follows:
Given a generic type T
:
ref
, used as the value for a ref
expressionmut
, used as the value for a mut
expression, is the receiver of an fn mut
method, or is captured by an fn
closure, it's inferred as T: mut
fn move
, it's inferred as T: move
Closures here complicate things: since they may mutate the captured data we can't infer T: ref
and instead must infer T: mut
, which may be annoying. The actual inference in turn would require multiple passes over all methods, as inferring the rules for method B may affect its caller A.
If we go down the path of inference, we can avoid using multiple passes using an approach like this:
We maintain a map of Type Parameter => [Type Parameters]
where the key is a type parameter used in an argument, and the values the type parameters passed to that argument. When inferring rules for a type parameter that's a key, we propagate those to the type parameters in the value array. So given code like this:
fn foo[A](a: A) {
bar(a)
}
fn bar[B](b: B) {
baz(b)
}
fn baz[C](c: C) {
mut c
}
The mapping would be:
C => [B]
B => [A]
When processing mut c
we note we're operating on C
and flag it as C: mut
. We then look it up in the map and find [B]
. We then iterate over those values, flag them accordingly, and repeat this process until we run out of work.
The downside is that this won't work reliable when dealing with type parameters inside generic types, such as this:
fn foo[A](a: A) {
bar(a)
}
fn bar[B](b: B) {
baz([b])
}
fn baz[C](c: Array[C]) {
mut c.pop.get
}
Here the issue is that we don't pass B
directly to c
. This means we'd have to traverse the type graph and build the mapping as we go (e.g. as part of type checking).
The other approach is to process methods multiple times and record method dependencies. That is, when we encounter a call to a method we've yet to process, we store the caller as a dependency of the callee (e.g. bar => [foo]
in the above example). When we finish processing, we check for any methods that depend on it and reschedule them, removing them from the dependency list in the process. This requires some form of caching such that we can skip work in a rescheduled method (e.g. foo
) that we've already performed.
I read through Functional Ownership through Fractional Uniqueness hoping it would perhaps provide some extra insight, but alas it felt like a paper with a lot of words but not a lot of actually useful content, though perhaps I just don't grok it.
Also read through Flexible recovery of uniqueness and immutability, but it basically just describes the Pony model using different terms and some extra fluff.
Please describe the bug
When using generic code, it's possible to borrow a generic type parameter that ends up being assigned a
uni T
value, which then allows the borrow to outlive the unique value and violate the uniqueness constraints.Please list the exact steps necessary to reproduce the bug
Consider this code:
Here
ex
ends up being of typeExample[uni Array[Int]]
, with theborrow
field storing aref Array[Int]
, violating the constraints. We can then exploit that usingex.value := Option.None
, giving us anOption[uni Array[Int]]
while theExample
value still stores a borrow to theuni Array[Int]
.I'm not sure yet what the solution here is. If we introduce something similar to the
mut
constraint for type parameters to allow unique values, then we basically end up not being able to use them anywhere or we won't be able to use borrows. Basically it ends up being a massive pain.Operating system
Fedora 40
Inko version
main
Rust version
1.77.2