dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.06k stars 1.56k forks source link

feature request: support linear types #55657

Open aran opened 4 months ago

aran commented 4 months ago

Consider supporting linear types. Wild proposal for a reasonably major direction. Didn't see it discussed here or in the Google Group. Even something simple like an '@noCopy' annotation would be useful.

A few basic points:

Lots of languages have bolted on linear typing or linear-like features. In Dart's case, it might make sense to do at the analyzer level at first and expose piecemeal features driven by the same underlying analysis such as something like @noCopy annotation, etc. Then a real design for the type system for Dart 5 or whatever could subsume those features.

lrhn commented 4 months ago

TL:DR: Interesting thought experiment. I don't see this happening as a Dart language feature. It will be far too intrusive for the type system. So your suggestion of something tracked only in the analyzer might be more viable, even if not as sound.

If linear types are types whose variables are linear variables, meaning that they can/must be read only once, then it's probably something Dart can do. To some extent.

The underlying wish is to avoid aliasing, that a given "linear object" only ever has one reference. (Or at most one reference, if we ever want to GC it.)

Let's start with linear variables. A linear variable is a variable whose value must be used (read) precisely once. (We can make it affine instead, and say it must be read at most once, then you can leave the value to rot if you want to.) That maps somewhat well to Dart's definite-assignment analysis, if we consider a linear variable being read as clearing the variable, making it unassigned. The rules for linear variables are similar to those for non-late final local variables:

That suggests a modifier that replaces final, perhaps linear int x;. (We can then consider having a late linear variable, where the check is done at runtime if it's only potentially assigned/unassigned in the static analysis.)

There are some design issues around the "must be read once" rule. It means you cannot return when a linear variable is potentially assigned. But for full consistency, you also cannot throw then, and we definitely cannot prevent that. But you can read and ignore the value, so void kill(linear Object? value) { value; /* Read once */} is a way to ignore the value. If that's not allowed, then it means that we're not restricting reads, we're restricting uses, and just reading and discarding is not a use. So what is a use? Assigning to another (linear) variable, including passing as argument. Calling a method? (Maybe, maybe not. Probably, but a method gets a reference to the object as this, and may not be restricted to how it uses that reference.)

That's probably why you'd want linear types, to make restrictions on all occurrences, not just variables. Linear types which are types where every occurrence is linear (must be used once, aka. never aliased), and all variables of that type are linear.

Let's say we declare linear class Foo {}, which means that every occurrence of Foo must be used precisely (or at most) once. If assigned to a variable, the variable must be linear (or is inferred/promoted to linear from that point and until it's definitely read or overwritten).

That doesn't fit that well into the Dart type system.

So, most likely, we'll have to have linear T being a type for each type T, with normalization:

The type linear T is a super-type of T, and if S \<: T then linear S \<: linear T.

I think the proper supertype is the most consistent model, and it means that all existing generics, which have at most Object? as bound, cannot be used with linear types. They should not just be automatically upgraded to linear Object? as bound, because they might not be using their type-parameter-typed values in a linear way.

Then we'll have to write new linear collections. If that even makes sense.

A linear array is a sequence of linear variables, which are cleared when they are read. The API must make it possible to detect whether a position holds a value. That may be impossible if we take the linear property literally. Let's assume a LinearList<T> has list locations of type (linear T)?. Then it may be possible to check whether that value is null, without "using" the linear value. (A == null doesn't call operator==).

More likely we'll need to consider calling methods on a linear type to not count as a use. Or some other allowance for locally keeping multiple references to the same object, as long as at most one of them survives at a point where the reference is passed to someone else.

You can read a linear variable containing a value more than once, as long as all you use it for is calling methods on it. Those methods must then not use this for anything but calling methods. They cannot store this in any way. Or they can store it locally, but not across any invocation that also has a reference to the value.

Now we're getting there:

The "not passing to functions" annoys me. If a linear variable is passed as argument, we don't know whether the function will hold on to it or not. If it can (which likely means allowing the variable to be captured in a closure), then it's possible to hold on to the value indefinitely. It would be nice to have a way to say that a variable is only valid for the duration of the call, and cannot be captured. I guess the traditional way to do that is to have the function return the value (or another value) and then keep working with that. It's enforced linear by actual linearization. But it doesn't work for passing this, because this is already shared with the caller of the member, so we need to ensure that it isn't aliased at all, and goes away when the method call returns. So we cannot pass this, or any closure containing this to a helper function, which is problematic. Maybe we do need a marker on parameters that is stricter than linear, which means that the value can only be used during the call. It can then be passed to other functions that promise the same, and passing a linear variable to such a function is safe, you automatically get it back when the function returns. (For async functions, when the future completes. Such async functions can probably only be written using async syntax. There is no way we can ensure the life-span of a variable passed through a number of Future.then calls.)

All in all, there might be a possible design that can provide some notion of linearity of values.

aran commented 4 months ago

Thanks for considering this so thoughtfully. I am convinced the full language feature is of unrealistic scope, but also convinced there's got to be some way for dart-analyzer to help careful programmers & FFI users if not performance.

One note is whether you want to go all the way and allow flexible coupling of linearity to existing types, e.g. linear class Foo, or restrict it to the safe subset of orthogonal to types, and assign linearity only to variables or values, e.g. @useResult @noCopy Foo x;, where the actual underlying type of x according to the analyzer would be something like {classicType: Class('Foo'), linearity: Linearity.Linear}.

I was imagining something closer to the latter because of its suitability for incremental, annotation-driven usage.

Also worth mentioning on Or at most one reference, if we ever want to GC it., there's already a Dart/Flutter pattern that could be elaborated for declaring the end of a value's lifetime, which would be to formalize.dispose(). Something that eventually calls dispose exactly once could be inferred or declared @usesResult, for example.

lrhn commented 4 months ago

Moved back to SDK repo and made an anlyzer issue. (They can then consider whether it should be a lint or not.)