Affine types / ownership system

microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

https://www.typescriptlang.org

Apache License 2.0

100.23k stars 12.39k forks source link

Affine types / ownership system #16148

Open Gozala opened 7 years ago

Gozala commented 7 years ago

It would be amazing if typescript could gain types with an ownership semantics as it could provide a very powerful tool against bugs caused by mutations. Idea is inspired by Rust's owneship system and adapted to match ts / js language. Idea is to provide built-in types for which move semantics would be tracked. In the example below I'll refer to that type as Own<t>

Ensures that there is exactly one binding to any given Own<t>, if it is assigned to another binding type of the former binding should become Moved<t> and type of the new binding should be Own<t>.

const array:Own<number[]> = [1, 2, 3]
const array2 = array // => <Own<number[]>> [1, 2, 3]
const first = array[0] <- [TS]: `Moved<number[]>` has no index signature.

A similar thing should happens if function is passed Own<t>, binding in the caller scope will get type Moved<t>:

function take(array: Own<number[]>) {
    // What happens here isn’t important.
}

let array:Own<number[]> = [1, 2, 3];

take(array);

array[0] // <- [TS]: `Moved<number[]>` has no index signature.

Rust also has also borrowing semantics which may be also worse pursuing. But for now I'm going to leave it out until I get some initial feedback.

Why

Typescript already provides tools to deal with the same class of issues using readonly feature, which is nice but mostly useful with immutable data structures that do impose an overhead. With an ownership system it would be possible to author performance critical code without an overhead while still maintaining safety guarantees from the type system.

HerringtonDarkholme commented 7 years ago

Hi @Gozala. Welcome to TypeScript.

While ownership system and linear type are interesting language feature, I recommend you to read Design Goal of TypeScript. Item 1,3,7 are related here.

Linear Type system seems to Exactly mimic the design of existing languages, alas, Rust.

Also it seems to require a more "correct" type system, so correct that it harms productivity for ordinary JavaScript/TypeScript developers who never hear of ownership/RAII. Own<T> in a garbage collected language acts no more than a stricter Object.freeze, though.

IMHO any programmers other than Rust and C++ will be surprised at why a variable suddenly becomes unavailable after use. TypeScript shouldn't require users to grok std::unique_ptr<T> before using.

TypeScript has already sacrificed safety for productivity, for example, bivariance. I wonder if linear type system would ever on TS' roadmap.

Gozala commented 7 years ago

Hi @Gozala. Welcome to TypeScript.

Thanks!

While ownership system and linear type are interesting language feature, I recommend you to read Design Goal of TypeScript. Item 1,3,7 are related here.

I did read them and to be honest I don't see any conflict here. To be specific:

Statically identify constructs that are likely to be errors

That is primary goal of this proposal & specifically it would help statically identify instances where mutation may take place without a concessus - explicitly expresing which part of code is responsible for state changes at specific code paths.

Impose no runtime overhead on emitted programs.

I don't really know why would you imagine runtime overhead here. If anything linear types provide a lot of the same guarantees as immutability but without the memory overhead that immutability imposes.

Preserve runtime behavior of all JavaScript code.

Again not sure why would you assume that. Mostly linear types just ensure that there's a single "actor" (in a broad sense not to confuse with an actor model) performing mutation.

Also it seems to require a more "correct" type system, so correct that it harms productivity for ordinary JavaScript/TypeScript developers who never hear of ownership/RAII. Own in a garbage collected language acts no more than a stricter Object.freeze, though.

I really don't see analogy with freeze here. It's by no means becomes immutable. I do repeat myself, but only thing type checker will be doing different is ensuring that once reference to binding is passed that binding is no longer used (naively assuming type of that binding can be marked as moved so any usage will be an error).

IMHO any programmers other than Rust and C++ will be surprised at why a variable suddenly becomes unavailable after use. TypeScript shouldn't require users to grok std::unique_ptr before using.

I don't think that's what is being proposed. Also I would assume if you do use something like Own<t> you would read a docs for it to understand what does it provide.

To be clear by no means I am proposing to make ownership system a default behavior for all bindings, that would indeed be absurd.

Instead I am proposing adding a built-in type similar to ReadonlyArray that can provide safety against data races in performance critical code where immutability is unacceptable option due to mentioned overhead

Now whether ownership system is bringing enough value to the ts audience to justify an effort or if it is even compatible with a current design is a different matter, also a reason I opened this issue to find that out.

Edit: fixed some typos

HerringtonDarkholme commented 7 years ago

I mean the item 1, 3, 7 in non-goal section.

Exactly mimic the design of existing languages.

Rust

Apply a sound or "provably correct" type system.

Own<T> is not provably correct. But certainly a lot of typing complexity is added. Just consider how assignability rule should apply to a T and Own<T> and Move<T>, and how it applies to flow sensitive typing, structural typing and union type/intersection type.

Introduce behaviour that is likely to surprise users.

Why a variable of Own<T> will be changed to Move<T>. Can you provide a language example that a variable type is changed after variable usage, other than Rust? TS does have flow sensitive typing where variable type is refined after predication like instanceof but using variable as argument has nothing to do with narrowing type.

Gozala commented 7 years ago

I mean the item 1, 3, 7 in non-goal section.
Exactly mimic the design of existing languages.
Rust

It's not specific to Rust it's just substructural typing which may not be as mainstream as structural typing which is what TS is, and in that case all of the TS does mimic other languages of that family.

Apply a sound or "provably correct" type system.

Own is not provably correct. But certainly a lot of typing complexity is added. Just consider how assignability rule should apply to a T and Own and Move, and how it applies to flow sensitive typing, structural typing and union type/intersection type.

Those are all good questions and answers may be simple or complex depending what direction design is taken. That being said I'd rather not get into weeds here, until there is some initiative to implement this.

Introduce behaviour that is likely to surprise users.

Why a variable of Own will be changed to Move. Can you provide a language example that a variable type is changed after variable usage, other than Rust?

I really don't think explaining semantics of linear types & it's derivations or languages that implement them makes much sense here. There are better resources available for this I'd suggest to star with Substructural type system wiki page, also worth noting that provided language list is far from complete.

That being said, I do agree that substructural types are not quite as mainstream yet & Rust and Idris are probably two names that stand out. That does imply:

There will be fewer people familiar with it's semantics than of structural types.
It maybe not mainstream enough for typescript.

As of "likely to surprise", well I'd argue typescript does have enough of that even for users coming from structural type systems that typescript seems to provide. So where line is drawn I do not know, but I think provided utility play a role in determining how much of surprise factor is acceptable.

TS does have flow sensitive typing where variable type is refined after predication like instanceof but using variable as argument has nothing to do with narrowing type.

I'm not sure what is the point you're trying to make here. Sure TS does not do this (should I say yet 😄), but hey I we would not be having this conversation if it did. I also in no way tried to imply that adding this feature will be free in terms of design or implementation work. In fact my main concern was that effort it would require may very well out-weight the benefit it's going to provide. All I can tell I'd be happy to help this effort if it happens.

P.S.: Don't take it a wrong way please, but most of your commentary seems to be questioning design of linear type system itself and some assumptions indicate you may be misunderstanding a big picture. I'd really encourage you to look more into this subject, you may just find out that it fits JS better than structural types, it certainly does IMO, structural types work great in pure functional languages (that I really wish JS was more of), but with level of mutability and imperativeness you find in JS you end up loosing many of the benefits. Linear types on the other hand seem to allow all of that.

RyanCavanaugh commented 7 years ago

It's important to interpret the non-goals in the right direction. What we mean is "You should have feature X because it's in language Y and Y is a good language" is a bad argument, and so is "You shouldn't do feature X because it is in language Y". Basically a feature should support itself in terms of value, not mimicry.

I'm not sure Rust would have owernship semantics if it weren't for their constraint of having a sound memory management model. It certainly has positive side effects on mutability analysis but that doesn't seem to be the primary objective.

There are a lot of other JavaScript patterns for handling the scenario of an object that has exactly one mutation site; closures come to mind as the most obvious solution. Without some extremely compelling use cases that would justify something as complex as this, I don't see it being a good fit for TypeScript.

interphx commented 7 years ago

I think there is a use case for ownership system in performance-sensitive projects (games, simulations, heavy mobile apps). I've encountered the following issue while working with space partitioning data structures, creating multidimensional views of arrays, etc.

Consider these two classes:

class DataProcessor {
    protected data: any[];
    protected temp: any[] = [undefined, undefined, undefined];

    process() {
        // this.data can be changed here
    }

    // This function can be called many times per second
    // by different callers, so we avoid recreating the array
    getFirstItems(): any[] {
        var temp = this.temp,
            data = this.data;

        temp[0] = data[0];
        temp[1] = data[1];
        temp[2] = data[2];
        return temp;
    }
}

class CallingSite {
    protected dataProcessor: DataProcessor;
    protected items: any[];

    badMethod() {
        // We may expect the received items array to remain
        // unchanged in the future, but this is not the case
        this.items = this.dataProcessor.getFirstItems();
    }

    goodMethod1() {
        var items = this.dataProcessor.getFirstItems();
        // ...do some processing with items...
        // This is ok, because we discard the reference to 
        // the items array after the method ends.
        // It does not leave the scope.
    }

    goodMethod2() {
         // This is ok, because we explicitly copy the items
        this.items = this.dataProcessor.getFirstItems().slice(0);
    }
}

Although the particular example is contrived, it demonstrates a pattern that I often see in performance-critical projects: maintaining an object and repopulating it instead of creating a new one every time.

The problem: it is impossible for the type system to indicate that the object the caller received may change and become invalid in the future. If the calling site wishes to preserve it, they need to explicitly call dataProcessor.getFirstItems().slice(0), otherwise they can only process and discard it.

Casting the returned object to an immutable interface does not solve this: it only prevents the calling site from modifying it.

While I doubt that a complete affine or linear type system is a valid goal for TypeScript, supporting cases like this would make performance-sensitive code a lot more safe.

Further issues I see:

What to do with closures (probably only allow capturing into a closure if the reference to that closure is also discarded after the function ends)
What to do if the caller wishes to actually preserve a reference to the changing object (we need some way to make this explicit)

ghost commented 5 years ago

Affine/linear types would be extremely cool to have for state machines (e.g. chainable/fluent APIs where prior states cannot be reused), not just memory management/immutability/etc.

Productivity shouldn't be an issue if it was opt-in (like in @Gozala's example).

tjpalmer commented 5 years ago

Further, a basic linear type system doesn't have to be perfectly sound. Easy type assertions could circumvent for convenience, and that would be ok for TypeScript. I think the ability to express type changes on certain operations would be very handy.

emdash commented 4 years ago

Yeah, I think this would be an amazing thing to see in TypeScript. It would even nudge me to favor TS over Flow for certain applications, because it would be better at eliminating a certain class of bugs.

If it helps, think of it as "Resource Management", or "Resource Types" rather than the inscrutable "substructural." Because the primary use is to model a fixed set of resources that can be shared among any number of owners. But in concrete terms, what they do is let you catch the following kinds of errors at compile time, rather than run-time (and hence eliminating the run-time overhead of a library implementation).

"Tried to use some single-use x after an operation consumed it"
"Forgot to properly release x"
"Forgot to actually use x, when failing to do so would be Very Bad".
"Tried to make an unauthorized copy or alias of x, when doing so would be Very Bad".

Because TypeScript objects don't have destructors, and most types have reference semantics, it's also hard to imagine pulling this off as a library. You'd have to implement it as a plugin, or additional tooling.

And any library solution would impose run-time overhead, and be limited to catching bugs at run-time. But as a language feature, something like Unique<T> could theoretically be type-erased back to T prior to code generation. And could even allow for additional optimizations down the road.

It doesn't necessarily have to be handled as types "changing" as I saw in some earlier comments. You could potentially provide a set of primitives which enable certain static guarantees.

singleton - statically enforce (there are a few approaches I can think of) that only a single instance can ever exist. There are a few different approaches with different trade-offs. The easiest to explain is that a "singleton" is just a class where all members are implicitly static, and the constructor runs exactly once. All references to some singleton S refer to the same object.
move_only - statically enforce that only one binding to some T can exist at a time. I.e. if you pass a move_only` to a function, you can't use that binding again until you've assigned to it. But callees are allowed to return the T back to you, or pass it on.
single_use - statically enforce that some T is used at most once. There needs to some scheme to determine when a value is considered "consumed".
must_use - Make sure the value gets consumed somewhere.

Note that some combinations of these types make sense: single_use + must_use compose to give you "use exactly once". A type could theoretically be single_use without being move_only. It makes perfect sense to have a move_only singleton, or even a move_only must_use singleton.

Closures do complicate things, but I suspect there conservative solutions that won't break the language while still providing some utility.

dima-starosud commented 4 years ago

TypeScript tracks when variable changes its type:

class X { constructor(public x: string) { } };
class Y { constructor(public y: string) { } }

type Z = X | Y;
var z: Z;

z = new X('x');
console.log(z.x); // compiles
console.log(z.y); // error TS2339: Property 'y' does not exist on type 'X'.

z = new Y('y');
console.log(z.y); // compiles
console.log(z.x); // error TS2339: Property 'x' does not exist on type 'Y'.

This can be used for modeling linear typing discipline etc.

elslooo commented 4 years ago

I think this is a really interesting feature. For anyone who's familiar with Rust, its benefits are clear: there is an entire class of bugs that this will prevent (even apart from safe memory management, see @khoomeister). In the future, it might also yield new opportunities for optimizations.

I think we can also agree that ownership has a steep learning curve and probably won't be useful in the majority(?) of projects TS is typically used for (web frontend).

So: what would be needed to get ownership into TS in a pragmatic fashion? If I understand correctly, the borrow checker in Rust is a separate stage that comes after type checking etc. I think it is certainly viable to do the same with TS: insert a new stage after type checking and before emitting. That's something we can do right now without any changes to the TS repo itself, as a separate NPM package that calls and enhances the TS compiler.

There are a few things that need to be done (though this list is certainly incomplete):

Find a non-intrusive way to tell the compiler when ownership transfers. This is complicated by the fact that there are many ways in which a value may move. Let's look at function calls first:
```
fn borrow_arg(arg: &mut Foobar);
fn consume_arg(arg: Foobar);
```
Borrowing is more or less what currently happens in Typescript (except that there is no formal checking for concurrent borrows).
```
function borrowArg(arg: Foobar);
function consumeArg(arg: ???);
```
Emit destructors for moved values (note that const-checking happens before this transformation).
```
  const arg = new Foobar();
  consumeArg(arg);
+ arg = undefined;
```
This of course does not generate a compile-time error yet but it does provide an initial guard towards ownership.
Emit compile-time errors (i.e. add ts.Diagnostics) in case arg is accessed after it is moved before it is assigned a new value.

If we constrain ourselves to mutable borrowing (i.e. no notion of immutability) and ownership, there are three major difficulties we need to address:

How do we request borrow / move semantics from the compiler without adapting the scanner / parser (i.e. no new syntax)?
How do we maintain compatibility with existing code?
Can we get anything working without proper lifetimes? We probably need to relax some of the borrowing semantics (it doesn't have to be perfect to provide some level of benefit, per @tjpalmer).

Regarding some potential solutions posted above: changing the type from Own<T> to Moved<T> (per @Gozala) is not something I expect to work well. I think from a semantic perspective, the type should not change after use (per @HerringtonDarkholme). We should use the separate borrow checking stage to emit custom errors instead of relying on the type checking stage. However, I agree that using Own<T> to signal the compiler that we want borrow checking is probably the best idea to maintain backwards compatibility (i.e. all non-Own<T> values should behave as if there's no borrow checking).

emdash commented 4 years ago

I'm actually not sure that merely tracking "ordinary" uses of a value (appearing in a statement or expression) is meaningful in typescript. True sub-structural type systems would count any appearance of the value as a "use", and that would complicate a lot of common things. For example, is a method invocation one or two "uses"?

I think it makes more sense to explicitly specify which operations consume a value. Let's keep in mind the actual applications:

enforce that a user can't forget to release some arbitrary resource.
manage a static set of resources (perhaps corresponding to physical resources).
modeling stateful APIs (to call step_3, you need a Step2Result, to call step_2, you need a Step1Result.
require inspecting a return value that might be an error.

Here's a relatively simple approach: First, we need types to explicitly annotate sub-structural values. These would be recognized by the compiler:

class Resource<T> {
  value: T;
  drop();
}
class MustUse<T> extends Resource<T> { move(): MustUse<T> };
class SingleUse<T> extends Resource<T> { move(): SingleUse<T> };

The global restrictions on identifiers holding a Resource:

they must live on the stack or in the global scope (can't be members)
they must be declared with const, and not with let or var (to prevent reassignment, which is assumed by what follows).
they cannot escape the current scope without a call to move().

Now we can assume that a sub-structural value can only come from the following sources:

a valid constructor call
a valid move() call
a function or method returning a sub-structural type
a parameter declared as a sub-structural type
an catch block handling a sub-structural type

Moreover, once you have a sub-structural value, you must satisfy the requirements for that type:

MustUse: require at least one call to move() or exactly one call to drop() on a given code path.
SingleUse: require exactly one call to move() or drop() on a given code path.

And finally, move() is required in order to:

pass a sub-structural value as a parameter
return a sub-structural value from a function
throw a sub-structural value as an exception
claim a sub-structural value from within a closure

Issues and Edge Cases:

We ideally want to be reminded to call call file.close(), not file.drop(). But it may not be practical to handle this.
Resource as a member
Closures
- what happens if closures escape? are called multiple times? are never called?
- closures could be subject to the drop() / move() discipline of the types they capture.
- e.g. if a closure captures a SingleUse, then it must move() or drop() the value, and the closure itself then must be called exactly once (like rust's FnOnce).
Singletons: they seem related, but it's more subtle than I thought at first.

maurizi commented 4 years ago

For an example use case of where affine types could prove very useful, they would be excellent for enforcing the behavior of transferable objects sent to Worker threads:

An optional array of Transferable objects to transfer ownership of. If the ownership of an object is transferred, it becomes unusable (neutered) in the context it was sent from and becomes available only to the worker it was sent to.

emdash commented 4 years ago

That's a good example! Thanks for that!

One could probably also come up with examples involving futures and async/await.

On Thu, Sep 10, 2020, 08:39 Michael Maurizi notifications@github.com wrote:

For an example use case of where affine types could prove very useful, they would be excellent for enforcing the behavior of transferable objects sent to Worker threads https://developer.mozilla.org/en-US/docs/Web/API/Worker/postMessage:

An optional array of Transferable https://developer.mozilla.org/en-US/docs/Web/API/Transferable objects to transfer ownership of. If the ownership of an object is transferred, it becomes unusable (neutered) in the context it was sent from and becomes available only to the worker it was sent to.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/microsoft/TypeScript/issues/16148#issuecomment-690374664, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAC3W72D6ZMSTBHIRSSNJDSFDXK3ANCNFSM4DNKJIGA .

nasso commented 1 year ago

Being able to model move-semantics would make it possible to safely interact with Web Assembly modules, which typically rely on dynamic allocations a lot.

I suggest being able to do something like this (maybe by special-casing the syntax T is void in this context):

type Foo = { ptr: number };

function drop(foo: Foo): asserts foo is void {
  // (do something here that invalidates `foo` somehow)
  foo.ptr = 0;
}

const foo: Foo = { ptr: 4 };

drop(foo); // ok
drop(foo); // error!

If using the void type to represent values which shouldn't/cannot be used, we could introduce a special moved type.

This would allow us to represent very basic "ownership" semantics like the transferable objects for Worker.postMessage(), as well as provide extra type-safety when talking with a Web Assembly module (or even just a native system interface). A project like wasm-bindgen would greatly benefit from this.

hazae41 commented 1 year ago

@nasso

I think I made something great for disposing resources (e.g. freeing WebAssembly pointers)

https://github.com/hazae41/box

Only the owner of the box can dispose the resource

import { Box } from "@hazae41/box"

class D {
  [Symbol.dispose]() { 
    console.log("it should only happen once")
  }
}

/**
 * At the end of this block, D will only be disposed once
 */
{
  using box = new Box(new D())
  using box2 = box.move()
}

It will throw if you unwrap a moved box

import { Box } from "@hazae41/box"

class D {
  [Symbol.dispose]() { 
    console.log("it should only happen once")
  }
}

const box = new Box(new D())
const box2 = box.move()

/**
 * This will throw because box no longer owns the resource
 */
const inner = box.unwrap()

/**
 * This will work because box2 owns the resource
 */
const inner2 = box2.unwrap()