WebAssembly / component-model

Repository for design and specification of the Component Model

First-class functions/closures as resources #278

Open badeend opened 7 months ago

badeend commented 7 months ago

I was wondering, now that resources/ownership/borrowing/etc. are properly part of the component model: in what way are closures not "just" single-method resources? Can they piggyback off the semantics and restrictions defined for resources?


For example, given the imaginary interface (& syntax):

interface A {
    register-event-handler: func(handler: fn(event));
    map: func(items: list<string>, mapper: borrow<fn(string) -> string>) -> list<string>;
}

Assuming that the following resource types are (automatically?) defined somewhere

resource fn-event { // fn(event)
    call: func(a1: event);
}

resource fn-string-string { // fn(string) -> string
    call: func(a1: string) -> string;
}

how is the first example different than this:

interface A {
    register-event-handler: func(handler: fn-event);
        // Fn is passed as an "owned" handle. I.e. it's the responsibility of the
        // implementation of `A` to properly drop the resource/closure.
        // On the flip side, `A` can call `handler` as much as it wants even
        // after `register-event-handler` has returned.

    map: func(items: list<string>, mapper: borrow<fn-string-string>) -> list<string>;
        // Fn is borrowed. I.e. `mapper` may only be called during the
        // invocation of `map`, and must be dropped before returning.
}

?
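
To make the owned/borrowed distinction concrete, here is a minimal Python sketch of the idea, not an actual binding. `FnResource`, `A`, `fire` and `shutdown` are hypothetical names invented for illustration, and nothing here mechanically enforces the borrow rule; the comments only mark where each rule would apply.

```python
from typing import Callable, List


class FnResource:
    """Hypothetical stand-in for a single-method resource wrapping a closure."""

    def __init__(self, fn: Callable):
        self._fn = fn
        self.dropped = False

    def call(self, *args):
        if self.dropped:
            raise RuntimeError("closure resource was already dropped")
        return self._fn(*args)

    def drop(self) -> None:
        # Release the closure state; any later call fails.
        self._fn = None
        self.dropped = True


class A:
    """Toy implementation of interface A from the example above."""

    def __init__(self):
        self.handlers: List[FnResource] = []

    def register_event_handler(self, handler: FnResource) -> None:
        # Owned handle: A keeps it and is responsible for dropping it later.
        self.handlers.append(handler)

    def map(self, items: List[str], mapper: FnResource) -> List[str]:
        # Borrowed handle: used only during this call, never stored.
        return [mapper.call(item) for item in items]

    def fire(self, event) -> None:
        for handler in self.handlers:
            handler.call(event)

    def shutdown(self) -> None:
        # A owns the registered handlers, so A drops them.
        for handler in self.handlers:
            handler.drop()
        self.handlers.clear()


seen = []
a = A()
on_event = FnResource(seen.append)
a.register_event_handler(on_event)  # ownership moves to `a`
a.fire("click")                     # still callable after registration returned

upper = FnResource(str.upper)
result = a.map(["x", "y"], upper)   # borrowed for the duration of map()
upper.drop()                        # the caller still owns it, so the caller drops it
a.shutdown()
```

In this sketch the only difference between the two parameters is who ends up calling `drop`, which is the same question the `own`/`borrow` split answers for ordinary resources.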


By (re/ab)using resource semantics, this would keep function "references" in the realm of acyclic dependencies a.k.a. "no global GC required", right?

badeend commented 7 months ago

Partially answering my own question:

lukewagner commented 7 months ago

The main challenge with first-class functions is the lifetime of the closure state. If a consumer component can import an interface exported by a producer component that contains a first-class function as a parameter, then the consumer's closure state ends up being "owned" by the producer component, which very quickly leads to producer/consumer ownership cycles of the type that I was just, coincidentally, describing yesterday here.
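
The ownership cycle can be illustrated with a small Python sketch. It relies on CPython's reference counting as a loose analogy for acyclic resource ownership, which is an assumption of this example rather than anything the component model specifies; `Component` and `build_cycle` are hypothetical names.

```python
import gc
import weakref


class Component:
    """Hypothetical component that owns whatever is in `held`."""

    def __init__(self, name):
        self.name = name
        self.held = []


def build_cycle():
    producer = Component("producer")
    consumer = Component("consumer")
    # The consumer hands the producer a closure over its own state...
    producer.held.append(lambda event: consumer.name)
    # ...while also owning something the producer exported.
    consumer.held.append(producer)
    return weakref.ref(consumer)


gc.disable()  # reference counting only, no cycle collector
try:
    probe = build_cycle()
    # Both components are unreachable now, yet the cycle keeps them alive.
    alive_without_gc = probe() is not None
    gc.collect()  # only a tracing pass over the whole heap frees them
    alive_after_gc = probe() is not None
finally:
    gc.enable()
```

With plain reference counting the pair is never reclaimed; across components, the "tracing pass" would have to be a global GC spanning both of them.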

That being said, based on the discussions near the end of this issue, there is a new idea for a new restricted form of callback, which is a callback that is scoped to a call (with the same rules as how a borrow handle is scoped to a single call). With this restriction, the lifetime cyclicity issues are resolved: the callback's closure state must definitely be kept alive for the duration of the call, after which the caller definitely owns it (thus there's no producer-to-consumer edge). And in a Preview 3 timeframe, with the addition of async functions where the lifetime of an (async) call can last a long time and overlap with other calls, scoped callbacks become even more flexible.
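
A call-scoped callback can be sketched in Python as a wrapper that is invalidated when the call returns. This is only an illustration of the rule described above, assuming a dynamic check; `ScopedCallback`, `call_with_scoped_callback`, `map_items` and `leaky_register` are hypothetical names.

```python
class ScopedCallback:
    """Hypothetical wrapper: callable only while the enclosing call is active."""

    def __init__(self, fn):
        self._fn = fn
        self._live = True

    def __call__(self, *args):
        if not self._live:
            raise RuntimeError("scoped callback used after its call returned")
        return self._fn(*args)

    def end_scope(self):
        # The caller regains sole ownership of the closure state.
        self._fn = None
        self._live = False


def call_with_scoped_callback(callee, fn, *args):
    """Invoke `callee` with a callback that dies when the call returns."""
    cb = ScopedCallback(fn)
    try:
        return callee(cb, *args)
    finally:
        cb.end_scope()


def map_items(mapper, items):
    # Well-behaved callee: uses the callback only during the call.
    return [mapper(item) for item in items]


stash = []


def leaky_register(handler):
    # Misbehaving callee: tries to keep the callback for later.
    stash.append(handler)


mapped = call_with_scoped_callback(map_items, str.upper, ["a", "b"])
call_with_scoped_callback(leaky_register, print)

try:
    stash[0]("late event")
    late_call_failed = False
except RuntimeError:
    late_call_failed = True
```

Because the wrapper drops its reference to the closure in `end_scope`, the callee can keep the dead wrapper around without keeping the caller's closure state alive.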

Based on this, we could say that the problem with general (unscoped) first-class functions is that they have indeterminate lifetime (because, agreed, reentrancy is not necessarily a problem if opted into). Also, if you squint, the indeterminacy of callback lifetimes is at the heart of the problems described here that motivate "structured concurrency".

badeend commented 7 months ago

The child-handles branch looks very interesting! I haven't thought it through that much, but it seems like a potential solution indeed.

At the same time, I'm having a hard time envisioning how pleasant the experience will be for developers authoring components in basically any language that isn't Rust. For many developers, own and borrow are already novel concepts. Throwing in a 2x3 matrix of scope/"lifetime" annotations in the mix too could be a bit much.

As an alternative for the cycle problem, maybe we could introduce a general purpose weak<$R> handle type? And leave it up to the implementing component to determine which specific lifetime scheme to use. It can even incorporate common (but more advanced) lifetime annotations in the long run:

| syntax | lifetime promise |
| --- | --- |
| `weak<$R>` | No mechanically enforced promise. Just RTFM. |
| `weak<$R, call>` | Weak ref is guaranteed to exist at least throughout the duration of the function call. |
| `weak<$R, parent p>` | Weak ref is guaranteed to exist at least as long as the resource of parameter `p`. |

The annotation-less variant is definitely not as rigorous as your proposal, though.
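
For a feel of what the annotation-less `weak<$R>` would mean for a guest, Python's standard `weakref` module behaves similarly. This is a rough analogy only; `Resource`, `WeakHandle` and `upgrade` are hypothetical names, and the immediate reclamation shown relies on CPython's reference counting.

```python
import weakref


class Resource:
    def __init__(self, name):
        self.name = name


class WeakHandle:
    """Hypothetical weak<$R>: does not keep the resource alive."""

    def __init__(self, resource):
        self._ref = weakref.ref(resource)

    def upgrade(self):
        """Return the resource if it still exists, else None."""
        return self._ref()


owner = {"conn": Resource("connection")}  # the only strong reference
weak = WeakHandle(owner["conn"])

name_while_alive = weak.upgrade().name
del owner["conn"]                          # the owner drops the resource
gone = weak.upgrade() is None              # the holder must handle absence
```

Every use of the handle has to account for the resource having disappeared, which is the "just RTFM" cost of the unannotated variant.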

lukewagner commented 7 months ago

> Throwing in a 2x3 matrix of scope/"lifetime" annotations in the mix too could be a bit much.

From a dynamic language perspective (or a statically-typed language that doesn't have a good way to express lifetimes in types), I think we can get away with just a single value that represents all handles, and reflect these different scopes as different dynamic rules for when a handle value is "live" and when it is "dead" and will throw (or otherwise fail) when accessed. This is already a common pattern in dynamic language APIs whenever "lifecycles" or dynamic ownership transfer is involved (e.g., see postMessage() with an ArrayBuffer in the transfer list). Thus, adding new scopes just adds more expressivity for describing (in a mechanizable manner) the different lifecycles that naturally arise in interfaces (e.g., see all occurrences of "child" in comments in wasi:http/types, which is a relationship that is currently weakly enforced since language bindings don't understand comments).
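
As a sketch of that "single handle value with dynamic liveness rules" pattern, the Python below uses one `Handle` class for every scope. The class, its methods, and the specific rule that dropping a parent kills its children are assumptions made for illustration, not the semantics of any proposal.

```python
class DeadHandleError(RuntimeError):
    pass


class Handle:
    """Hypothetical single handle type; scopes only change when it dies."""

    def __init__(self, value, parent=None):
        self._value = value
        self._live = True
        self._children = []
        if parent is not None:
            parent._require_live()
            parent._children.append(self)

    def _require_live(self):
        if not self._live:
            raise DeadHandleError("handle is no longer live")

    def get(self):
        self._require_live()
        return self._value

    def drop(self):
        # Assumed rule for this sketch: children die with their parent.
        for child in self._children:
            child.drop()
        self._children = []
        self._value = None
        self._live = False

    def transfer(self):
        """Move ownership to a fresh handle; this one becomes dead."""
        self._require_live()
        moved = Handle(self._value)
        moved._children, self._children = self._children, []
        self._value = None
        self._live = False
        return moved


request = Handle("request")
headers = Handle("headers", parent=request)  # child-scoped handle
request.drop()
try:
    headers.get()
    child_died = False
except DeadHandleError:
    child_died = True

buf = Handle(b"data")
moved = buf.transfer()  # loosely like a transferred ArrayBuffer
try:
    buf.get()
    source_died = False
except DeadHandleError:
    source_died = True
```

The calling code never names a scope in a type; it only observes that a handle it holds may fail once the rule attached to it says it is dead.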