WebAssembly / component-model

Repository for design and specification of the Component Model

Proposal: Union of Worlds #169

Closed Mossaka closed 1 year ago

Mossaka commented 1 year ago

I want to raise this issue to propose a new syntax for the union of Worlds. The primary motivation is that we want to be able to form a more comprehensive world by unioning / combining multiple worlds into a bigger one. Concretely, this is motivated by the wasi-cloud-core proposal, which tries to define a new World that includes all the wasi-cloud worlds so as to provide a feature set that allows a serverless / edge function to do state management or message exchange.

Proposal

This proposal adds a new syntax, include <world-path>, to World definitions. Below are a few examples/sketches showing how the include syntax would work.

Union of two Worlds

// worlds.wit
world my-world-1 {
    import a: self.a
    import b: self.b
    export c: self.c
}

world my-world-2 {
    import foo: self.foo
    import bar: self.bar
    export baz: self.baz
}

world union-my-world {
    include self.my-world-1
    include self.my-world-2
}

is equivalent to

world union-my-world {
    import a: self.a
    import b: self.b
    export c: self.c
    import foo: self.foo
    import bar: self.bar
    export baz: self.baz
}

Include a World that has package and inline imports

Notice that the inline export has been "copied" over to the World that includes this World.

// b.wit
interface b { ... }

// a.wit
interface a { ... }

world my-world-1 {
    import a: self.a 
    import b: pkg.b
    import c: io.c // external package
    export d: interface exp { ... }
}

// union.wit

world union-my-world-1 {
    include pkg.my-world-1
}

is equivalent to

world union-my-world-1 {
    import a: pkg.a
    import b: pkg.b
    import c: io.c
    export d: interface exp { ... }
}

Name Conflicts

This is a more challenging example where the two Worlds being included have name conflicts. The solution is to use the with syntax and aliasing to rename the conflicting names. The with syntax is only needed for names that conflict with a name in another World. Names that are not in conflict stay the same and need not be listed in the with scope.

// my-world-1.wit
world my-world-1 {
    import a: self.a
    import b: self.b
    export d: self.d
}

// my-world-2.wit
world my-world-2 {
    import a: self.a
    import b: self.b
    export c: self.c
}

// union.wit
world union-my-world {
    include pkg.my-world-1 with { a as a1, b as b1 }
    include pkg.my-world-2
}

is equivalent to:

world union-my-world {
    // resolve conflicts
    import a1: pkg.my-world-1.a
    import b1: pkg.my-world-1.b
    export d: pkg.my-world-1.d

    import a: pkg.my-world-2.a
    import b: pkg.my-world-2.b
    export c: pkg.my-world-2.c
}

De-duplication

When imports or exports from the Worlds being unioned have the same definitions, they are de-duplicated.

// a.wit
// b.wit
// c.wit

// my-world-1.wit
world my-world-1 {
    import a: pkg.a
    import b: pkg.b
    export c: pkg.c
}

// my-world-2.wit
world my-world-2 {
    import a: pkg.a
    import b: pkg.b
    export c: pkg.c
}

// union.wit
world union-my-world {
    include pkg.my-world-1
    include pkg.my-world-2
}

is equivalent to 

world union-my-world {
    import a: pkg.a
    import b: pkg.b
    export c: pkg.c
}
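
The merge rules shown in the examples above (rename via with, de-duplicate identical definitions, otherwise report a conflict) can be modeled in a few lines. This is a minimal sketch in Python, not how any WIT tooling implements include. The names include and NameConflict, and the (direction, target) tuples used to represent items, are hypothetical and chosen only for illustration.

```python
class NameConflict(Exception):
    """Two included items share a name but have different definitions."""


def include(target, world, renames=None):
    """Merge `world` into `target`, applying `with { old as new }` renames.

    Both arguments map an item name to a (direction, definition) tuple.
    This representation is an assumption made for the sketch.
    """
    renames = renames or {}
    for name, item in world.items():
        new_name = renames.get(name, name)
        existing = target.get(new_name)
        if existing is None:
            target[new_name] = item
        elif existing != item:
            raise NameConflict(new_name)
        # Same name and same definition: keep the single existing copy.
    return target


# "Name Conflicts" example: `a` and `b` differ between the two worlds.
my_world_1 = {
    "a": ("import", "my-world-1.a"),
    "b": ("import", "my-world-1.b"),
    "d": ("export", "my-world-1.d"),
}
my_world_2 = {
    "a": ("import", "my-world-2.a"),
    "b": ("import", "my-world-2.b"),
    "c": ("export", "my-world-2.c"),
}
renamed_union = include({}, my_world_1, {"a": "a1", "b": "b1"})
include(renamed_union, my_world_2)

# "De-duplication" example: both worlds refer to the same definitions.
shared = {
    "a": ("import", "pkg.a"),
    "b": ("import", "pkg.b"),
    "c": ("export", "pkg.c"),
}
deduped_union = include(include({}, shared), dict(shared))
```

In this model, omitting the renames in the first example raises NameConflict for a, which mirrors why the with clause is needed there.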

Let me know your thoughts on this design! @lukewagner @fibonacci1729 @alexcrichton

alexcrichton commented 1 year ago

What you've outlined looks all good to me, although one more case to consider is type definitions in worlds:

world foo {
    type a = u32;
}

world bar {
    type a = u32;
}

world baz {
    include self.foo;
    include self.bar; // should this error?
}

I think the answer is "yes" it should error which would require the disambiguation syntax you've proposed to resolve.

I also think that https://github.com/WebAssembly/component-model/pull/164 makes the disambiguation syntax nicer here since there's only one namespace to disambiguate and you don't have to specify either imports or exports.

lukewagner commented 1 year ago

@Mossaka Really nice writeup, thanks! This generally looks great to me too.

@alexcrichton I might be missing something here but, in the exact case you give where both the name and structural type are the same, what's the problem with saying "these are the same" and de-duping, so that there is no error? To my understanding, this is analogous to the de-duping of two imports of the same interface.

There is an additional subtlety concerning exports. If we think of include as meaning "I'd like to define a new world that is able to run all components targeting the included world and more", then, if we define include as simply unioning the exports (symmetric to how you've shown include as unioning the imports), then a component targeting the included world won't be a subtype of the unioned world (b/c it won't have any exports that were subsequently added). To properly solve this, we need optional exports, which would let us say: if an export isn't present in all included worlds, it is optional. With this definition of include, a component targeting an included world would indeed be a subtype of the union world. In the short term, I expect we could get away with simply defining include to take the union of exports (just like imports) and having host bindings implicitly interpret all exports as optional. Eventually this stopgap will break down in more-advanced composition scenarios, but I'm hoping those don't arise in the Preview2 timeframe so we don't have to add optional quite yet.

(Btw: lest the asymmetry bother anyone: if we think of the dual operation of world "intersections" (e.g., if I want to build a single component that can successfully run in multiple unrelated worlds), then the above rules just get flipped between imports/exports: imports are optional if not in all worlds; exports are simply unioned.)
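
The subtyping argument above can be checked with a small name-only model. The following Python sketch is an illustration under simplifying assumptions: worlds are reduced to sets of import and export names, can_target is a stand-in for real component subtyping, and all function names (union_worlds, intersect_worlds, can_target) are hypothetical.

```python
def union_worlds(worlds):
    """Imports are unioned; an export missing from any included world
    becomes optional. Assumes a non-empty list of worlds."""
    imports = set().union(*(w["imports"] for w in worlds))
    every_export = set().union(*(w["exports"] for w in worlds))
    required = set.intersection(*(set(w["exports"]) for w in worlds))
    return {
        "imports": imports,
        "exports": required,
        "optional_exports": every_export - required,
    }


def intersect_worlds(worlds):
    """The dual operation: imports missing from any world become optional;
    exports are simply unioned. Assumes a non-empty list of worlds."""
    every_import = set().union(*(w["imports"] for w in worlds))
    required = set.intersection(*(set(w["imports"]) for w in worlds))
    return {
        "imports": required,
        "optional_imports": every_import - required,
        "exports": set().union(*(w["exports"] for w in worlds)),
    }


def can_target(component, world):
    """Name-only stand-in for subtyping: the world supplies every import the
    component needs, and the component provides every required export."""
    return (component["imports"] <= world["imports"]
            and world["exports"] <= component["exports"])


world_1 = {"imports": {"a", "b"}, "exports": {"c"}}
world_2 = {"imports": {"foo", "bar"}, "exports": {"baz"}}

union = union_worlds([world_1, world_2])
intersection = intersect_worlds([world_1, world_2])

# Naive alternative: every export of every included world is required.
naive = {"imports": union["imports"], "exports": {"c", "baz"}}

# A component written against world_1 only.
component = {"imports": {"a", "b"}, "exports": {"c"}}

fits_union = can_target(component, union)  # optional exports may be absent
fits_naive = can_target(component, naive)  # "baz" is required but missing
```

Under this model the component for world_1 fits the union only when the exports it lacks are treated as optional, which is the stopgap behavior described for host bindings.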

Mossaka commented 1 year ago

Thank you, @alexcrichton and @lukewagner for your comments!

> I also think that https://github.com/WebAssembly/component-model/pull/164 makes the disambiguation syntax nicer here since there's only one namespace to disambiguate and you don't have to specify either imports or exports.

This is really nice!

> To my understanding, this is analogous to the de-duping of two imports of the same interface.

I was thinking the same. Maybe it makes more sense for the following type definitions in worlds, which share a name but differ in structure, to be resolved using with:

world foo {
    type a = u64;
}

world bar {
    type a = u32;
}

world baz {
    include self.foo with { a as a1 };
    include self.bar;
}

> a component targeting the included world won't be a subtype of the unioned world

Ah right, it will break the subtyping rule if we are not careful about union-ing exports. Good point! I agree that for now we can let the host implicitly treat all exports as optional.

alexcrichton commented 1 year ago

That's true yeah, although the subtyping/equality check is currently only present in the wasm binary validator rather than the WIT parser phase, so I'd be tempted to say that to be conservative as a starting point types are required to be disambiguated and perhaps in the future we can relax the rule for structurally equivalent types.

Otherwise though @Mossaka yeah I think the syntax works the same with similar semantics, it's just that types are something to consider in addition to named interfaces/functions.

Mossaka commented 1 year ago

> so I'd be tempted to say that to be conservative as a starting point types are required to be disambiguated and perhaps in the future we can relax the rule for structurally equivalent types.

That makes sense. For now, let's disambiguate type defs even when they are structurally equivalent. In that case, does my de-duplication section still make sense? It might need to check whether the duplicated types have the same structure, which amounts to a subtyping/equality check in the WIT parser phase.

alexcrichton commented 1 year ago

Yeah, what you mentioned makes sense to me, which is to always require explicit renaming on conflicts.
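
As a rough illustration of this conservative starting point for type definitions, the Python sketch below rejects any name clash without comparing structures, so no equality check is needed at parse time. It is a sketch under assumptions, not the WIT parser's behavior: include_types and NameClash are hypothetical names, and types are modeled as plain strings.

```python
class NameClash(Exception):
    """A type name is already defined in the including world."""


def include_types(target, world, renames=None):
    """Copy type definitions from `world` into `target`.

    Any name clash is an error, even when the definitions are identical;
    the caller must rename via `renames` (the `with { old as new }` clause).
    """
    renames = renames or {}
    for name, definition in world.items():
        new_name = renames.get(name, name)
        if new_name in target:
            raise NameClash(new_name)
        target[new_name] = definition
    return target


foo = {"a": "u32"}
bar = {"a": "u32"}

# Without a rename, the identical definitions are still rejected.
try:
    include_types(include_types({}, foo), bar)
    rejected = False
except NameClash:
    rejected = True

# With `with { a as a1 }` on the first include, both types are kept.
baz = include_types(include_types({}, foo, {"a": "a1"}), bar)
```

Relaxing the rule later for structurally equivalent types would mean replacing the unconditional raise with a comparison of the two definitions.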

lukewagner commented 1 year ago

I think this is resolved by #174 (but please reopen if there is still more to the proposal to be done).