WebAssembly / component-model

Repository for design and specification of the Component Model

Consider renaming `world` to `component` #274

Open vwkd opened 8 months ago

vwkd commented 8 months ago

In Wit, a "world" corresponds to a component. This took me too long to grasp, because "world" is a strange word for it. "World" has a notion of expanse, plurality, whole. Intuitively, it doesn't relate at all to component. In a sense, it means the exact opposite: whole vs part.

IIUC, the name "world" originated by taking the host perspective and looking at different hosts as different "worlds" that guest components can target. While sensible in this limited context, taking the perspective of the host at the boundary of the component universe comes at the cost of clarity for components that fill that universe.

It seems to me the host itself can be seen as a component that the guest component plugs into. This built-in host component does have one special exception: not all its exports need to be fulfilled (the guest component can import only some of them).

It seems easier to me to look at the host boundary as a special component, than to look at every component as a special host boundary.

Consider renaming world to component.

For example,

package myorg:http@0.0.1

// ...

component proxy {
    export incoming-handler
    import outgoing-handler
}

rylev commented 8 months ago

Chiming in to provide perspective that isn't from the spec authors.

I believe this proposal only makes sense when considered from the guest's perspective as the guest is itself a component. However, this doesn't work from the host's perspective since the host's world is a superset of any given guest's imports and a subset of the guest component's exports. In other words, from the host's perspective a world is all of the imports it is capable of supplying and all the exports it is capable of calling. Any given guest component will likely only have a subset of these imports and potentially a superset of the exports.
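As a rough illustration of the subset relationship described above, here is a minimal WIT sketch. The package, interface, and world names (`example:demo`, `logging`, `clock`, `handler`, `host-world`, `small-guest`) are hypothetical and not taken from any real host; the sketch only shows a host world listing everything the host can supply and call, next to a guest that uses part of it.

```wit
// Hypothetical package and interfaces, for illustration only.
package example:demo;

interface logging {
  log: func(msg: string);
}

interface clock {
  now: func() -> u64;
}

interface handler {
  handle: func(req: string) -> string;
}

// "All of the things I understand" from the host's perspective:
// every import the host can supply and every export it knows how to call.
world host-world {
  import logging;
  import clock;
  export handler;
}

// One particular guest: it imports only a subset of what the host offers
// (no clock) and still provides the export the host calls.
world small-guest {
  import logging;
  export handler;
}
```

A component described by `small-guest` would still fit a host implementing `host-world`, since its imports are a subset of what the host supplies and it provides the export the host expects.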

A world is "all of the things I understand" and from this perspective I think world is a sensible choice.

That being said, "world" is far from standard terminology, so its usage will present a substantial educational challenge. I am personally sympathetic to the desire to find terminology that is more immediately obvious.

lukewagner commented 8 months ago

Thanks for raising the question and the thoughtful points. I'm happy to discuss alternatives, since I've heard other folks point out that "world" confused them too (although, with Preview 2 almost done, I don't think we want to be making any major naming changes in the short term).

Agreed that "world" isn't a standard term at all. However, I'm not sure a standard term exists that describes both the exports and imports of a component and makes sense from (to Ryan's point) both the host and guest perspective; in most existing module systems, component models, service-oriented-architectures that I'm aware of, the "signature" of a {module, component, service} only ever covers the union of exports, thus it seems like we need a new term.

Personally, what I like about "world" is:

For reference, "profile" was the first idea, but it's taken now and sounds painfully generic in any case.

yordis commented 8 months ago

I'd like to share my perspective as someone for whom English is a second language.

My first language is Spanish, and despite having spoken English for over a decade, I still speak with an accent and occasionally struggle with pronunciation of "world".

I'd rather see "Planet" 😄 Anything that is much easier to say and type. I do not know how many times I typed wordl ... in my career.

aschrijver commented 8 months ago

Realm?

vwkd commented 8 months ago

I appreciate the constructive feedback and the highlighting of the host perspective, which the initial post mentioned only briefly.

It seems to me the host itself can be seen as a component that the guest component plugs into. This built-in host component does have one special exception: not all its exports need to be fulfilled (the guest component can import only some of them).

It seems easier to me to look at the host boundary as a special component, than to look at every component as a special host boundary.

That said, "X world" could still be a useful synonym to refer to different built-in host components, particularly when a host supports multiple.

rossberg commented 8 months ago

Chiming in as well, I'm also not super-happy with the terminology choices made by wit:

I wish this could still be consolidated. It's the one thing where the component model went wrong IMHO.

@vwkd:

This built-in host component does have one special exception: not all its exports need to be fulfilled (the guest component can import only some of them).

AFAICS, this is just the same subtyping that applies to component types as well, so no difference there.

lukewagner commented 8 months ago

Thanks Andreas, that's useful perspective. "Signature" isn't a terrible alternative to "world" although it does lose the "describing a set of hosts" connotation that "world" has and seems to be one of the main uses of worlds in practice (e.g., in WASI). I'd be curious to hear what other folks who've been using WIT in practice for a while now think of "signature". Certainly when talking about a specific component, it's more natural to say "what's the signature of that component?" than "what's the world targeted by that component?".

I do disagree on the topic of interface, though: I think our use of interface matches the common use and expectations and doesn't seem to cause any ambiguity in practice that I've seen. Also, there's not a clear "thing" in the Component Model that a WIT interface corresponds to; it varies:

  1. When an interface is defined in the abstract, the interface gets compiled to a component type that imports all the used types and then aliases these to define a nested instance type. So in this context an interface is a component type, but not in a way that describes a concrete component.
  2. When an interface is imported or exported by a world, the uses get resolved to aliases of preceding concrete imports and so the interface turns directly into an instance type (no wrapping component type).

Thus, depending on context, an interface turns into either a component type or an instance type, so neither of those names are appropriate. Moreover, a "signature" doesn't seem right either b/c it doesn't describe the complete signature of anything, only a fragment of a real component's complete signature (which has both imports and exports). And "interfaces" being fragments of a component's complete signature matches older component models and also the interface keyword in various mainstream programming languages. So while I get what you're saying in the abstract, in practice, afaics, I think our use of interface in WIT matches precedent and expectations; or at least I can't think of a better alternative that doesn't confuse more than it clarifies.

rossberg commented 8 months ago

Certainly when talking about a specific component, it's more natural to say "what's the signature of that component?" than "what's the world targeted by that component?".

Yes, in particular, since a component signature does not, in fact, describe "the world targeted by it", but the exact inverse of that, doesn't it? That is, its imports are the required exports of the world it "targets".

I think our use of interface matches the common use and expectations and doesn't seem to cause any ambiguity in practice that I've seen.

I'm curious: where else is it used that way? People speak of modules having interfaces. Wikipedia defines an interface as the boundary between subsystems, not as a subsystem. In the context of modular programming, it explains that an interface "expresses the elements provided and required by a module". And that's also the common informal usage AFAICT.

Thus, depending on context, an interface turns into either a component type or an instance type, so neither of those names are appropriate.

Yes, and I was almost mentioning that as another feature that further exacerbates the problem! Context-dependent meaning is not a great starting point for a future-proof design, especially not in a higher-order setting like the underlying component model.

For example, one simple feature missing from wit is the ability to name the signature of an instance. Using the underlying component model terminology, I want to be able to write

instance-type I {
   ...  // list of things
}

component IAdaptor {
   import instance In : I
   export instance Out : I
}

There are various patterns in modular programming where you need the same module signatures in multiple places, so I am rather surprised that wit does not support this currently, but requires repeating all of I each time.

Now, the problem is, with the current conflation of types vs definitions, it isn't even clear how you would add this basic feature. If "interface" already is the instance, how would you call and define an "interface"'s interface? And how would you reconcile such an extension with the ambiguous semantic overloading of interface declarations?

lukewagner commented 8 months ago

I'm curious: where else is it used that way?

In common component systems, "interfaces" are required or provided by a component. In the classic UML diagrams you see everywhere (e.g. here), they show up as these little lollipops that stick out of the component but they're not considered the interface "of" any particular component -- they're distinct entities required or provided by a component. And in OOPLs, of course there is the interface keyword which, while semantically a different thing (implying virtual dispatch etc), fills the same basic role of "that thing you write to define a named collection of functionality".

Using the underlying component model terminology, I want to be able to write

instance-type I {
   ...  // list of things
}
component IAdaptor {
   import instance In : I
   export instance Out : I
}

So if you write this today:

package ns:p;
interface i {
  foo: func();
}
world w {
  import i;
  export i;
}

this will resolve to the following component type:

(type $w (component
  (import "ns:p/i" (instance (export "foo" (func))))
  (export "ns:p/i" (instance (export "foo" (func))))
))

and that's what you want by default when doing anything with (WASI) standardized or vendor-specific interfaces where the interface name (ns:p/i in this case) is significant (i.e., baked into the host).

Now let's say you want to achieve the following component type using "plain names" that are arbitrary (i.e., not baked into the host and thus manually supplied by the client, say via WebAssembly.instantiate(importObject)):

(type $w (component
  (import "whatever" (instance (export "foo" (func))))
  (export "luke" (instance (export "foo" (func))))
))

Now you can write this in WIT which will resolve to the above component type:

world w {
  import whatever: interface { foo: func() }
  export luke: interface { foo: func() }
}

but now, to your original point, we're forced to duplicate the interface block. This doesn't seem like a huge problem in practice (most imports are named interfaces so you want the interface name, as above) so we haven't yet added support for how to do this, but one can imagine various WIT extensions that would allow interface { foo: func() } to be defined out-of-line once and then reused inline twice.

squillace commented 8 months ago

Chiming in from the old "COM" world of interfaces in a different century. Without addressing the question of the format and the abilities WIT has, I do want to weigh in on some history that definitely should NOT guide anyone directly but which may be taken as useful. OLE/COM always treated the component as the implementation of a (set of) interfaces that were described by IDL or by ITypeLib (a compilation of that to an encapsulated format recognized by Visual Basic). As such, either side of an implementation was a "component" and no one was required to describe what a "collection of interface impls" was in the sense of "world". A component was what it implemented, and it need not (per Luke's comments above) be restricted to any particular "grouping" even the grouping of the caller (what we would have called the "guest" component).

In Microsoft language, which again no one should be taking seriously a priori, we needed words to explain "standard groupings" of interfaces, especially for use with things like .NET and Visual Studio code, because there intelligibility at first glance was the critical concept. We used "profile" for these things, so for example there was a "micro" profile and a "server" profile and so on, which described a collection of standard capabilities, essentially. Now I don't know if "profile" is a better or worse grouping name; but I do submit that while I wouldn't habitually use "world", functionally it signals the same set of meanings for me -- once I adjust. There are many worlds out there, and each is unique in its own way, yet we might usefully describe one type of world as a "Class M planet", to use a reference from science fiction.

Not sure if this helps or hinders the original discussion; it doesn't bear I don't think on the "instance" portion of the comments.

Mossaka commented 7 months ago

Based on the discussion, I slightly modified the definition of "World" in WIT.md. Instead of saying a world is a complete description of imports and exports of a component, a world actually encapsulates the full spectrum of interaction capabilities of a component. The crucial difference is that a component may have more or fewer imports/exports, but its interaction capabilities are defined by the world it targets. #283

rossberg commented 7 months ago

In common component systems, "interfaces" are required or provided by a component. In the classic UML diagrams you see everywhere (e.g. here), they show up as these little lollipops that stick out of the component but they're not considered the interface "of" any particular component

Isn't that just a difference in wording? An interface "provided" by a component is the interface "of" the component, so I don't think this contradicts what I said. What's conceptually more relevant is that UML maintains that modules/components and their interfaces are separate entities and distinguished notions. Wit is the outlier in that regard, AFAICT.

So if you write this today: [...]

Hm, to be honest, this makes it even more confusing and the role of interface declarations even more overloaded/ambiguous than what I reckoned.

In the component model, the name/source of an ex/import vs its interface are independent. That is natural and makes total sense, as far as I'm concerned – like a record field or function parameter vs its type. The wit syntax on the other hand appears to mangle these aspects together in a seemingly ad-hoc and special-cased fashion that I struggle to find a useful intuition for.

Why the impedance mismatch with the underlying component model, and how is it not counter-productive?

If it at least was merely sugar over a canonical notation that also existed, then one could more easily see how everything hangs together. But not if it is the only notation.

lukewagner commented 7 months ago

What's conceptually more relevant is that UML maintains that modules/components and their interfaces are separate entities and distinguished notions. Wit is the outlier in that regard, AFAICT.

WIT and UML seem quite aligned on distinguishing components and interfaces as separate entities, so I don't see how WIT is an outlier.

If it at least was merely sugar over a canonical notation that also existed, then one could more easily see how everything hangs together. But not if it is the only notation.

The current state of WIT is a subset of the final state of WIT in that WIT cannot express every possible component type so clearly more has to be added to WIT in the future and, in the future, we may indeed think of the current syntax as being sugar for something more primitive. In particular, a WIT "interface" definition logically contains both an interfacename (a string) and a component type (a structural type which can be converted to an instance type by substitution of use parameters with what they resolved to in a particular world), so maybe interface is indeed sugar for 2 more-basic WIT features in the future. But interfaces seem (thus far) to be what you mostly want to define and use and talk about in practice, so I think it makes sense to lead with them.

ggoodman commented 6 months ago

As a newcomer to WASM Components and the WIT IDL, this post is my attempt at articulating what I learned and what I think about it.

I usually like to start at use-cases and then work backwards towards implementations. To that end, here are the different use-cases for the WIT IDL that I came up with. I've tried to express these in a mutually-exclusive and collectively exhaustive (MECE) way.

WASM Component use-cases

  1. As a host, what capabilities do I offer to guests?
  2. As a host, what requirements do I impose on guests?
  3. As a guest, what requirements must my host fulfill?
  4. As a guest, what capabilities do I expose to the host (and possibly to other guests)?

Now, you'll note that I didn't mention Component or Runtime above. It's my understanding that a Runtime will always be acting as a host and will never be a guest. On the other hand, Components may act as a host, but will always be guests. This dual nature of Components is what gives them such fascinating compositional properties.

I also didn't make any references to the WIT IDL syntax or constructs. I'm trying to understand the use-cases in the abstract before thinking about how they're expressed in the IDL.

Contracts and Capabilities

Given what we've explored above, I think it is helpful to note that a Runtime implicitly must express a Guest Contract. A runtime can't run a Component without an agreed-upon entrypoint / reactor.

A Component may express a Host Contract if it requires functionality from its host. It may express a Guest Contract if it is capable of acting as a host for guest Components. That being said, it's perfectly reasonable for a Component to simply target a host's Host Contract without having its own Host or Guest Contracts. A hypothetical example of this is given below.

But while the Host Contract and Guest Contract help organize our thoughts about vertical composition, I think they are missing the larger ecosystem concern. We are still missing a concept to express the fourth (4) use-case from our original list, the Guest Capabilities.

I think it's reasonable to expect folks to start building out an ecosystem of re-usable WASM Components that provide specific capabilities. In that way, we can think of such Components as traditional libraries / modules / packages / etc. These Components encapsulate some logic and expose that through their API as Exported Capabilities. This is functionality that is being put out there for anyone to consume, with no requirements being made. There is no contract. They are provided as is and it is up to consumers to decide how they should be wired up. They may, of course, have a Host Contract if they require capabilities from their host or have a Guest Contract if they can act as a host for other Components.

So with that, I think we've got our third required concept, bringing us to the following list:

Wrap up

So with that huge wall of text out of the way, what does this say about the current WIT IDL? I think my main take-away is that the world paradigm is trying to express all three of the above concepts simultaneously and ends up not quite nailing it on any of them.

I'm sharing this in hopes that it provides a different way of thinking about the problem and opens up new angles for refining the WIT IDL.

sunfishcode commented 5 months ago

@yordis

My first language is Spanish, and despite having spoken English for over a decade, I still speak with an accent and occasionally struggle with pronunciation of "world".

I'd rather see "Planet" 😄 Anything that is much easier to say and type. I do not know how many times I typed wordl ... in my career.

This difficulty is unfortunate; however, the word "world" does more vividly convey a virtual nature than the word "planet".

Interestingly, Spanish has a similar distinction, between "mundo" and "planeta". For example, the phrase "todo el mundo" translates to English literally as "everyone in the world", but it's common for people to mean just "everyone", in some implied context. In contrast, the phrase "todo el planeta" tends to literally mean "everyone on the planet".

@ggoodman

Thanks for the extensive write-up! The use-case driven approach is helpful; I find it illuminating to look closely at the use cases:

WASM Component use-cases

  1. As a host, what capabilities do I offer to guests?
  2. As a host, what requirements do I impose on guests?
  3. As a guest, what requirements must my host fulfill?
  4. As a guest, what capabilities do I expose to the host (and possibly to other guests)?

This is a good separation of the space into exclusive concerns, however I don't think it completely captures the relationships in play. Compare that list with this one:

  1. As a host developer, what guest requirements do I need to be able to satisfy, to support a given ecosystem of guests?
  2. As a host developer, what host requirements do I need to limit myself to, to support a given ecosystem of guests?
  3. As a guest developer, what guest requirements do I need to be able to uphold, to ensure my code will run on a given family of hosts?
  4. As a guest developer, what host requirements do I need to limit myself to, to ensure my code will run on a given family of hosts?

When we think about portable ecosystems, with families of hosts that can all run ecosystems of guests, and guests that can all run on all these hosts, these are the shapes that emerge.

As a guest developer, if I want to run my code on some hosts, it's my responsibility to meet two contracts at the same time: I must provide all the exports that the hosts need, and I must limit my imports to what the hosts can support. If I neglect either, my code won't run.

And similarly, as a host developer, if I want to be able to support an ecosystem of guests, it's my responsibility to meet two contracts at the same time: I must provide the exports that guests collectively need, and I must also limit my dependencies on guests' exports to the set that guests in the ecosystem actually provide. If I neglect either, I won't support the ecosystem.

Worlds represent these pairings of contracts.

  • Exported Capabilities The set of capabilities a Component offers to consumers with no assertions made about how they are used. An example of this might be a pseudo-random number generator Component. It might expose a seed() method that may (or may not) be called and a random() method to get the next pseudo-random number.

A pseudo-random number generator world for that component would have seed and random exports, and (I assume) no imports. Perhaps that world is only ever used to describe that one component, in this spirit of Exported Capabilities. Perhaps the developer doesn't even want to write a Wit file for it, and instead just has tooling infer the world from the code. This world doesn't need to have any relationship with the world of the host that the full application will run in. There are no assertions about how the component may be used.
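For concreteness, such a world might look roughly like the following WIT sketch. The package name `example:prng`, the world name, and the parameter and result types (`u64`) are assumptions made for illustration; only the `seed` and `random` export names come from the description above.

```wit
// Hypothetical package name; not part of any published WIT package.
package example:prng;

// A world with exports only and no imports, describing a single
// library-style component.
world prng {
  // Optionally called by a consumer to seed the generator (type assumed).
  export seed: func(value: u64);
  // Returns the next pseudo-random number (type assumed).
  export random: func() -> u64;
}
```

Nothing in this world refers to the host the full application eventually runs in, which matches the point that it makes no assertions about how the component is used.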

ggoodman commented 5 months ago

Hi @sunfishcode, thanks for the great response. I think your take on the use-cases is probably a better representation of where the community and tech is going. I think that finding this MECE set of use-cases could be a pretty powerful tool for onboarding developers into the ecosystem.

I would imagine that--if we successfully find the exhaustive set--a developer would see themselves in one or more of the use-cases. Having answers for what a world is in the context of those use-cases can really help with making sense of it.

I've taken a spin through wit-bindgen and I think that it is an interesting case study. Specifically, when using a tool like that, it seems like it is hard to express the intent of acting as a host vs acting as a guest (or both). Also, it forces the creation of these one-off disposable worlds for library-style guests. That makes me wonder if the tool might be strictly adapted to the different use-cases or whether it is the input to the tool.

sunfishcode commented 5 months ago

@ggoodman Indeed, wit-bindgen is a low-level tool, and several projects are adapting it for developer-oriented use cases, such as cargo component, jco, componentize-py, and more to come.