Closed peterhuene closed 1 year ago
I've opened this issue to both document and discuss how best to integrate cargo-component
with a component registry, which I've already begun a refactoring to implement.
I'm open to any feedback regarding this design, especially feedback that improves the DX of cargo-component
for Rust developers implementing WebAssembly components (this is my primary concern).
After a little bike-shedding in this issue, I'll turn this into a PR for proper review.
It might be worthwhile to update or refer to the glossary over in https://github.com/bytecodealliance/SIG-Registries/blob/main/glossary.md
Agreed! I think it'd be best to bike shed here a little before pushing a glossary update upstream (if necessary), but I will definitely do so.
There's certainly plenty to dig in on. Nice write-up! The addition of component type lines up extremely well with the proposed world and interface types.
Could wit-bindgen take a component and generate bindings only from that component? I think this would essentially require versions to be embedded in the component. If that were the case, would this change your design?
Something we can do with components but very few other languages can do, is to (potentially) have the ability to supply multiple versions of a dependency. It might be worth sketching out how we think that will work here (but not necessarily implement it for the MVP).
Could wit-bindgen take a component and generate bindings only from that component? I think this would essentially require versions to be embedded in the component. If that were the case, would this change your design?
I'm not entirely sure I understand; if you mean when implementing component B and you depend on component A, then yes, it should generate bindings directly from component A's type information for building component B.
A may have its imports and exports referencing other packages in the registry (or another one) via their url fields (these are versioned in some fashion; additionally, I believe Luke recently proposed specifying version information in the imports/exports themselves), so it should be possible for bindings generation to be transitive without having to specify dependencies of dependencies.
Does that make sense or did I completely misunderstand your question?
One part I might bikeshed a bit is the usage of component as that's what a world is intended to be used for, and I think could work in this situation? For example I'm envisioning something like:
[package.metadata.component]
implements = "fastly.compute-at-edge"
[package.metadata.component.dependencies]
fastly = "0.1"
wasi = "1.0"
where the implements directive points to a world, in this case the package fastly would have a document compute-at-edge.wit which would have a default world something-or-other { ... } which would be used to implement this component. The fastly package would be consulted relative to dependencies. Similarly a local version could be done with:
[package.metadata.component]
implements = "my-package.my-world"
[package.metadata.component.dependencies]
fastly = "1.0"
my-package = { path = "wit" }
where you would then write:
// wit/my-world.wit
default world the-world-i-am-implementing {
include fastly.compute-at-edge // hypothetical syntax not currently in the WIT spec yet
// ...
}
This would allow avoiding component.wit entirely if the world is defined on the registry (which many likely will be) and additionally allows doing your own local thing if you'd like too.
I like that as it allows for pointing at someone else's world and using it as is without having to author a wit file at all.
cargo component new would probably still spit out a wit file to define the component's world to implement by default, but I imagine adding an --implements option to point it at a world from the registry.
And if they want to define a custom world based on other worlds / interfaces, they'd use the hybrid approach in your example.
I'll update the design shortly.
I'll add that we probably can't support the foo = "1.2.3" syntax for dependencies (à la crates.io) as it's intended for the packages to be namespaced in a component registry.
Perhaps a shorthand of <name> = "<package-id>:<version>" which is semantically equivalent to <name> = { version = "<version>", package = "<package-id>" }?
Edit: updated the design to use the shorthand form.
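To make the equivalence between the shorthand and the general form concrete, here is a minimal Rust sketch. The `RegistryDependency` type and `parse_shorthand` function are hypothetical names for illustration and are not taken from cargo-component; splitting on the last `:` is also an assumption, since the thread does not say which characters a package id or version may contain.

```rust
/// Hypothetical representation of the general form:
/// `<name> = { version = "<version>", package = "<package-id>" }`.
#[derive(Debug, PartialEq)]
struct RegistryDependency {
    package: String,
    version: String,
}

/// Parses the shorthand `"<package-id>:<version>"`.
/// Assumption: the version follows the last `:` in the string.
fn parse_shorthand(value: &str) -> Option<RegistryDependency> {
    let (package, version) = value.rsplit_once(':')?;
    // Reject inputs such as ":1.0.0" or "my-org/formatter:".
    if package.is_empty() || version.is_empty() {
        return None;
    }
    Some(RegistryDependency {
        package: package.to_string(),
        version: version.to_string(),
    })
}

fn main() {
    // Both spellings would describe the same dependency.
    let dep = parse_shorthand("my-org/formatter:3.2.1");
    println!("{:?}", dep);
}
```

With this shape, tooling could normalize every dependency entry to the general form before resolving it, so the shorthand stays purely a convenience for the author.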
So I was thinking about this more when walking my dog. The problem with not having a wit file in some cases is that I would imagine that nearly all of the time users will, in fact, need one.
Say for example I'm authoring a component targeting a particular world (e.g. fastly.compute-at-edge). This is fine as it informs the bindings for what the component can import and what it must export and, if the component being implemented is self-contained and doesn't have any additional exports, that's all it would need.
But now I, as the component author, want to make use of another component (which hopefully in a thriving component ecosystem is commonplace). To do so, I now need to know to create a wit file, point Cargo.toml at a world within it, and not forget to include the original world being implemented before adding an import for the component dependency into the world.
That seems complicated (albeit perhaps automatable by cargo component add). In whatever design we approach, I definitely want to make depending on another component as simple as possible.
Perhaps, as a possible solution, if the world being implemented isn't from a local path, any dependency on a component package translates to an import merged in with the world being implemented? Is that too magical or no?
Another thing to consider is that a local wit definition will also be desired for describing additional exports (like the greet function in my example above) from the component being implemented.
I still think there is something to the clear demarcation where version requirements go in Cargo.toml and the component type is described by wit alone, but I'm not married to it.
I think though component dependencies may work out differently? I think of a world as "I can't work unless you give me this and I give you that" whereas a component dependency is lower level where it's an internal implementation detail that possibly could be bundled within the component itself (or imported for a registry scenario). In that sense would component dependencies necessarily show up in WIT files?
I don't view component dependencies as internal implementation details when authoring a component at all.
From the perspective of the tooling that produces component packages (e.g. cargo-component), where all imports and exports (excepting the "default" export) of interfaces from the component are in terms of instances for maximal composability, there needs to be a way to describe more imports or fewer exports than what an implemented world requires.
It's not the domain of a tool like cargo-component to produce a component that is a subtype of a particular world. It should be possible to produce a component package that imports (even a subset of) interfaces from a world and from a component dependency with that dependency represented as an instance import in the authored component's type.
To me, it's the domain of a composition tool to resolve the abstract (instance) imports to either:
url.
Similarly with exports, it should be possible for a composition tool to specify what gets exported from the resulting component, allowing a target world to be satisfied by an export coming from one internal instance and another export coming from a different internal instance.
Composition is where the fun stuff happens. That's the tool that would enforce that the resulting component adheres to a target world, if used, ensuring the resulting component type is a subtype of the world.
These composed components could, of course, also be published to a registry, expressing their dependencies on other components as component imports (thereby "locking" the composition to that particular implementation).
My understanding of this might be off, but I think we would want to produce component packages from the language tooling as abstract as possible while also expressing dependencies on other components directly (via instance imports).
To put this all another way, I want the language-specific component tooling to be able to express:
An import of an interface from a definition package; this is when the component author isn't concerned with the implementation of the import at all and will provide no hints to anyone wishing to compose with this component later on as to what implementation to use for this dependency.
An import of an interface from a component package; this is when the component author still would like an instance import to enable linking against an alternative, subtype-compatible implementation, but is providing a hint to anyone wishing to compose with this component later on as to what implementation to use to satisfy the import by default.
This is mostly why I omitted "worlds" from the design above as it felt like, to me, that this particular tooling isn't as concerned with worlds other than perhaps as a shorthand for explicitly importing and exporting the interfaces specified by a world, e.g.:
component {
include wasi.command // in theory a world describing commands
import foo: bar
greet: func(string)
}
vs.
component {
import wasi-fs: wasi.fs // imported by `wasi.command`
...
execute: func() // in theory whatever exports for `wasi.command`
import foo: bar
greet: func(string)
}
In this example, the produced component isn't a subtype of the wasi.command world and therefore would need to be composed with something that erases the foo import before executing in such a world.
Great writeup and great discussion!
First, just as a naming bikeshed, could we call this unified (interface+world) package a "Wit" package (instead of a "definition" package)? My reasoning here is that "definition" is used pretty broadly in core wasm and the component model and in general refers to anything that can be inserted into an index space, thus covering types, functions, instances, etc. Even "components" and "modules" are definitions, so in a sense all packages are "definition" packages.
As a second bikeshed, to be consistent with the "targets"/"supports" terminology suggested here and here, perhaps we could say "targets" instead of "implements" in Cargo.toml? In theory it's unambiguous in the context of a component-producer toolchain that when we say "implement" we mean "target", but I was thinking it might be nice to just be consistent in the use of these two terms instead of "implements".
Lastly, I agree with Alex that we would ideally stick with world instead of introducing a new component concept given that, iiuc, the two concepts are structurally identical and would be resolved the same way. (But is that right, or is there some difference I'm missing?)
But I also agree with Peter that, when I am building a reusable (unlocked) component for publication and reuse, I want to create a component with the variety of kinds of imports that Peter lists and let some downstream consumer of this component figure out exact dependency versions, virtualizations, etc. when building the final (locked) component I want to execute. My impression of how this works is that, when I'm authoring a component, I start with a "base" world that I'm "targeting", and then I add extra dependencies (on both interfaces and component implementations) in my Cargo.toml that get joined (⊔) with my "base" world to produce a "derived" world that maps (as defined in component-model/#141) to the final component type of my (unlocked) component.
As a side thought on the interaction between worlds and dependencies: in a normal unlocked component, dependencies on other components appear as instance imports (as Peter said) and thus we're not yet fixing what their imports look like; figuring that out is the job of a downstream depsolver tool. An interesting possibility here is that while my component A targets world WA, my dependency B may target world WB which is not a subtype of WA and thus a trivial depsolve would end up targeting WA⊔WB. If I actually want to run the composite AB on world WA, I'll need to virtualize the stuff in WB that's not in WA, but that's also the job of further downstream (virtualization) tooling. But maybe I don't want to have to virtualize, so when building A initially, I want to ask the build tool to reject any dependencies that fall outside of WA (so I'm guaranteed the depsolve will produce a composite that runs in WA). Or, maybe I want to preemptively virtualize my dependency's imports (independent of the broader depsolved component DAG), so that I'm importing the dependency as a component (not instance) and creating an instance privately. I could imagine these both as advanced options added at some point.
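As a rough illustration of the WA⊔WB idea and of the "reject any dependencies that fall outside of WA" check, here is a minimal Rust sketch. It is a deliberately simplified model and not the component model's actual definition of a world: a hypothetical `World` is reduced to a set of imported interface names, the join is modelled as set union, and exports and type compatibility are ignored. The interface names are made up for the example.

```rust
use std::collections::BTreeSet;

/// Hypothetical, heavily simplified world: only the names of the
/// interfaces it offers as imports. Exports and types are ignored.
#[derive(Debug, Clone, PartialEq)]
struct World {
    imports: BTreeSet<String>,
}

impl World {
    fn new(imports: &[&str]) -> Self {
        World {
            imports: imports.iter().map(|s| s.to_string()).collect(),
        }
    }

    /// Join of two worlds in this model: the union of their imports.
    fn join(&self, other: &World) -> World {
        World {
            imports: self.imports.union(&other.imports).cloned().collect(),
        }
    }

    /// Imports of `other` that are not available in `self`; a build tool
    /// could reject the dependency (or require virtualization) when this
    /// list is non-empty.
    fn outside(&self, other: &World) -> Vec<String> {
        other.imports.difference(&self.imports).cloned().collect()
    }
}

fn main() {
    let wa = World::new(&["fs", "clocks"]); // world targeted by component A
    let wb = World::new(&["fs", "sockets"]); // world targeted by dependency B

    // A trivial depsolve ends up targeting the join of both worlds.
    println!("{:?}", wa.join(&wb).imports);

    // What would need to be virtualized to run the composite on WA.
    println!("{:?}", wa.outside(&wb));
}
```

In this model the strict option described above amounts to failing the build whenever `outside` returns anything.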
👍 on simply "wit" package, using "targets" for world terminology when authoring components, and also not using a wit syntax specifically for defining a component type when that's really what a world is.
However, it's still not clear to me that we would want to have what world is being targeted in Cargo.toml given it's likely users will want a wit description of the component's world anyway to define both its own interface(s) and explicitly specifying how its dependencies are otherwise imported and exported from the component.
It seems like always having a wit file for the component (optionally pointed at by Cargo.toml, but otherwise defaulted to a particular path) would mean having multiple places where the world dependency could be specified: in a Cargo.toml and also as a (hypothetical) include in the wit.
I'd personally like to delegate the entire world definition to wit and let Cargo.toml just define the version requirements of the dependencies.
After some discussion with the Registry SIG, I think we'll move forward with the ability to describe the world of the component being authored in Cargo.toml, along with the ability to easily extract that information out into a separate wit file referenced from Cargo.toml using the tooling (cargo component wit or some such); basically what Alex has proposed above, but perhaps with an additional mechanism for specifying additional imports and exports in TOML.
I think this will strike the right balance between good initial developer experience (i.e. doesn't make wit an initial learning hurdle to implementing a component that targets a well-defined world) and allowing those that are comfortable with wit to do more advanced descriptions of the components they're authoring.
Thanks everyone for feedback on this. I've put up PR #43 that I hope strikes the right balance between what belongs in Cargo.toml and when a wit document is needed.
Component Registry Dependencies
Overview
Currently cargo-component supports implementing a component by specifying individual imports and exports defined by local wit documents via Cargo.toml.
At the time cargo-component was originally implemented, it was imagined that a component registry might store individual interface definitions as packages, enabling registry dependencies to be specified in a Cargo.toml file as:
Therefore at most one interface could be defined in a wit document and stored in a component registry "interface" package.
As a consequence of this, a world stored in a registry would then need to explicitly reference each interface being imported and exported from their individual interface packages; this is certainly not ergonomic and would definitely contribute to an unnecessary proliferation of packages.
wit has since evolved to allow multiple interfaces and worlds to be defined in one or more wit documents. To facilitate this, wit now has syntax for using types from other documents.
With this more flexible approach to defining interfaces, cargo-component needs a new mechanism for specifying dependencies that are stored in a component registry.
This document proposes a design for specifying dependencies from a component registry for cargo-component.
Registry package types
Before discussing the proposed design, it might be useful to discuss some terminology surrounding the types of packages that might be stored in a component registry.
Previously, there was discussion around there being three types of packages in a component registry: an interface package, a world package, and a component package.
Both interface and world packages are simply WebAssembly components containing only type information; in terms of the component model proposal, the former describes an instance type and the latter describes a component type.
A component package is an implementation of a WebAssembly component; thus it contains type information and executable code.
With the introduction of the use syntax, this proposal suggests reducing the types of registry packages to two: a wit package and the aforementioned concept of a component package.
A wit package, much like the concepts of interface and world packages, contains only definitions of types, interfaces, and worlds. However, a wit package may store any number of definitions, including defining both interfaces and worlds in the same package.
Design overview
Note: cargo-component should support sourcing packages from multiple registries, potentially defaulting to a particular registry instance.
This design proposes that cargo-component reads only the version requirements for component registry dependencies from Cargo.toml.
An example Cargo.toml of a "greeter" component might look like:
The short form of a dependency is <name> = "<package-id>:<version>".
The general form is <name> = { version = "<version>", package = "<package-id>" } where name here maps to a package name used in component.wit (see below) and package-id is the qualified identifier of the package in a component registry.
Local wit documents may still be referenced by using a path key instead of the package key.
In this example, Cargo.toml is only specifying that version 1.2.3 of the webassembly/wasi package and version 3.2.1 of the my-org/formatter package be used.
Note that what is used from the packages is not specified in Cargo.toml; thus it doesn't describe the world of the component being built in any way.
To specify the component's world, a wit file with a default name of component.wit is used:
In this example, the component imports two interfaces: the fs interface from a package named wasi and the default interface from the formatter package. The former is used to print the greeting to the console and the latter is used to format the message based on whatever formatter implementation is supplied at runtime.
The component will directly export a function named greet that will ultimately print a greeting for the given name.
Because cargo-component resolves dependencies ahead-of-time, wit-parser only needs to be instructed where to locate the wasi and formatter packages to successfully parse component.wit.
Implementation
To implement this approach, cargo-component will parse Cargo.toml to figure out what component registry dependencies are required.
For consistent builds, it will also consult a lock file (design TBD) for specific versions and signatures of the dependencies to use.
cargo-component will contact one or more registries to download (or update) the package logs of the dependencies, verify the logs, and resolve the version requirements to specific versions to download and cache locally.
It will then parse ./component.wit (the path can be changed in Cargo.toml, if desired) and provide wit-parser with the paths to the cached package dependencies.
From this, the definition of a world will be derived and used to generate the bindings needed to build the component for commands like cargo component build.
Finally, the resulting component will encode unique urls for its imports and exports based on what packages were resolved (and from where) by cargo-component.
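The step of resolving a version requirement to a specific version could be sketched as follows. This is only an illustration under stated assumptions: the proposal does not define requirement semantics, so the sketch assumes a Cargo-like caret rule (same major version, not older than the requirement) without Cargo's special handling of 0.x versions, and it ignores pre-release tags, the lock file, and log verification. `Version`, `parse_version`, `satisfies`, and `resolve` are hypothetical names, not cargo-component APIs.

```rust
/// A `major.minor.patch` version; pre-release and build metadata are
/// not handled in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version(u64, u64, u64);

/// Parses exactly three dot-separated numeric components.
fn parse_version(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let version = Version(parts.next()??, parts.next()??, parts.next()??);
    // Anything after the patch component is rejected.
    if parts.next().is_some() {
        return None;
    }
    Some(version)
}

/// Assumed caret-style rule: same major version and not older than the
/// requirement.
fn satisfies(candidate: Version, requirement: Version) -> bool {
    candidate.0 == requirement.0 && candidate >= requirement
}

/// Picks the highest published version that satisfies the requirement,
/// skipping entries that fail to parse.
fn resolve(requirement: &str, published: &[&str]) -> Option<Version> {
    let requirement = parse_version(requirement)?;
    published
        .iter()
        .filter_map(|v| parse_version(v))
        .filter(|v| satisfies(*v, requirement))
        .max()
}

fn main() {
    // Versions as they might appear in a (verified) package log.
    let published = ["1.2.0", "1.2.3", "1.4.1", "2.0.0"];
    println!("{:?}", resolve("1.2.3", &published));
}
```

A real implementation would presumably prefer a version pinned in the lock file when one exists and fall back to this kind of selection only when adding or updating a dependency.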
Benefits
A few benefits to this approach:
Dependencies are specified similar to how they are specified for Rust crates in Cargo.toml: version requirements go in Cargo.toml and what is used from the dependencies is specified in "source code" (component.wit in this case).
wit-parser does not need to be made registry-aware; packages are resolved to local definitions prior to its invocation.
Knowledge of a component.wit file is transferable to other implementation languages: it is just a wit document to describe the component being built and therefore other language-specific tooling would likely also use a component.wit file for generating bindings.
Tooling for other languages
In addition to Rust, this approach can also be adopted for other language-specific tooling.
For example, in JavaScript, tooling that wraps npm could specify dependency version requirements in package.json and use a component.wit file to specify the component type for generating bindings.
Even wit-bindgen CLI (or a tool wrapping it) could be made registry-aware by sourcing version requirements from a file (bindgen.toml?) and use a component.wit file to generate bindings for supported languages.