WebAssembly / tool-conventions

Conventions supporting interoperability between tools working with WebAssembly.
Artistic License 2.0

Enable generating OpenContainers annotations from Wasm Components #230

Open yoshuawuyts opened 2 months ago

yoshuawuyts commented 2 months ago

One of the design principles behind the Wasm OCI Artifact Layout is that it operates as a thin wrapper around Wasm Component binaries. Ideally this would mean that it is possible to decode an OCI image to a Wasm Component, and re-encode it back as OCI (roundtrip) without losing any information.

OCI images support a standard set of annotations for metadata, used by registries such as GitHub Container Registry and Azure Container Registry in their respective interfaces. These annotations are documented as part of the OpenContainers Annotation Spec. This specification covers metadata such as the date/time when the image was created, the license the image has, and who the image was published by.

This data seems very useful to provide, and people are already starting to provide it today. I would like to propose we establish tooling conventions for how to encode this data in custom sections inside of Wasm binaries. That way language toolchains can directly encode that metadata as part of the binaries they produce. All that Wasm-specific OCI tooling then has to do is take that metadata from the components and encode it as OpenContainers annotations.
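To make the idea concrete, here is a minimal sketch of what writing and reading such a custom section could look like. The binary layout of a custom section (section id 0, a LEB128 size, a length-prefixed UTF-8 name, then an opaque payload) comes from the WebAssembly binary format. The section name `licenses` and its payload format are purely hypothetical, since no naming or encoding has been proposed yet. For brevity the example uses the core module header; a component uses a different version/layer field in its 8-byte preamble.

```python
# Sketch only: the "licenses" section name and its plain-text payload are
# assumptions made for illustration, not an agreed convention.

WASM_CORE_HEADER = b"\x00asm\x01\x00\x00\x00"  # magic + core module version 1


def encode_u32_leb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("LEB128 value must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_u32_leb128(data, pos):
    """Decode unsigned LEB128 at `pos`; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def custom_section(name, payload):
    """Build a custom section: id 0, size, name length, name, payload."""
    name_bytes = name.encode("utf-8")
    body = encode_u32_leb128(len(name_bytes)) + name_bytes + payload
    return b"\x00" + encode_u32_leb128(len(body)) + body


def read_custom_sections(binary):
    """Return {section name: payload} for every custom section found."""
    sections = {}
    pos = 8  # skip the 8-byte preamble (magic + version)
    while pos < len(binary):
        section_id = binary[pos]
        size, pos = decode_u32_leb128(binary, pos + 1)
        end = pos + size
        if section_id == 0:
            name_len, name_start = decode_u32_leb128(binary, pos)
            name_end = name_start + name_len
            sections[binary[name_start:name_end].decode("utf-8")] = binary[name_end:end]
        pos = end  # non-custom sections are skipped untouched
    return sections


module = WASM_CORE_HEADER + custom_section("licenses", b"Apache-2.0 WITH LLVM-exception")
print(read_custom_sections(module))
```

A toolchain would emit the section when producing the binary, and OCI tooling would only need the reading half to recover the metadata.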

Before diving into any concrete proposal for which custom sections we might want to add and how we'd encode those - I wanted to raise this issue to put feelers out for how people might feel about this. I saw folks were generally positive about https://github.com/WebAssembly/tool-conventions/issues/141 which proposed adding a section on documentation, though the issue hasn't seen any activity for a while. If people are generally optimistic about what I'm proposing here, I'd be happy to open a PR with initial wording for, say, SPDX license identifiers encoded in a custom section to get the ball rolling on this.

Thanks!

yoshuawuyts commented 2 months ago

Oh and to clarify: I'm not proposing we follow the OpenContainers spec to the letter here. We can and probably want to deviate in some places about how things are encoded and named. For one: I wouldn't want our custom sections to all start with org.opencontainers.image.

The higher-order bit of what I'm proposing here is definitely compatibility - so have a way to map certain custom sections in Wasm to the OpenContainers Annotations spec, and back. We can probably also be more strict in custom sections for Wasm Components than the OpenContainers spec is. For example: guarantee that any version identifier is semver-compliant, rather than semver-optional.
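As a rough sketch of the kind of mapping this implies: the annotation keys below are real OpenContainers keys, but the unprefixed section names are hypothetical, as is the choice to store values as plain UTF-8. The version check uses the regular expression suggested on semver.org to illustrate the "semver-compliant rather than semver-optional" idea.

```python
import re

OCI_PREFIX = "org.opencontainers.image."

# Hypothetical custom section names; each maps to OCI_PREFIX + name.
MAPPED_NAMES = ("authors", "description", "licenses", "source", "version")

# Regular expression suggested by semver.org for validating SemVer 2.0.0.
SEMVER = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"
)


def sections_to_annotations(sections):
    """Map {section name: bytes} to OpenContainers annotations."""
    annotations = {}
    for name in MAPPED_NAMES:
        if name not in sections:
            continue
        value = sections[name].decode("utf-8")
        # Stricter than the OCI spec: reject versions that are not semver.
        if name == "version" and not SEMVER.match(value):
            raise ValueError("version is not semver-compliant: " + value)
        annotations[OCI_PREFIX + name] = value
    return annotations


def annotations_to_sections(annotations):
    """Map OpenContainers annotations back to {section name: bytes}."""
    sections = {}
    for key, value in annotations.items():
        name = key[len(OCI_PREFIX):] if key.startswith(OCI_PREFIX) else None
        if name in MAPPED_NAMES:
            sections[name] = value.encode("utf-8")
    return sections


sections = {"licenses": b"Apache-2.0", "version": b"1.2.3-rc.1"}
annotations = sections_to_annotations(sections)
print(annotations)
print(annotations_to_sections(annotations) == sections)
```

Because the mapping here is a simple prefix, the roundtrip is lossless for the mapped names. Any field where the Wasm encoding deviates from the OpenContainers one would need its own conversion rule in both directions.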

endocrimes commented 2 months ago

With both my CNCF wasm-wg lead hat and my personal hat on, I'd love to see compatibility here, and that feels like a really positive direction. I'm probably less opposed than many to prefixing with org.opencontainers where their standards are the ones the data should follow, mostly to avoid NIH/wheel reinventing and complex mapping rules.

lann commented 2 months ago

There is a custom section definition for certain kinds of registry metadata: https://docs.rs/wasm-metadata/0.214.0/wasm_metadata/struct.RegistryMetadata.html

thomastaylor312 commented 2 months ago

Sorry I missed the conversation today, but I am definitely in favor of using these annotations. However, I am strongly in favor of what @endocrimes was leaning towards. I would rather keep the current org.opencontainers prefixing for maximum compatibility. I know that it might look weird to someone who doesn't know why those are used, but it does avoid any mapping issues while not precluding the possibility of adding our own custom org.ba prefixes down the line. If we want to change it, I'd rather map down the line and add/modify any fields that aren't working for us than have to handle that now.

As for using RegistryMetadata, I don't think that information was based on any specific standards (let me know if I am wrong though) and was more of a "best guess" as it were. The reason I like the idea of the open containers annotations is that they were built to be cross language/tooling since containers could be built with applications in all sorts of languages. That maps pretty well to what wasm enables and it is a good starting point. Not to mention that it is a widely accepted set of annotations that gets us easy compatibility right out of the gate.