Open yoshuawuyts opened 2 months ago
Oh and to clarify: I'm not proposing we follow the OpenContainers spec to the letter here. We can and probably want to deviate in some places about how things are encoded and named. For one: I wouldn't want our custom sections to all start with org.opencontainers.image.
The higher-order bit of what I'm proposing here is definitely compatibility - so have a way to map certain custom sections in Wasm to the OpenContainers Annotations spec, and back. We can probably also be more strict in custom sections for Wasm Components than the OpenContainers spec is. For example: guarantee that any version identifier is semver-compliant, rather than semver-optional.
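To make the "map there and back" idea concrete, here is a minimal sketch of a two-way lookup table. The annotation keys on the right are taken from the OpenContainers Annotations spec; the short custom section names on the left are purely hypothetical placeholders, not anything that has been agreed on.

```rust
/// Pairs of (hypothetical Wasm custom section name, OCI annotation key).
/// The section names are assumptions made up for illustration only.
const MAPPING: &[(&str, &str)] = &[
    ("authors", "org.opencontainers.image.authors"),
    ("description", "org.opencontainers.image.description"),
    ("licenses", "org.opencontainers.image.licenses"),
    ("source", "org.opencontainers.image.source"),
    ("version", "org.opencontainers.image.version"),
];

/// Looks up the OCI annotation key for a custom section name.
fn section_to_annotation(section: &str) -> Option<&'static str> {
    MAPPING.iter().find(|(s, _)| *s == section).map(|(_, a)| *a)
}

/// Looks up the custom section name for an OCI annotation key.
fn annotation_to_section(annotation: &str) -> Option<&'static str> {
    MAPPING.iter().find(|(_, a)| *a == annotation).map(|(s, _)| *s)
}
```

Because every entry appears exactly once on each side, going from section name to annotation key and back returns the original name, which is the property the compatibility goal depends on.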
With both my CNCF wasm-wg lead hat and my personal hat on, I'd love to see compatibility here, and that feels like a really positive direction. I'm probably less against prefixing with org.opencontainers than many though, at least where their standards are the ones the data should follow, mostly to avoid NIH/wheel reinventing and complex mapping rules.
There is a custom section definition for certain kinds of registry metadata: https://docs.rs/wasm-metadata/0.214.0/wasm_metadata/struct.RegistryMetadata.html
Sorry I missed the conversation today, but I am definitely in favor of using these annotations. However, I am strongly in favor of what @endocrimes was leaning towards. I would rather keep the current org.opencontainers prefixing for maximum compatibility. I know that it might look weird to someone who doesn't know why those are used, but it does avoid any mapping issues while not precluding the possibility of adding our own custom org.ba prefixes down the line. If we want to change it, I'd rather map down the line and add/modify any fields that aren't working for us than have to handle that now.
As for using RegistryMetadata, I don't think that information was based on any specific standards (let me know if I am wrong though) and was more of a "best guess" as it were. The reason I like the idea of the open containers annotations is that they were built to be cross language/tooling since containers could be built with applications in all sorts of languages. That maps pretty well to what wasm enables and it is a good starting point. Not to mention that it is a widely accepted set of annotations that gets us easy compatibility right out of the gate.
One of the design principles behind the Wasm OCI Artifact Layout is that it operates as a thin wrapper around Wasm Component binaries. Ideally this would mean that it is possible to decode an OCI image to a Wasm Component, and re-encode it back as OCI (roundtrip) without losing any information.
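As a rough illustration of the reading half of such a round trip, the sketch below walks the top-level sections of a Wasm binary and collects the custom sections (section id 0) as name/payload pairs, which OCI tooling could then turn into annotations. The function names are made up for this example, it only checks the `\0asm` magic and skips the rest of the 8-byte preamble, and it is not a substitute for a real parser.

```rust
/// Decodes an unsigned LEB128 `u32` at `pos`; returns (value, next position).
fn read_leb128_u32(bytes: &[u8], mut pos: usize) -> Option<(u32, usize)> {
    let mut result: u32 = 0;
    let mut shift: u32 = 0;
    loop {
        let byte = *bytes.get(pos)?;
        pos += 1;
        if shift >= 32 {
            return None; // too many bytes for a u32
        }
        result |= ((byte & 0x7f) as u32) << shift;
        if byte & 0x80 == 0 {
            return Some((result, pos));
        }
        shift += 7;
    }
}

/// Collects (name, payload) for every top-level custom section.
/// Returns `None` if the input is truncated or malformed.
fn custom_sections(wasm: &[u8]) -> Option<Vec<(String, Vec<u8>)>> {
    if wasm.len() < 8 || &wasm[..4] != b"\0asm" {
        return None;
    }
    let mut pos = 8; // skip magic plus the 4 remaining preamble bytes
    let mut found = Vec::new();
    while pos < wasm.len() {
        let id = wasm[pos];
        let (size, body_start) = read_leb128_u32(wasm, pos + 1)?;
        let body_end = body_start.checked_add(size as usize)?;
        let body = wasm.get(body_start..body_end)?;
        if id == 0 {
            // A custom section body is a length-prefixed UTF-8 name
            // followed by an opaque payload.
            let (name_len, name_start) = read_leb128_u32(body, 0)?;
            let name_end = name_start.checked_add(name_len as usize)?;
            let name = std::str::from_utf8(body.get(name_start..name_end)?).ok()?;
            found.push((name.to_string(), body[name_end..].to_vec()));
        }
        pos = body_end;
    }
    Some(found)
}
```

Since the payload bytes are copied out untouched, whatever a toolchain wrote into the section can be re-emitted verbatim, which is what a lossless round trip would need.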
OCI images support a standard set of annotations for metadata, used by registries such as GitHub Container Registry and Azure Container Registry in their respective interfaces. These annotations are documented as part of the OpenContainers Annotation Spec. This specification contains metadata such as the date/time when the image was created, the license the image has, and who the image was published by.
This data seems very useful to provide, and people are already starting to provide it today. I would like to propose we establish tooling conventions for how to encode this data in custom sections inside of Wasm binaries. That way language toolchains can directly encode that metadata as part of the binaries they produce. Then all Wasm-specific OCI tooling has to do is take that metadata from the components and encode it as OpenContainers annotations.
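For a sense of what a toolchain would have to emit, here is a minimal sketch of writing one such custom section by hand. The section name `licenses` and the choice to store the SPDX expression as plain UTF-8 are assumptions for illustration; the actual names and encodings are exactly what a convention would need to decide. The example uses the core module preamble for simplicity.

```rust
/// Appends `value` to `out` as unsigned LEB128.
fn leb128_u32(mut value: u32, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80); // more bytes follow
    }
}

/// Encodes a custom section: id 0, byte size, then name and payload.
fn custom_section(name: &str, payload: &[u8]) -> Vec<u8> {
    let mut content = Vec::new();
    leb128_u32(name.len() as u32, &mut content);
    content.extend_from_slice(name.as_bytes());
    content.extend_from_slice(payload);

    let mut section = vec![0x00];
    leb128_u32(content.len() as u32, &mut section);
    section.extend_from_slice(&content);
    section
}

/// Builds an otherwise empty core module carrying a hypothetical
/// `licenses` section with an SPDX expression as its payload.
fn module_with_license(spdx: &str) -> Vec<u8> {
    let mut module = b"\0asm\x01\0\0\0".to_vec();
    module.extend(custom_section("licenses", spdx.as_bytes()));
    module
}
```

Calling `module_with_license("MIT")` yields a 22-byte binary: the 8-byte preamble followed by a 14-byte custom section. A real toolchain would append the section to the module or component it already produces rather than to an empty one.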
Before diving into any concrete proposal for which custom sections we might want to add and how we'd encode those, I wanted to raise this issue to put feelers out for how people might feel about this. I saw folks were generally positive about https://github.com/WebAssembly/tool-conventions/issues/141 which proposed adding a section on documentation, though the issue hasn't seen any activity for a while. If people are generally optimistic about what I'm proposing here, I'd be happy to open a PR with initial wording for, say, SPDX license identifiers encoded in a custom section to get the ball rolling on this.
Thanks!