opencontainers / image-spec

OCI Image Format
https://www.opencontainers.org/
Apache License 2.0
3.46k stars 636 forks source link

Descriptors, the platform object, and the artifactType #1131

Open neersighted opened 1 year ago

neersighted commented 1 year ago

In the aftermath of #999/the removal of the artifact mediaType, I believe that the guidance for descriptors has become misleading.

Artifacts are now represented using the standard image manifest and mediaType:

https://github.com/opencontainers/image-spec/blob/93f6e65855a1e046b42afbad0ad129a3d44dc971/manifest.md?plain=1#L23-L27

When we had a separate mediaType/artifact manifest, the following line in the descriptor specification was informative, as it implies that "images" (read: runnable content) should have a platform, but "artifacts" (arbitrary content) may or may not:

https://github.com/opencontainers/image-spec/blob/93f6e65855a1e046b42afbad0ad129a3d44dc971/descriptor.md?plain=1#L61

However, now that there is one manifest mediaType, this advice appears to be rather unhelpful; all manifests are now the same type, so this implies that all manifests SHOULD have a platform in their descriptor.

In practice, this has lead to implementations including/believing they should include {"os": "unknown", "architecture": unknown"} when describing non-image content (artifacts), which I believe to be misguided/incorrect, as values for these fields SHOULD be understood by the Go toolchain:

https://github.com/opencontainers/image-spec/blob/93f6e65855a1e046b42afbad0ad129a3d44dc971/image-index.md?plain=1#L57-L65

I'd like to drop this line from the descriptor spec, or amend it to acknowledge artifactType, and the config.mediaType fallback (collectively "artifacts"); but between these options I have no strong preference.

tianon commented 1 year ago

At the risk of oversimplifying (please correct me if I've misunderstood!), you're proposing we loosen/tighten the language around platform to make sure it's clear it doesn't (have to) apply to non-image artifacts (even though they technically share the image mediaType)?

neersighted commented 1 year ago

Yes, that's essentially my thought. When artifacts had a separate mediaType, they did not have a SHOULD for the platform; now that they share a mediaType, this applies.

I think the state before the separate mediaType/manifest definition was dropped made more sense, and that this line only ended up applying to artifacts by happenstance. I think usage of platform in the descriptor should be supported/encouraged where it makes sense, but requiring all artifacts to specify a "nonsense" platform (read: meaningless to the Go toolchain) doesn't make sense to me.

tianon commented 1 year ago

Yeah, that seems sensible IMO; any idea how runtimes like Docker would react to a index member that doesn't have a platform object? (Thinking of the case where the index author was not as careful as they should have been about order and put the artifacts before the relevant images that should match first/instead.)

neersighted commented 1 year ago

It's definitely highly runtime-dependent, but I do think runtimes should be updating their checks to account for "artifacts" (artifactType/config.mediaType). However, you make a good point.


Examining the artifacts decision tree, and comparing with dockerd behavior:

https://github.com/opencontainers/image-spec/blob/93f6e65855a1e046b42afbad0ad129a3d44dc971/manifest.md?plain=1#L170-L262

In all three examples, config.mediaType is either application/vnd.oci.empty.v1+json, or a custom (vendor-specific) type:

During pull, the mediaType of the config descriptor in the manifest is checked: https://github.com/moby/moby/blob/605c8fb75dcfcd5d2d79dce5a58b330a6d5165dc/distribution/pull_v2.go#L423-L425 https://github.com/moby/moby/blob/605c8fb75dcfcd5d2d79dce5a58b330a6d5165dc/distribution/pull_v2.go#L495-L513

The default list of allowed mediaTypes is: https://github.com/moby/moby/blob/a1d966c492c9f3941e1b6627f3f1128e76a27a2d/distribution/registry.go#L40-L51

In all cases, this should fail early and not result in attempting to use an artifact as a runnable image.

We can see this testing with neersighted/artifact-demo:

$ docker pull neersighted/artifact-demo:latest
latest: Pulling from neersighted/artifact-demo
unsupported media type application/vnd.oci.empty.v1+json

With the containerd storage backend, there is an IsPsuedoImage check that should cover all cases based on the fact that it will see a empty or non-image layer "layer":

https://github.com/moby/moby/blob/605c8fb75dcfcd5d2d79dce5a58b330a6d5165dc/daemon/containerd/image_manifest.go#L81-L107

However, this is not where things fail; instead we see:

$ docker pull neersighted/artifact-demo:latest
latest: Pulling from neersighted/artifact-demo
failed to unpack image on snapshotter overlayfs: no match for platform in manifest sha256:a1fa4097c5355d6ca1656dbcf41e1a40e0763d054ff098f07064666f74f41186: not found

Caused by: https://github.com/containerd/containerd/blob/e62cacc4d6704038fbddce9bebd281155dcb967e/images/image.go#L236-L242 Which is in turn due to this matching logic: https://github.com/containerd/containerd/blob/e62cacc4d6704038fbddce9bebd281155dcb967e/images/image.go#L170-L177

Which ultimately does not match, as a requested platform does not match a nil platform in image, even though a nil requested platform matches the current platform in the image.

I am not sure if IsPsuedoImage should be called to provide a better error message, or if failing on the containerd side is better; given that this functionality is a work-in-progress, I am not too concerned, however. If this is the correct place to centralize this logic, it should probably be expanded to check e.g. artifactType, or at least the config mediaType.

neersighted commented 1 year ago

Discussed yesterday in the OCI call: No one had any strong objections to the concerns I've raised here, and further inconsistencies related to the artifact mediaType and the config.mediaType fallback were noted. I'll work on a PR this next week to address these concerns + those additional ones, erring in the direction of "refer to Artifacts as a proper noun, and better explain what Artifacts and Images are in a follow-up."