Open neersighted opened 1 year ago
At the risk of oversimplifying (please correct me if I've misunderstood!), you're proposing we loosen/tighten the language around platform
to make sure it's clear it doesn't (have to) apply to non-image artifacts (even though they technically share the image mediaType
)?
Yes, that's essentially my thought. When artifacts had a separate mediaType, they did not have a SHOULD
for the platform; now that they share a mediaType, this applies.
I think the state before the separate mediaType/manifest definition was dropped made more sense, and that this line only ended up applying to artifacts by happenstance. I think usage of platform
in the descriptor should be supported/encouraged where it makes sense, but requiring all artifacts to specify a "nonsense" platform (read: meaningless to the Go toolchain) doesn't make sense to me.
Yeah, that seems sensible IMO; any idea how runtimes like Docker would react to a index member that doesn't have a platform object? (Thinking of the case where the index author was not as careful as they should have been about order and put the artifacts before the relevant images that should match first/instead.)
It's definitely highly runtime-dependent, but I do think runtimes should be updating their checks to account for "artifacts" (artifactType/config.mediaType). However, you make a good point.
Examining the artifacts decision tree, and comparing with dockerd behavior:
In all three examples, config.mediaType
is either application/vnd.oci.empty.v1+json
, or a custom (vendor-specific) type:
During pull, the mediaType of the config descriptor in the manifest is checked: https://github.com/moby/moby/blob/605c8fb75dcfcd5d2d79dce5a58b330a6d5165dc/distribution/pull_v2.go#L423-L425 https://github.com/moby/moby/blob/605c8fb75dcfcd5d2d79dce5a58b330a6d5165dc/distribution/pull_v2.go#L495-L513
The default list of allowed mediaTypes is: https://github.com/moby/moby/blob/a1d966c492c9f3941e1b6627f3f1128e76a27a2d/distribution/registry.go#L40-L51
In all cases, this should fail early and not result in attempting to use an artifact as a runnable image.
We can see this testing with neersighted/artifact-demo
:
$ docker pull neersighted/artifact-demo:latest
latest: Pulling from neersighted/artifact-demo
unsupported media type application/vnd.oci.empty.v1+json
With the containerd storage backend, there is an IsPsuedoImage
check that should cover all cases based on the fact that it will see a empty or non-image layer "layer":
However, this is not where things fail; instead we see:
$ docker pull neersighted/artifact-demo:latest
latest: Pulling from neersighted/artifact-demo
failed to unpack image on snapshotter overlayfs: no match for platform in manifest sha256:a1fa4097c5355d6ca1656dbcf41e1a40e0763d054ff098f07064666f74f41186: not found
Caused by: https://github.com/containerd/containerd/blob/e62cacc4d6704038fbddce9bebd281155dcb967e/images/image.go#L236-L242 Which is in turn due to this matching logic: https://github.com/containerd/containerd/blob/e62cacc4d6704038fbddce9bebd281155dcb967e/images/image.go#L170-L177
Which ultimately does not match, as a requested platform does not match a nil platform in image, even though a nil requested platform matches the current platform in the image.
I am not sure if IsPsuedoImage
should be called to provide a better error message, or if failing on the containerd side is better; given that this functionality is a work-in-progress, I am not too concerned, however. If this is the correct place to centralize this logic, it should probably be expanded to check e.g. artifactType, or at least the config mediaType.
Discussed yesterday in the OCI call: No one had any strong objections to the concerns I've raised here, and further inconsistencies related to the artifact mediaType and the config.mediaType fallback were noted. I'll work on a PR this next week to address these concerns + those additional ones, erring in the direction of "refer to Artifacts as a proper noun, and better explain what Artifacts and Images are in a follow-up."
In the aftermath of #999/the removal of the artifact mediaType, I believe that the guidance for descriptors has become misleading.
Artifacts are now represented using the standard image manifest and mediaType:
https://github.com/opencontainers/image-spec/blob/93f6e65855a1e046b42afbad0ad129a3d44dc971/manifest.md?plain=1#L23-L27
When we had a separate mediaType/artifact manifest, the following line in the descriptor specification was informative, as it implies that "images" (read: runnable content) should have a platform, but "artifacts" (arbitrary content) may or may not:
https://github.com/opencontainers/image-spec/blob/93f6e65855a1e046b42afbad0ad129a3d44dc971/descriptor.md?plain=1#L61
However, now that there is one manifest mediaType, this advice appears to be rather unhelpful; all manifests are now the same type, so this implies that all manifests SHOULD have a platform in their descriptor.
In practice, this has lead to implementations including/believing they should include
{"os": "unknown", "architecture": unknown"}
when describing non-image content (artifacts), which I believe to be misguided/incorrect, as values for these fields SHOULD be understood by the Go toolchain:https://github.com/opencontainers/image-spec/blob/93f6e65855a1e046b42afbad0ad129a3d44dc971/image-index.md?plain=1#L57-L65
I'd like to drop this line from the descriptor spec, or amend it to acknowledge
artifactType
, and theconfig.mediaType
fallback (collectively "artifacts"); but between these options I have no strong preference.