opencontainers / artifacts

OCI Artifacts
https://opencontainers.org
Apache License 2.0

OCI artifact manifest, Phase 1-Reference Types #29

Closed SteveLasker closed 1 year ago

SteveLasker commented 3 years ago

PR Status

On the July 21, 2021 OCI call, and additional OCI TOB discussion, the following plan of action was decided:

I'm leaving this PR open, and intact with the current files, to preserve the comments. We continue to implement and take input under oras-project/artifacts-spec


The OCI artifact manifest generalizes the use of the OCI image manifest by reducing the constraints on all artifacts, enabling specific artifact-specs to set constraints for their type. Phase 1 adds support for artifacts to reference other artifacts through a subjectManifest property, enabling reference graphs such as those required for secure supply chain efforts.

Phase 1: Reference Types

The PR focuses on Phase 1, enabling reference type support in 2021, supporting secure supply chain artifact types including signatures and SBoMs.

Phase 2 Generic Artifact Versioning Support

Phase 2 will focus on the scenarios outlined in PR #37.

By splitting these out into phases, we can reduce the scope, for 2021, while providing time for phase 2 to evolve.

See: artifact-manifest.md for the overview of content, and artifact-manifest-spec.md for spec details.

Signed-off-by: Steve Lasker stevenlasker@hotmail.com

sudo-bmitch commented 3 years ago

This is looking really good. I am still trying to figure out how to tie in use cases that aren't attached to a specific image manifest, but instead the entire repository. Examples that come to mind are TUF targets and snapshots that represent the current state of all known signed images in a repository. Another example could be repository metadata of when it was created, who owns the repo, number of stars, number of pulls, etc.

Ideally, I'd like to have a way to query for these that doesn't conflict with the image tag namespace. If there's a way to query for an artifact by type, but without specifying the attached image digest, I think we'd have a solution.

SteveLasker commented 3 years ago

I am still trying to figure out how to tie in use cases that aren't attached to a specific image manifest, but instead the entire repository

Due to the high concurrency of content pushed/pulled to a registry, I don't believe we have a design to handle this. I'm also not sure we have a requirement.

Another example could be repository metadata of when it was created, who owns the repo, number of stars, number of pulls, etc.

This is yet another round of updates I'm hoping we can layer in, once we get past the new OCI Artifact Manifest discussions. See Adding Metadata Services to OCI Distribution-Draft for some initial thoughts. It would account for registries serving [read-only] content, such as pull count, "stars upon thars". I suspect the meta-data queries will come into the list API requirements as well. See Show/Get-Info API Requirements #232-Data Returned

nishakm commented 3 years ago

Is there any resolution for @jonjohnsonjr's suggestion on using the OCI index to map references? Something like:

{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "size": 7143,
      "digest": "sha256:0228f90e926ba6b96e4f39cf294b2586d38fbb5a1e385c05cd1ee40ea54fe7fd",
      "annotations": {
        "org.opencontainers.image.ref.name": "stable-release"
      }
    },
    {
      "mediaType": "application/vnd.cncf.notary.v2+json",
      "size": 7143,
      "digest": "sha256:e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f",
      "references": [
        {
          "type": "signature",
          "artifact": "sha256:0228f90e926ba6b96e4f39cf294b2586d38fbb5a1e385c05cd1ee40ea54fe7fd"
        }
      ]
    }
  ],
  "annotations": {
    "com.example.index.revision": "r124356"
  }
}
SteveLasker commented 3 years ago

We feel it's best to move forward with the proposal in this PR to decouple from image.manifest and image.index. Using the new oci.artifact.manifest provides a clear definition for the references required for Notary & SBoMs. It also allows us to eventually add support for weak references, as sketched in #27

nishakm commented 3 years ago

We feel it's best to move forward with the proposal in this PR to decouple from image.manifest and image.index. Using the new oci.artifact.manifest provides a clear definition for the references required for Notary & SBoMs. It also allows us to eventually add support for weak references, as sketched in #27

There is some overlap between references and SPDX relationships. It seems to me that this could be useful here. Or maybe it's overkill and all we need is something describing an undirected/directed and mandatory/optional references.

SteveLasker commented 3 years ago

These manifests (oci.image, oci.index, oci.artifacts) are very coupled to how content is stored in a registry, enabling content discovery, acquisition and eventual cleanup. Storing documents like SPDX, 3T-SBoM or others also makes sense as they are content that just happens to be in a registry. Mixing different manifest types can confuse things. This is why I keep going back to what requirements we are trying to solve.

jonjohnsonjr commented 3 years ago

It would help me understand what's being proposed if you could revisit the OCI Artifact Manifest Properties section a bit.

I'd like to see a really rigorous description of these fields, similarly to how index and image are defined. Specifically, I want to understand what they mean.

Your current descriptions are really abstract and don't really describe the actual semantics of the fields. Let's separate out the format semantics from your expectations of how registries handle these so that we can discuss those individually. You also introduce a concept of "Extension artifacts" without defining what that is.

I have a feeling that you don't really care about the artifact format -- you actually care about the semantics of the relationships between artifacts. If I'm right, I would suggest that defining a new artifact is a terrible idea, and that what you actually want is to augment the properties of a descriptor such that we can express new kinds of relationships.

Your current proposal seems to be limited in that only new artifact manifests are allowed to have these new kinds of relationships, which seems inflexible and less powerful than enhancing the existing relationship abstraction we already have (the descriptor). I'd like to be able to express these kinds of things using existing formats and new formats. I don't want to have to invent a new format to express any other kinds of relationships we come up with.

SteveLasker commented 3 years ago

I just did a presentation on the new OCI Artifact Reference types and their supported scenarios and needs. The deck is here. As the videos are uploaded here I'll update with the specific link.

I have a few Notary v2 and ORAS updates to complete for Notary prototype-2. After that, I'll convert the current examples to an actual spec oci-artifact-manifest-spec.md, identifying the specifics you and others have been asking for.

For example:

We have been through several rounds of discussions for changing the descriptor or one of the existing manifests. These were all non-starters, with lots of filibustering. Rather than thrash existing schemas, implying a lot of instability to tooling that's already making lots of assumptions about the current manifests, we're focused on the new manifest to address the new needs. Since it's a superset of image.manifest, there's nothing stopping the current image tools from adopting it. It could be the basis of the versioning problem we're having with any changes to image.manifest.

jonjohnsonjr commented 3 years ago

The deck is here.

Is this available in a less hostile file format?

filibustering

sigh

Since it's a superset of image.manifest

I don't believe you understand what superset means.

It could be the basis of the versioning problem we're having with any changes to image.manifest.

This does not solve any problems with versioning. It's just a new version. There aren't any proposed mechanisms for how to change it that differ in any way from what we have today, as far as I can tell.

nishakm commented 3 years ago

Regarding requirements: What are they exactly? This is what I have been able to grok thus far:

From the garbage collection point of view, it makes sense to me that there needs to be a "root" that has all the connections to all of the artifacts, and OCI index seems to be a good candidate for it. But I can also see the need for something that describes all of these artifacts and their relationships and this is where the SBoM can actually help. Things like Helm charts and CNABs can have their own SBoM that describes all the related and required artifacts such as the container images and the signatures for the container images.

Regarding the digest of index.json, I don't think this is a problem. Folks want to know what changed and where in the artifact tree the change happened. IMHO, the digests are the versions.

nishakm commented 3 years ago

Your current proposal seems to be limited in that only new artifact manifests are allowed to have these new kinds of relationships, which seems inflexible and less powerful than enhancing the existing relationship abstraction we already have (the descriptor). I'd like to be able to express these kinds of things using existing formats and new formats. I don't want to have to invent a new format to express any other kinds of relationships we come up with.

IIRC, there were some concerns on allowing arbitrary content descriptors with regards to backwards compatibility with existing client tools. Initially, I had looked at content descriptors to describe things and their relationships. Unfortunately, "backwards compatibility" seems to be the de-facto reason for not including something in the spec so my recollection may be faulty.

Personally, I think there is nothing stopping registries from being instantiated as an "everything else" storage solution like bundle.bar and creating a whole distributed thingy around that, including a new artifact merkle DAG that has nothing to do with the image spec.

jonjohnsonjr commented 3 years ago

Initially, I had looked at content descriptors to describe things and their relationships.

I do this all over the place, and it's a good pattern. The content descriptor is a generic and useful abstraction, even outside of OCI, and I've been trying to get more people to adopt it instead of inventing new stuff.

Personally, I think there is nothing stopping registries from being instantiated as an "everything else" storage solution like bundle.bar and creating a whole distributed thingy around that, including a new artifact merkle DAG that has nothing to do with the image spec.

This is exactly how the registry is designed and works today. I'm fine with creating a new kind of generic node in the DAG if we think we need one, but defining the semantics of that will be tricky. As far as I know, all registries today are "strongly typed" in that they only know how to parse a small number of node types (by their mediaType, as indicated in the Content-Type header): image and index.

Index is a list of pointers, so you can implement any kind of graph you want -- if you squint and think about Lisp, this is really powerful.

Image is a list of pointers + a special pointer. This is convenient, but not any more powerful than an index, really.

One unfortunate reality of dealing with registries in the wild is that there are vastly different interpretations of the image and registry specs, especially around garbage collection and what an image or index is allowed to reference. Can images only reference blobs? Can indexes reference blobs, or just manifests? What do we do if the registry doesn't understand a media type of a descriptor within a manifest? Should we just ignore it? Assume it's a blob? Assume it's a manifest? Are blobs and manifests in the same CAS namespace, or should those be treated separately -- e.g. if I push something through /manifests/ should it be readable through /blobs/ -- vice versa?

I've had a couple ideas around this (off topic but we can get into that if anyone is interested), but they would require registry operators to all agree on some semantics that are currently undefined and with mutually incompatible implementations :(

This is one reason I really want Steve to spell out the semantics of these new artifact types. Up until this point, we haven't defined anything about ref counting or garbage collection expectations. This new artifact type introduces requirements around that, so we need to address the baseline expectations of registries if we're going to layer on top of them. It doesn't make sense to define a weak reference if we don't also define a strong reference, or at least contrast the weak reference with "every other kind of reference is undefined behavior and registries can do whatever they want".

Unfortunately, "backwards compatibility" seems to be the de-facto reason for not including something in the spec so my recollection may be faulty.

I've brought up ~two separate concerns around backward compatibility, and I don't think I've done a great job of expressing my points, so let me try to clarify:

  1. If it's possible to adapt your use case to work with existing clients and registries such that we don't have to change anything and everything continues to work, we should do that. This was roughly the conclusion of the OCI Artifacts stuff, I believe.
  2. If we really need to add new functionality to clients or registries to support a new use case, let's do it in the least disruptive way possible:

I think we've gone past the first point and into the second point now, since registries will need to maintain or produce an inverted index for weak references. As I've said before, weak references and inverted indexes would be generally useful constructs for other artifacts, and I think they should be pulled out of this massive, confused proposal so that we can talk about the best way to go about implementing them in isolation.
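To make the "inverted index" idea concrete, here is a minimal Python sketch. It assumes manifests are plain dicts keyed by digest and that outgoing references live in a `manifests` list of descriptors; the digests, the `artifactType` values and the function name are hypothetical, and a real registry would persist this rather than rebuild it in memory.

```python
from collections import defaultdict


def build_referrer_index(manifests):
    """Map each referenced digest to the digests of the manifests pointing at it."""
    index = defaultdict(list)
    for digest, manifest in manifests.items():
        for ref in manifest.get("manifests", []):
            index[ref["digest"]].append(digest)
    return dict(index)


# Hypothetical repository contents, keyed by manifest digest.
repo = {
    "sha256:image": {"manifests": []},
    "sha256:sig": {
        "artifactType": "example.signature",
        "manifests": [{"digest": "sha256:image"}],
    },
    "sha256:sbom": {
        "artifactType": "example.sbom",
        "manifests": [{"digest": "sha256:image"}],
    },
}

referrers = build_referrer_index(repo)
print(referrers["sha256:image"])  # ['sha256:sig', 'sha256:sbom']
```

The forward direction (artifact to subject) is stored in each manifest; the inverted direction (subject to artifacts) is what a registry would have to compute or maintain to answer "what is attached to this image?".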

I have a huge problem with just adding another artifact type and defining entirely new semantics for only that artifact type because it doesn't fit into the existing design of OCI data structures at all. We also ran into a similar problem with foreign layers, which I believe similarly landed in docker and OCI by fiat from Microsoft because it was a business requirement. It doesn't fit into the model, doesn't compose with other abstractions, is completely under-specced, and is a huge source of bugs -- they even have a CVE!

I'll try to explain again my issue with this, abstractly, in terms of boxes and arrows:

The current proposal defines a new type of box that is very slightly different in shape from the existing boxes, but the primary feature of this new type of box is that it has a new kind of arrow, even though those arrows are defined in the exact same way as arrows coming out of other boxes, and look identical, so there's no indication that they should be treated differently outside of the definition of the box. Also, only some of the arrows coming out of the new box are of the new kind.

image

At this point I don't really care about stopping Steve from defining a new artifact type. I think it's a bad idea, but my primary goal is just to make the design of the new mechanism not bad. These dashed arrows shouldn't be specific to an artifact manifest. We have already formally specified the behavior of arrows. Why can't we make "dashed" a property of an arrow instead of a property of the box that contains the arrow? The Descriptor definition specifically calls out that it should be considered for extension before doing format-specific things:

Extended Descriptor field additions proposed in other OCI specifications SHOULD first be considered for addition into this specification.

nishakm commented 3 years ago

Initially, I had looked at content descriptors to describe things and their relationships.

I do this all over the place, and it's a good pattern. The content descriptor is a generic and useful abstraction, even outside of OCI, and I've been trying to get more people to adopt it instead of inventing new stuff.

This section probably needs more examples then. I don't quite understand how This section defines the application/vnd.oci.descriptor.v1+json media type. and mediaType string: This REQUIRED property contains the media type of the referenced content relate.

Unfortunately, "backwards compatibility" seems to be the de-facto reason for not including something in the spec so my recollection may be faulty.

I've brought up ~two separate concerns around backward compatibility, and I don't think I've done a great job of expressing my points, so let me try to clarify:

1. If it's possible to adapt your use case to work with existing clients and registries such that we don't have to change _anything_ and everything continues to work, we should do that. This was roughly the conclusion of the OCI Artifacts stuff, I believe.

I thought this was not possible as existing clients will either try to spin up a set of blobs when they shouldn't or barf when encountering a manifest layout they do not understand.

2. If we really need to add new functionality to clients or registries to support a new use case, let's do it in the least disruptive way possible:

I'm not sure existing clients are capable of addressing supplemental or related artifacts. However, index.json sounds like it's capable of accommodating an "artifacts" manifest as Steve has described. The relationships/references thing can be discussed some more. My other concern with the content descriptor is the requirement to adhere to IANA descriptors. I suppose one could just use json, but I am still unsure how to actually use them 😅.

Unfortunately, most of my concerns around this proposal aren't really captured by the notary requirements, so it's hard to argue with Steve who will only consider concerns valid if they can be mapped directly to a notary v2 requirement.

I think other folks also have the need to be able to reference supplemental artifacts to verify supply chain integrity, provenance, etc. The spec, as it is, doesn't meet the base 3 requirements I had listed above. Can we start there instead?

SteveLasker commented 3 years ago

If it's possible to adapt your use case to work with existing clients and registries such that we don't have to change anything and everything continues to work, we should do that. This was roughly the conclusion of the OCI Artifacts stuff, I believe.

Artifacts "v1" was really about formalizing what people were already doing: stuffing additional content types in a registry, and just making them look like images, by using the same mediaTypes of an oci.image. While it was easier to identify the type through a formal manifest.artifactType property, it was felt to be too risky to make a breaking change to the schema, and we could just use manifest.config.mediaType. So we did.

The new oci.artifact.manifest supports a new reference type. To your point, these are considered strong references. The weak references (#27) were deferred, for now. If the referenced artifact under [manifests] is deleted, the artifact referencing it should also be deleted (ref count -1). I'll get this written up in the oci-artifact-manifest-spec.md next week.
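As a rough illustration of that deletion behavior, the Python sketch below removes a manifest and then, recursively, every manifest that lists it under `manifests`. The in-memory `store` dict, the digests and the function name are assumptions made for the example; the thread does not define how a registry implements this.

```python
def delete_with_referrers(store, digest):
    """Delete `digest`, then recursively delete every manifest that references it.

    Returns the digests removed, in deletion order.
    """
    if digest not in store:
        return []
    del store[digest]
    deleted = [digest]
    # Manifests still in the store that point at the digest just removed.
    dependents = [
        d for d, m in store.items()
        if any(ref["digest"] == digest for ref in m.get("manifests", []))
    ]
    for dependent in dependents:
        deleted.extend(delete_with_referrers(store, dependent))
    return deleted


# Hypothetical repository: an image, its SBoM, a signature of the SBoM,
# and an unrelated manifest that must survive.
store = {
    "sha256:image": {"manifests": []},
    "sha256:sbom": {"manifests": [{"digest": "sha256:image"}]},
    "sha256:sbom-sig": {"manifests": [{"digest": "sha256:sbom"}]},
    "sha256:other": {"manifests": []},
}

removed = delete_with_referrers(store, "sha256:image")
print(removed)      # ['sha256:image', 'sha256:sbom', 'sha256:sbom-sig']
print(list(store))  # ['sha256:other']
```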

If we really need to add new functionality to clients or registries to support a new use case, let's do it in the least disruptive way possible:

The new oci.artifact.manifest is new, but not intended for the existing clients. In fact, it's explicitly avoiding the existing clients as a new manifest.mediaType, to assure we can innovate without breaking compat.

I think we've gone past the first point and into the second point now since registries will need to maintain or produce an inverted index for weak references. As I've said before, weak references and inverted indexes would be generally useful constructs for other artifacts

Yes, we will need a new index, which registry operators can choose their specific implementation. Just a minor point of clarity, as I'd like to think of these as strong/hard references. When you post an oci.image.manifest, the digests of the manifest must already exist in the registry/repo. If not, the manifest put fails. This will be the same for entries in [manifests]. It would not be the case with [references] as defined in the punted #27 proposal.
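A minimal Python sketch of that push-time check, assuming an in-memory `store` keyed by digest and a `manifests` list of descriptors. The exception type and function name are hypothetical and are not taken from any registry implementation.

```python
class MissingReferenceError(Exception):
    """Raised when a pushed manifest names a digest that is not in the repo."""


def put_manifest(store, digest, manifest):
    """Accept `manifest` only if every entry in its `manifests` list already exists."""
    missing = [
        ref["digest"]
        for ref in manifest.get("manifests", [])
        if ref["digest"] not in store
    ]
    if missing:
        raise MissingReferenceError(f"unknown manifests: {missing}")
    store[digest] = manifest


store = {"sha256:image": {}}

# Succeeds: the referenced image is already present.
put_manifest(store, "sha256:sig", {"manifests": [{"digest": "sha256:image"}]})

# Fails: the referenced digest was never pushed, so the store is left unchanged.
try:
    put_manifest(store, "sha256:bad", {"manifests": [{"digest": "sha256:absent"}]})
except MissingReferenceError as err:
    print("rejected:", err)
```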

I think they should be pulled out of this massive, confused proposal

What is massive and confusing?

The new manifest is pretty straightforward. It's a new manifest to decouple from image-specific scenarios. This frees up OCI Image v2, and allows artifacts, which could be images, to evolve cleanly.

  1. A new manifest.artifactType property to decouple from manifest.config.mediaType
  2. [layers] renamed to [blobs]
  3. [manifests] collection for "hard links" to existing manifests in the same repo.
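One way to read the three changes above is as a mapping from an image-manifest-shaped document to the proposed shape. The Python sketch below is only an illustration: the media type string, the example `artifactType` and the helper name are assumptions, and the authoritative field definitions belong to the spec document in the PR.

```python
# Assumed media type string for the new manifest; check the spec for the real value.
ARTIFACT_MANIFEST_MEDIA_TYPE = "application/vnd.oci.artifact.manifest.v1+json"


def to_artifact_manifest(image_manifest, artifact_type, linked_manifests):
    """Re-express an image-manifest-shaped dict in the proposed artifact shape."""
    artifact = {
        "mediaType": ARTIFACT_MANIFEST_MEDIA_TYPE,
        # 1. the type is a top-level property, not inferred from config.mediaType
        "artifactType": artifact_type,
        # 2. [layers] is renamed to [blobs]
        "blobs": list(image_manifest.get("layers", [])),
        # 3. hard links to manifests that already exist in the same repo
        "manifests": list(linked_manifests),
    }
    if "config" in image_manifest:
        artifact["config"] = image_manifest["config"]
    return artifact


# Hypothetical input: a signature stored today as an image-shaped manifest.
image_shaped = {
    "config": {"mediaType": "application/vnd.example.signature.config.v1+json"},
    "layers": [{"digest": "sha256:signature-blob", "size": 1024}],
}
artifact = to_artifact_manifest(
    image_shaped,
    artifact_type="application/vnd.example.signature.v1",
    linked_manifests=[{"digest": "sha256:image"}],
)
print(sorted(artifact))  # ['artifactType', 'blobs', 'config', 'manifests', 'mediaType']
```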

image

At this point I don't really care about stopping Steve from defining a new artifact type. I think it's a bad idea, Unfortunately, most of my concerns around this proposal aren't really captured by the notary requirements, so it's hard to argue with Steve who will only consider concerns valid if they can be mapped directly to a notary v2 requirement.

I'm mapping designs to meet requirements. Notary, SBoM, GPL Source, Nydus and other artifact types benefit from these. So, yes, these designs do map to requirements, not just Notary. If Notary v2 isn't adopted, these enhancements have value unto themselves. So, I'm not really sure what you're objecting to.

Usable workflows, enabled for the masses to easily create and consume Notary v2 signatures

We've incorporated a lot of great feedback, including the flow to push the image as a digest, push the signature, then do the tag update, so I think we're incorporating all relevant and actionable feedback. We've also demonstrated pretty clean workflows (nv2 demo script and nv2 video), so I'm still not sure what you're objecting to, or even what you're proposing. There's just a lot of debate. You don't have to agree. That's the beauty of opinions and extensions. You don't have to agree or even implement them.

The spec, as it is, doesn't meet the base 3 requirements I had listed above. Can we start there instead?

Can you list the 3 requirements?

nishakm commented 3 years ago

Can you list the 3 requirements?

I'm going to add a 4th one here: We need to be able to append artifacts based on their relationships

SteveLasker commented 3 years ago

Thanks @nishakm. All 3 are covered in this proposal. The PR has some example manifests, for a signature and SBoM here

Below is an image that shows how the individual artifacts are linked together:

  1. net-monitor:v1 image
  2. 3 linked signatures of the net-monitor:v1 image
  3. An SBoM, linked to the net-monitor:v1 image
  4. A signature of the net-monitor:v1 SBoM
  5. Yet Another Artifact Type (YAAT), linked to the SBoM
  6. A signature of the net-monitor:v1- SBoM - YAAT.

All the downward arrows are represented by the existing manifests, and the config and [blobs] collection of the oci.artifact.manifest. The upward arrows represent the entries in the new [manifests] collection.
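The graph in the list above can be sketched in Python as a walk over the inverted "upward" arrows. The node names mirror the six items in the list; the `links` table and the function name are hypothetical stand-ins for whatever a registry would return.

```python
from collections import deque

# subject -> artifacts whose [manifests] collection points at it
links = {
    "net-monitor:v1": ["signature-1", "signature-2", "signature-3", "sbom"],
    "sbom": ["sbom-signature", "yaat"],
    "yaat": ["yaat-signature"],
}


def linked_artifacts(root, links):
    """Return every artifact attached to `root`, directly or transitively."""
    found = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in links.get(node, []):
            if child not in found:
                found.append(child)
                queue.append(child)
    return found


print(linked_artifacts("net-monitor:v1", links))
# ['signature-1', 'signature-2', 'signature-3', 'sbom',
#  'sbom-signature', 'yaat', 'yaat-signature']
```

Copying or deleting net-monitor:v1 together with its supply chain content would then mean acting on the root plus these seven linked artifacts.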

image

The target experience we're shooting for with the Notary prototype-2 is sketched here

ORAS will be used as a CLI, for demonstration purposes, but ORAS and nv2 will also provide libraries, so you can build this docker type experience

SteveLasker commented 3 years ago

Details on the oci.artifact.manifest spec provided. Including a change from manifests to references.

SteveLasker commented 3 years ago

I'm on the fence between using [manifests], [references], [manifest-refs] or something else. The intent is a collection of manifests, as OCI artifacts can refer to other manifests. It's not intended to refer to other blobs. While it dupes the name of manifests in the OCI Index, that's actually ok, as they both are a collection of manifests. The difference is the OCI Index is a "downward" collection of manifests that make up a thing, pivoted on platform/arch, while the OCI Artifact manifests are a reverse ("upward") reference to manifests, to extend their data.

The other thing to notice in this manifest is it's a subset of the oci-image restrictions. The intent dates back to the refactoring of various artifact types. Distribution supports all types of artifacts, based on a few manifests. OCI Artifacts is the means to generically define how something can be structured, to be stored. Then, you have various Artifact specs, including the image-spec, that take advantage of the various manifests.

The setup here is the image-spec could be a more narrowly defined use of the oci.artifact.manifest spec as it provides a superset of capabilities, with a subset of constraints. It also has clearly defined versioning semantics.

image

dlorenc commented 3 years ago

Is there an understood clear path for getting this merged/accepted/ratified? Since it's a brand new spec I'd guess that it would need to be approved by the full OCI TOB at some point, is that correct?

I'm trying to understand where this is in the process of going from draft to something that might be supported widely. What steps/approvals are left before registries would start implementing this?

SteveLasker commented 3 years ago

Latest update accounts for;

SteveLasker commented 3 years ago

We're doing some active validation of the oci.artifact.manifest spec in the Notary v2 working group. The latest update adds support, including /v2/_ext/oci-artifacts/v1/<repo>/manifests/<digest>/links?artifact-type=xyz to enable linked artifacts discovery
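For readers who want to try the discovery endpoint, here is a small Python helper that only builds the URL from the path template quoted above. The registry host, repository name and digest are placeholders, and the response format is not covered here since the thread does not describe it.

```python
from urllib.parse import quote, urlencode


def links_url(registry, repo, digest, artifact_type=None):
    """Build the draft 'links' discovery URL for a manifest digest."""
    path = f"/v2/_ext/oci-artifacts/v1/{repo}/manifests/{quote(digest, safe=':')}/links"
    # The artifact-type filter is appended only when a type is given.
    query = "?" + urlencode({"artifact-type": artifact_type}) if artifact_type else ""
    return f"https://{registry}{path}{query}"


# Placeholder registry, repo and digest.
print(links_url("registry.example.com", "net-monitor", "sha256:abc123", "xyz"))
# https://registry.example.com/v2/_ext/oci-artifacts/v1/net-monitor/manifests/sha256:abc123/links?artifact-type=xyz
```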

While this work will continue validations, we'd like to start putting 👀 on a newer proposal that solves the linked artifact references, and general versioning problems we've had with the image-spec. See WIP generic object spec #37

dlorenc commented 3 years ago

Was that last commit an accident? It looks like it was supposed to go here: https://github.com/notaryproject/artifacts/tree/prototype-2

SteveLasker commented 3 years ago

Just some doc updates while it's in draft mode.

sjackman commented 3 years ago

Homebrew https://brew.sh is a package manager that supports both macOS and Linux. We have binary packages (called bottles), which are tarballs, for multiple versions of macOS and one universal Linux bottle that works on all distributions. We store each bottle in an ORAS artifact in an image manifest. We bundle these image manifests up into a single image index. We use the .manifests[].platform object, which includes architecture, variant, os, and os.version to select which bottle to download. See https://github.com/opencontainers/image-spec/blob/master/image-index.md#image-index-property-descriptions
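A minimal Python sketch of that selection step, assuming the index has already been fetched and parsed into a dict. The digests and `os.version` strings below are made up for illustration and are not Homebrew's actual values; the Homebrew client itself may select differently.

```python
def select_manifest(index, architecture, os_name, os_version=None):
    """Return the first descriptor in the index whose platform matches, else None."""
    for descriptor in index.get("manifests", []):
        platform = descriptor.get("platform", {})
        if platform.get("architecture") != architecture:
            continue
        if platform.get("os") != os_name:
            continue
        # os.version is only compared when the caller asks for a specific one.
        if os_version is not None and platform.get("os.version") != os_version:
            continue
        return descriptor
    return None


# Hypothetical image index with one bottle per platform.
index = {
    "schemaVersion": 2,
    "manifests": [
        {
            "digest": "sha256:macos-bottle",
            "platform": {"architecture": "amd64", "os": "darwin", "os.version": "example-11"},
        },
        {
            "digest": "sha256:linux-bottle",
            "platform": {"architecture": "amd64", "os": "linux"},
        },
    ],
}

print(select_manifest(index, "amd64", "linux")["digest"])  # sha256:linux-bottle
```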

You can see examples of these ORAS image indexes at https://github.com/orgs/Homebrew/packages/container/package/core/hello and https://github.com/orgs/brewsci/packages/container/package/bio/seqkit

We use the media types application/vnd.oci.image.index.v1+json and application/vnd.oci.image.manifest.v1+json and even application/vnd.oci.image.layer.v1.tar+gzip for compatibility with oras, skopeo, even docker, though the Docker "image" can't be run due to not having the necessary dependencies included.

I'm not at all familiar with the proposal in this PR, and wasn't familiar with it when we came up with this solution. It sounds related though, and just wanted to share how we tackled this related issue.

Postscript

$ docker run ghcr.io/brewsci/bio/seqkit:0.15.0 seqkit/0.15.0/bin/seqkit version
seqkit v0.15.0

Incidentally, the image ghcr.io/brewsci/bio/seqkit:0.15.0 can be run, because it's a static executable with no dependencies, but that's not generally true of Homebrew bottles stored on GitHub Container Registry.

SteveLasker commented 3 years ago

Thanks @sjackman, The multi-arch angle to get multi-arch binaries is pretty cool. I'm curious why you stayed with the container image mediaTypes, vs. defining your own, vnd.brew.*? Is this because you can run them as container images with docker run? Or, because docker hub hasn't opened the mediaTypes yet?

sjackman commented 3 years ago

I'm curious why you stayed with the container image mediaTypes, vs. defining your own, vnd.brew.*? Is this because you can run them as container images with docker run? Or, because docker hub hasn't opened the mediaTypes yet?

Primarily to support uploading these image indexes using skopeo, so that we didn't need to reinvent that particular wheel. We store the images on GitHub Package Registry, so limitations of Docker Hub weren't a primary concern, although it's a bonus if the images can be stored on multiple registries. Downloading the images works with skopeo, oras, and even docker, though the Homebrew client just uses curl.

SteveLasker commented 3 years ago

Primarily to support uploading these image indexes using skopeo

Gotcha, so if skopeo supported flexible manifest.config.mediaTypes, that would enable you to identify the type in a registry, differentiating it from other types. Image-index wouldn't care what the config.mediaType is for the platform specific manifest.

it's a bonus if the images can be stored on multiple registries.

Docker Hub is actually the only registry I know of that doesn't support expanded mediaTypes. It's something they're working on.

sjackman commented 3 years ago

Skopeo may actually support different manifest.config.mediaType. I don't believe we tested precisely that. In the end we went with "annotations": { "com.github.package.type": "homebrew_bottle" } to distinguish Homebrew bottles from other images.

$ curl -s -H 'Accept: application/vnd.oci.image.index.v1+json' -H 'Authorization: Bearer QQ==' https://ghcr.io/v2/homebrew/core/hello/manifests/2.10 | jq -r '.annotations."com.github.package.type"'
homebrew_bottle
sjackman commented 3 years ago

I’m actually interested in looking into the possibility of making the Homebrew Docker images generally usable by docker run, by including their dependencies as layers, and perhaps one more layer for an OS if needed. It's just an idea right now, but it ought to work.

SteveLasker commented 3 years ago

by including their dependencies

Yup, this is the core of the manifest reference types in this PR. Package A depends on B & C. However, packages B & C are also independently pullable. By having each defined as an artifact, you can declare dependencies between them.

Using the oci.artifact.manifest, and eventually #37, you can declare package A has a manifest reference to B.

By storing these as independent artifacts for each package type, you're not limited to a package having a single layer and all the annotations, signing and other aspects are maintained.

The multi-arch angle is just as interesting as you can declare platform-specific manifests, with the index pivoting on the platform.

The idea behind using the oci.image.manifest.config.mediaType, or the manifest.artifactType in this PR, is registries, security scanners, CLIs don't have to read specific artifact type annotations to understand it's a bottle vs. something else.

Here are some examples of using the mediaType vs. annotations: https://github.com/opencontainers/artifacts/blob/2c9db9b2da2a357307e7043bb9142327dbdda0ca/authoring-artifacts.md

Buried in this PR is an early version that needs to be revived where you can specify the logo, localized strings that registries or clis could display when they encounter your artifact type. Compare to the way a filesystem knows what icons and actions to present based on the file extension: https://github.com/opencontainers/artifacts/blob/2c9db9b2da2a357307e7043bb9142327dbdda0ca/authoring-artifacts.md#defining-the-artifact-type

trishankatdatadog commented 3 years ago

Due to the high concurrency of content pushed/pulled to a registry, I don't believe we have a design to handle this. I'm also not sure we have a requirement.

But this is not a great answer. What if some use cases for other parties, such as those using TUF, do need it?

SteveLasker commented 3 years ago

PR is updated to reflect a Phase 1/Phase 2 approach, reducing the focus based on the Proposal: Working Group for Reference Types #96

This outlines a focus for 2021 with a reduced subset, providing time for #37 to naturally evolve.

dlorenc commented 3 years ago

Thanks for putting this together, sorry it's taken me so long to read through & review. Most of my questions can basically be summed up with "is this the minimum amount of changes we need to make to enable the reference tracking we need?".

In general, I think that we should try to make these changes in a way that's friendly to the existing implementation. Creating a new manifest type should be done only when we can't reuse the existing types, and I'm not sure that references alone meets that bar.

I agree here - I put together a proposal for that: https://github.com/opencontainers/image-spec/issues/827

There's an open question on whether or not my proposed change is actually "backwards compatible". I believe it is - but it's up to the image spec maintainers to make that determination. I've been waiting on getting resolution to https://github.com/opencontainers/image-spec/issues/834 before pushing more on this version.

dlorenc commented 3 years ago

For anyone following along, this discussion is moving here: https://github.com/opencontainers/artifacts/discussions/41

SteveLasker commented 3 years ago

Thanks @dlorenc, it’s not a move. Just an explanation for a portion of the PR to avoid long inline PR responses.

dlorenc commented 3 years ago

I have a meta/process question: is this specification intended to be a new top level OCI specification, like the image-format specification or the distribution specification?

Or will this be a new major version of the existing image specification? I could see both options as making sense (the image spec project also contains the type definitions for things like the descriptor and the index).

liubogithub commented 3 years ago

@liubogithub can you please resubmit this with a signature as it's failed dco

@SteveLasker Sure, I've sent you a pr for that, Thanks.

dlorenc commented 3 years ago

I've asked this a couple times but I think it's getting lost in the comments. From a process/governance perspective, what things have to happen for this to become an actual specification from the OCI?

I've read the OCI governance materials but I'm not sure I can tell what the actual steps are, and I'd rather not speculate. Can anyone more familiar chime in?

caniszczyk commented 3 years ago

@dlorenc TOB has to vote in new specs/projects, e.g., https://github.com/opencontainers/tob/pull/35

artifacts was approved previously here: https://github.com/opencontainers/tob/blob/master/proposals/artifacts.md

dlorenc commented 3 years ago

@dlorenc TOB has to vote in new specs/projects, e.g., opencontainers/tob#35

artifacts was approved previously here: https://github.com/opencontainers/tob/blob/master/proposals/artifacts.md

Thanks! I got that far. Where I got stuck is that the approved artifacts proposal doesn't seem to actually be a specification, but a repo containing a collection of media types, and changes to the distribution and image-specs: https://github.com/opencontainers/tob/blob/master/proposals/artifacts.md#proposal

Compared to the distribution one you linked, which is clearly a specification: https://github.com/opencontainers/tob/pull/35

Is there a difference between specification and project?

caniszczyk commented 3 years ago

specifications and projects are treated a bit differently in terms of the OCI IP Policy: https://github.com/opencontainers/tob/blob/master/CHARTER.md#8-oci-ip-policy (section d)

vs a library/tool like https://github.com/opencontainers/umoci

OCI generally doesn't consider a specification complete until it hits v1.0 and is voted upon by the respective maintainers

hope that makes things a bit more clear!

In this case, I believe the @opencontainers/artifacts-maintainers would have to v1.0 the effort to make it something we'd consider a final release like all the previous specs

dlorenc commented 3 years ago

Thanks! Definitely.

nishakm commented 3 years ago

Here's a drawing if it helps anyone

[Image: artifact_manifest]

SteveLasker commented 3 years ago

To clarify the current high-level differences with the artifact-manifest and the existing image-manifest:

| Existing Image Manifest | Proposed Artifacts Manifest |
| --- | --- |
| config REQUIRED | config is optional, as it's just another entry in the blobs collection with a config mediaType |
| layers REQUIRED | blobs (layers renamed to reflect general usage) are OPTIONAL |
| layers ORDINAL | blobs are defined by the specific artifact spec. Helm isn't ordinal, while other artifact types, like container images, MAY make them ordinal |
| manifest.config.mediaType used to uniquely identify different artifact types | manifest.artifactType added to lift the workaround of using manifest.config.mediaType on a REQUIRED, but not always used, property, decoupling config.mediaType from artifactType |
| | subjectManifest OPTIONAL, enabling an artifact to extend another artifact (SBOM, Signatures, Nydus, Scan Results) |
| | /referrers API for discovering referenced artifacts, with the ability to filter by artifactType |
| | Lifecycle management defined, starting to provide standard expectations for how users can manage their content. It doesn't define GC, as that is an internal detail |

The artifact manifest approach to reference types is based on a new manifest. This lets registries and clients opt into the behavior with clear and consistent expectations, rather than slipping new content into a registry or client that may or may not know how to lifecycle manage it. See Discussion of a new manifest #41
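The differences listed above can be sketched in a few lines: an artifact manifest with an `artifactType`, optional `blobs`, and an optional `subjectManifest` descriptor, plus an in-memory stand-in for a /referrers lookup filtered by `artifactType`. This is only an illustration under assumptions: the property names are the ones used in this thread, but the manifest media type string, the `application/vnd.example.*` artifact types, the helper names, and the exact JSON shape are hypothetical and may not match the spec text in the PR.

```python
import hashlib
import json
from typing import Iterable, List, Optional

# Assumed media type for the proposed manifest; not confirmed by this thread.
ARTIFACT_MANIFEST_MEDIA_TYPE = "application/vnd.oci.artifact.manifest.v1+json"


def descriptor_for(media_type: str, content: bytes) -> dict:
    """Build a content descriptor (mediaType, digest, size) for raw bytes."""
    digest = "sha256:" + hashlib.sha256(content).hexdigest()
    return {"mediaType": media_type, "digest": digest, "size": len(content)}


def build_artifact_manifest(
    artifact_type: str,
    subject: Optional[dict] = None,
    blobs: Iterable[dict] = (),
) -> dict:
    """Assemble an artifact manifest; blobs and subjectManifest are optional."""
    manifest = {
        "mediaType": ARTIFACT_MANIFEST_MEDIA_TYPE,
        "artifactType": artifact_type,
        "blobs": list(blobs),
    }
    if subject is not None:
        manifest["subjectManifest"] = subject
    return manifest


def find_referrers(
    manifests: Iterable[dict],
    subject_digest: str,
    artifact_type: Optional[str] = None,
) -> List[dict]:
    """Return manifests whose subjectManifest points at subject_digest,
    optionally narrowed to a single artifactType."""
    matches = []
    for manifest in manifests:
        subject = manifest.get("subjectManifest")
        if subject is None or subject.get("digest") != subject_digest:
            continue
        if artifact_type is not None and manifest.get("artifactType") != artifact_type:
            continue
        matches.append(manifest)
    return matches


if __name__ == "__main__":
    # Placeholder bytes standing in for a real image manifest.
    image = descriptor_for("application/vnd.oci.image.manifest.v1+json", b"{}")
    store = [
        build_artifact_manifest("application/vnd.example.signature.v1", image),
        build_artifact_manifest("application/vnd.example.sbom.v1", image),
    ]
    found = find_referrers(store, image["digest"], "application/vnd.example.sbom.v1")
    print(json.dumps(found, indent=2))
```

A real registry would answer this query server-side; the list scan here only shows the relationship between `subjectManifest` and the artifactType filter.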

SteveLasker commented 3 years ago

Thanks for all the great feedback, including Hayley's great feedback above ^ around lifecycle management importance, and richer standards around manifests.

On the July 21, 2021 OCI call, and additional OCI TOB discussion, the following plan of action was decided:

Thank you for all the great feedback, and please help us round out artifacts under the oras-project, to continue to standardize registry capabilities for all artifact types.

SteveLasker commented 3 years ago

I got pinged by a few folks to keep this PR open for continued feedback, while the OCI Working group process evolves.

mikebrow commented 1 year ago

A lot of great content here... alas this draft will go read only soon as the artifacts mission is being moved to opencontainers/image-spec and this repo is being archived.

mikebrow commented 1 year ago

closing for now due to pending archive action.. pls reopen if archive is not completed and/or if you believe this close to be in error