sigstore / protobuf-specs

Protocol Buffer specifications
Apache License 2.0
23 stars 29 forks source link

Bundle Media Type #165

Closed bdehamer closed 7 months ago

bdehamer commented 1 year ago

Currently there are two valid values for the bundle media_type field:

I'm working on a project where I want to push a Sigstore bundle to an OCI registry alongside the related image. As part of this I want to create an OCI manifest similar to the following:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
  "config": {
    "mediaType": "application/vnd.oci.empty.v1+json",
    "digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
    "size": 2
  },
  "layers": [
    {
      "mediaType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
      "digest": "sha256:4bd9df17d3cfa8632690f6251b7dc6d2f7cebd60313c49bea4092b9489e2d4a4",
      "size": 4967
    }
  ],
  "subject": {
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "digest": "sha256:010511b82573da0735bbbc09ab0b1b9e9218732306d96b81beb694cfe431a499",
    "size": 523
  },
}

Note that I'm including the Sigstore media type in the manifest's artifactType field and as the mediaType for the layer which points at specific blob which contains the actual bundle.

I've found that at least one of the major image registries (AWS ECR) chokes on this manifest and it appears to be specifically related to the bundle media type. It seems to reject any value which does not strictly conform to the RFC 6838 spec (which is the one which is called-out in the OCI image spec). RFC 6838 is more constrained than RFC 2045 and does NOT allow for the sort of "version=0.2" parameter addition we're using for the Bundle media type.

Sure enough, if I remove the ";version=0.2" bit, AWS will happily accept my manifest!

So, I can solve my problem simply be removing the offending portion of the media type string, but it's a bummer to lose the version information.

Would it make sense to rejigger the standard media type strings to embed the version info as part of the actual name instead of being tacked-on as a parameter?

I'm looking at some of the OCI-defined media types as possible templates:

Note how the version information is embedded as another segment of the name.

Using something like the following conveys all the same information we have today but does so in a format that is a bit more portable:

haydentherapper commented 1 year ago

I see no issue with this, as it's conveying the same information. This would require a major version bump for the specification and getting all clients on board. I assume we'd want clients to accept both formats for the foreseeable future to not break verification for existing bundles.

Do we need to change v1 bundles? Ideally we have no clients generating v1 bundles still, only verifying them.

bdehamer commented 1 year ago

Do we need to change v1 bundles? Ideally we have no clients generating v1 bundles still, only verifying them.

Good call, there's probably no reason to mess with the v0.1 string.

woodruffw commented 1 year ago

+1, no objection on my end either!

sigstore-python is currently generating only "v0.2" bundles, so this would not be a significant lift on our end (just another string to check).

Do we need to change v1 bundles? Ideally we have no clients generating v1 bundles still, only verifying them.

Agreed, ideally we'd only change v2. Or really just make this a brand new v3, since versions are cheap and this is a semantic break anyways 🙂

kommendorkapten commented 1 year ago

As we are seeing behaviours that may hinder OCI usage, we could switch to the proposed format.

But I would not say the current format is invalid according to RFC6838. The format for a media type is as follows

content := "Content-Type" ":" type "/" subtype 
                   *(";" parameter)

The section linked by @bdehamer places restrictions on the type and subtype names (by specifying the restricted-name syntax. In Section 4-3 the syntax for a parameter is defined. It's

parameter-name = restricted-name

So the current media type is valid! Also, using parameters to indicate versions, is one of the called out examples in the RFC (see section 4-3):

An example of this would be a "revision" parameter to indicate a revision level of an external specification such as JPEG. Similar behavior is encouraged for media types registered in the vendor or personal trees, but is not required.

bdehamer commented 1 year ago

@kommendorkapten thanks for the explainer!

To be clear, I'm just trying to make sense of why AWS might be rejecting the particular media type we're using. Given that Section 4-3 gets special mention in the OCI image spec, my best guess is that they implemented support for ONLY that bit 🤷🏻‍♂️.

My proposal here is simply an effort to increase the compatibility (and it seemed easier than trying to get a change in AWS).

kommendorkapten commented 1 year ago

Yeah, that could be the case. I'm not against switching if that actually makes it easer.

If we switch and encode the version in the media-type, we should also define how it should be parsed, some thing like:

The encoded version follows semantic versioning, where the last digit is the minor version, and the preceding digits encode the major version

I.e

haydentherapper commented 1 year ago

Is it only AWS you’re seeing failures? Worth reaching out to them if so?

bdehamer commented 8 months ago

Opened an issue with AWS to see if we can get any traction on this: https://github.com/aws/containers-roadmap/issues/2306

If we can get a few people to 👍 this issue maybe we can get some attention on it.

bdehamer commented 7 months ago

@kommendorkapten / @haydentherapper / @woodruffw

I'd like to re-open this discussion . . . initially, I'd proposed this as a way to work-around issues that I'd found when trying to use our bundle media type in conjunction with AWS ECR, but it appears that the problem may be broader than that.

In response to my spec for storing Sigstore bundles as OCI artifacts, one of the OCI contributors weighed-in and explained that OCI really only requires registries to comply with section-4.2 of the media type spec. We may actually encounter more compatibility issues in the future as registries roll out explicit support for OCI 1.1.

Some additional commentary from @sudo-bmitch here:

The field needs to follow the naming requirements in RFC 6838 Section 4.2, and parameters are an addition that isn't part of the media type itself. Registries that validate the field may reject the manifest. The way OCI has done this in other places is to include the version in the media type, e.g. something like: application/vnd.dev.sigstore.bundle.v0.2+json.

Also, I like the suggestion to use a dotted version string (v0.2) over my original proposal (v02) as it eliminates the semantic parsing problems that Fredrik mentioned.

I'd like to propose that we establish the new media type which encodes the version information within the media type name itself instead of using trailing parameters (application/vnd.dev.sigstore.bundle.v0.4+json) and follow this pattern for all future media type values.

woodruffw commented 7 months ago

Still no objection from me! IMO we should do whatever interoperates best.

(I suppose this change means that a bundle upgrading tool should also make this change, and that clients should warn when they see the older media type.)

kommendorkapten commented 7 months ago

That was not clear from the OCI spec that only the naming rules was used for parsing!

Anyway, yes, we should switch the media type to what is the most interoperable. I can wrap up a PR for this.