opencontainers / distribution-spec

OCI Distribution Specification
https://opencontainers.org
Apache License 2.0
833 stars 206 forks source link

Remaining issues pre v1.0.0-rc2 (Accept headers, etc.) #211

Open jdolitsky opened 3 years ago

jdolitsky commented 3 years ago

Here a re a list of issues that came up in today's call (11/11/2020):

cc @jonjohnsonjr

jonjohnsonjr commented 3 years ago

Thanks for taking notes, I will circle back with more fleshed out details.

jonjohnsonjr commented 3 years ago
  1. Accept headers
    1. Covered by https://github.com/opencontainers/distribution-spec/issues/207#issuecomment-725937900 and https://github.com/opencontainers/distribution-spec/issues/212 in extraordinary detail. We need to include something about this in v1.0 because this is how registries currently behave, and clients need to understand what Accept headers to send if they want to successfully pull an image.
    2. We should probably link to https://docs.docker.com/registry/spec/manifest-v2-2/#backward-compatibility in a backwards compatibility section or something.
    3. If we're not going to have clients supply Accept headers, then that means every clients needs to support both manifest lists and images, which ties into my next point...
  2. Examples of types of manifests
    1. IMO, we should at least link to the manifest and image index sections of image-spec. These are what clients are likely to encounter, and they really do need to support both of these to support multi-platform images. @dmcgowan suggested just including these as examples.
    2. We should probably also link to media types in the image-spec somewhere, since registries and clients should both expect to see the equivalent docker media types.
  3. unclear manifest vs. blob
    1. Perhaps in some other section, include historical information about why /manifests/ and /blobs/ are different, and which media types are expected to be uploaded/downloaded to/from which endpoints. This has caused confusion in the past.
  4. existence check of a blob
    1. This should be the first thing most clients do before attempting to upload a blob, so we need to include it, and it should probably be the first section encountered under uploading blobs.
  5. Show "Pushing a blob in chunks" first since they are more broadly supported
    1. The POST/PATCH/PUT chunked uploading strategy is supported by every registry I've encountered (probably because that's what docker did), so we've modified our clients to only use this flow. Monolithic support may have caught up in the meantime, but as of a couple years ago, there were several smaller registries that did not support monolithic blob uploads.
    2. I believe containerd uses the POST/PUT monolithic upload strategy, so that is probably also widely supported (though not as widely as chunked).
  6. "A single POST request" should possibly be removed
    1. In the original spec, this method of blob upload was only barely mentioned as an optional thing for the blob upload initiation: Optionally, if the digest parameter is present, the request body will be used to complete the upload in a single request. I worry that not a lot of registries support this.
    2. This is going to be the most expensive blob upload method to implement for registries because there is no opportunity for the registry to offload the byte ingestion to a separate service (via the Location header, as in the other methods).
    3. I'd argue for removing it entirely, but would be happy if there were some caveats around its use. Derek mentioned maybe we could add something about "optimizations" like this, maybe in the FAQ from vsoch. There's no discussion in the spec about why you would choose one of these methods over another, which could go in that FAQ. No matter what we decide, this should definitely not be the first upload method that readers encounter.
  7. Content-Type: application/vnd.oci.image.manifest.v1+json does not include what is required for indexes
    1. Pretty self explanatory, but we should be consistent here. If we're going to include examples, that's fine. We should include both here. If we're just going to punt to the image-spec, that's fine, we should link to it and say something like "the media type of the manifest".
  8. "The uploaded manifest MUST reference any blobs that make up the artifact. However, the list of blobs MAY be empty."
    1. To elaborate here, the image-spec lists the config blob as REQUIRED for a manifest, so an image will absolutely reference at least one blob. It seems that layers can be empty, but it's really strange to actually call this out in the spec, given that it's in the image-spec and doesn't seem to really clarify anything to the reader.
    2. If we're going to mention this, we should at least explain why the manifest MUST reference blobs that make up an artifact -- to prevent registries from performing garbage collection on blobs that are unreferenced by manifests. This ties into the earlier blobs vs manifests discussion as well: there are different expectations around GC of content based on whether it's uploaded to /manifests/ or /blobs/. We can't be super prescriptive here, either, because some registries GC untagged images, and some don't. Some GC unreferenced blobs, and some don't. Perhaps this goes in some implementer's notes or FAQ with a link to it inline.
    3. Again, if we're going to mention this, we should at least link to non-distributable layers somewhere around here or in a section about GC. Both clients are registries need to know that they should handle them differently from other blobs.
  9. " The is a pullable manifest URL."
    1. This wasn't in the original spec, and I'm not sure what "pullable manifest URL" means.
    2. It seemed like this was something from HTTP around 201 response codes, looking at https://tools.ietf.org/html/rfc2616#section-10.2.2:

The request has been fulfilled and resulted in a new resource being created. The newly created resource can be referenced by the URI(s) returned in the entity of the response, with the most specific URI for the resource given by a Location header field. The response SHOULD include an entity containing a list of resource characteristics and location(s) from which the user or user agent can choose the one most appropriate. The entity format is specified by the media type given in the Content-Type header field. The origin server MUST create the resource before returning the 201 status code.

Compare to our language:

Upon a successful upload, the registry MUST return response code 201 Created, and MUST have the following header: Location: <location> The <location> is a pullable manifest URL.

It seems pretty obvious to me that this header isn't actually necessary. There is a well-known URL for fetching a manifest from the registry API. If a registry wants to returns a separate location (to, say, object storage), that seems fine to me, but is this really a MUST?

Do any registries provide this?

Do any clients look for it?

jonjohnsonjr commented 3 years ago

I will try to send a PR for some of these if I can find the time, but I understand that you want to cut 1.0 pretty soon, so I may move too slowly for that.

These are also just my own opinions, if other maintainers disagree, I'm happy to concede any or all of these points.

jdolitsky commented 3 years ago

@jonjohnsonjr thanks for descriptions. PRs appreciated, but will attempt to cover all the relevant points prior to 1.0

pmengelbert commented 3 years ago

@jonjohnsonjr Specifically, can you put together a PR for items 1 and 2 (Accept headers and links to image spec, etc). You have a great deal of experience with this and I imagine you can put together the language in about 20 minutes, which would save a lot of time vs. me doing my best first attempt and inevitably missing something and going back and forth on it.

If you aren't able to submit one, please let us know and we'll put something together early next week. In the meantime, we're working on items 3-5. Thanks!

Can anyone else volunteer to tackle one or two of the remaining items in a PR? We're looking to have a solid candidate for v1 in early December.

jdolitsky commented 3 years ago

@jonjohnsonjr - please see https://github.com/opencontainers/distribution-spec/pull/218/files as a simple note to address #1. Searched through the old spec and did not find much detail on the use of Accept header. This was adapted from that

jdolitsky commented 3 years ago

Going to pull from milestone as #218 addresses the Accept header (as a temporary solution)