opencontainers / distribution-spec

OCI Distribution Specification
https://opencontainers.org
Apache License 2.0
815 stars 202 forks source link

proposal: tighten digest verification requirements for clients #549

Open burgerdev opened 1 month ago

burgerdev commented 1 month ago

The use case

Suppose I have a release workflow that builds reproducible container images, for example using Bazel or Nix image builders. Reproducibility guarantees that I can rebuild the image from the same sources and arrive at the same digest. This digest is used to upload the image to a registry.

I would like to pull these images in a trusted environment, but from a potentially untrusted container registry. Suppose I received the expected digest through some trusted channel that does not involve the registry (e.g., I rebuilt it). Instructing my OCI client to pull the image by that digest should have one of two outcomes:

  1. The image is pulled and
    • the computed manifest digest matches the provided reference digest and
    • the pulled blobs match the digests from the manifest.
  2. The image pull fails.

The problem

The specification is not clear enough when it comes to digest verification. Quoting from https://github.com/opencontainers/distribution-spec/blob/v1.1.0/spec.md#pull regarding manifests:

A GET request to an existing manifest URL MUST provide the expected manifest, with a response code that MUST be 200 OK. A successful response SHOULD contain the digest of the uploaded blob in the header Docker-Content-Digest.

The Docker-Content-Digest header, if present on the response, returns the canonical digest of the uploaded blob which MAY differ from the provided digest. If the digest does differ, it MAY be the case that the hashing algorithms used do not match. [...] Most clients MAY ignore the value, but if it is used, the client MUST verify the value against the uploaded blob data.

If the manifest is not found in the registry, the response code MUST be 404 Not Found.

This allows a registry to respond to a registry/repo@sha256:... with a Docker-Content-Digest: sha512:..., but only requires the client to verify the registry-provided digest. This is not sufficient to ensure case (1) above.

For pulling blobs, there is a requirement for the registry, but it is acceptable to use a different hash algorithm in the response and the client is not required to check anything:

A GET request to an existing blob URL MUST provide the expected blob, with a response code that MUST be 200 OK. A successful response SHOULD contain the digest of the uploaded blob in the header Docker-Content-Digest. If present, the value of this header MUST be a digest matching that of the response body.

If the blob is not found in the registry, the response code MUST be 404 Not Found.

The proposal

Add a verification requirement to the spec:

If the <reference> part of a manifest request is a digest, clients SHOULD verify that the response body matches this digest. Clients MAY ignore the Docker-Content-Digest value, but if it is used, they MUST verify it.

Clients SHOULD verify that a pulled blob matches the request <digest>.

Enforcement should not make a difference for conformant registries: The requirements related to manifest existence imply that the response body of a 200 OK should also match the reference digest (because the registry found it under that digest), so it should be safe to fail the pull if it doesn't.

Server-side considerations

Some aspects have been discussed in #494. I agree to @ktarplee's argument that digest algorithm choice should be client-driven. That would allow to avoid the use cases which I believe motivated the canonical digest idea.

rchincha commented 1 month ago

Would container image signature schemes such as cosign or notation not work for you?

sudo-bmitch commented 1 month ago

I tend to agree that this would be a vulnerability for clients, and treated it as a security issue in my own project. https://github.com/regclient/regclient/security/advisories/GHSA-qv35-3gw6-8q4j

burgerdev commented 1 month ago

Would container image signature schemes such as cosign or notation not work for you?

@rchincha: Thanks for pointing this out, I forgot to mention it in the use case. Signatures can be a replacement for the rebuild check I described in the first paragraph, but only if I fully trust the signing authority. If the signing authority is compromised I'm open to being served bad content, so I essentially need to sign the images myself. Now I have to keep that signing key safe and come up with a secure signing workflow. It would be much easier for me if I only needed the digest.

tianon commented 1 month ago

Not a distribution-spec maintainer, but I've been here a while and agree that this was the whole point of pull-by-digest so any client that isn't validating the user-specified digest (and is instead only validating the server-returned digest) is doing it Wrong and we should absolutely find and fix them. :smile: