opencontainers / distribution-spec

OCI Distribution Specification
https://opencontainers.org
Apache License 2.0
823 stars 202 forks source link

Support sha512 content #543

Open sudo-bmitch opened 3 months ago

sudo-bmitch commented 3 months ago

Fixes #494.

Pending tasks before removing this from draft:

sudo-bmitch commented 2 months ago

CI is failing on this because Zot will need to be updated for the changed APIs. I've been testing against olareg so I know this can work. This gives me more reasons to want to rewrite the conformance tests to focus on a matrix of content and APIs to test. The current tests are getting difficult to follow and manage.

sudo-bmitch commented 2 months ago

@rchincha it would be good to get your feedback on this, and whether Zot would be able to support it.

rchincha commented 2 months ago

https://github.com/project-zot/zot/pull/2075 ^ on zot side

andaaron commented 2 months ago

Hi @sudo-bmitch

project-zot/zot#2075 ^ on zot side

This one was an intermediate step to replace hardcoded sha256 logic.

This one https://github.com/project-zot/zot/pull/2554 passes for me locally with the conformance tests in this PR. It implements the digest query parameter for pushing manifests.

rchincha commented 2 months ago

@sudo-bmitch at least in "draft" state, maybe update zot to point to @andaaron 's tree/commit. https://github.com/andaaron/zot/commit/6812b5c845211003130626a2f65b86fa1be575ee

sudo-bmitch commented 2 months ago

@sudo-bmitch at least in "draft" state, maybe update zot to point to @andaaron 's tree/commit. andaaron/zot@6812b5c

@rchincha I need the changes for this: https://github.com/opencontainers/distribution-spec/blob/11b8e3fba7d2d7329513d0cff53058243c334858/Makefile#L95-L105

Do you have a container image to point to?

rchincha commented 2 months ago

@sudo-bmitch Thinking about this some more ...

Maybe split this work into two parts - 1) first the ability to handle a non-sha256 hash so no further API changes, and 2) the ability to optimize via streaming hints. It may turn out that 2) may not be needed at all. Thoughts?

sudo-bmitch commented 2 months ago

@sudo-bmitch Thinking about this some more ...

Maybe split this work into two parts - 1) first the ability to handle a non-sha256 hash so no further API changes, and 2) the ability to optimize via streaming hints. It may turn out that 2) may not be needed at all. Thoughts?

@rchincha I'm not sure I follow. What would you include in the first part, the part that has no API changes?

rchincha commented 2 months ago

@sudo-bmitch Thinking about this some more ... Maybe split this work into two parts - 1) first the ability to handle a non-sha256 hash so no further API changes, and 2) the ability to optimize via streaming hints. It may turn out that 2) may not be needed at all. Thoughts?

@rchincha I'm not sure I follow. What would you include in the first part, the part that has no API changes?

I would focus only on making PUT /v2/...blob/?digest=sha512:... work first. Thoughts?

sudo-bmitch commented 2 months ago

I would focus only on making PUT /v2/...blob/?digest=sha512:... work first. Thoughts?

@rchincha since that's already in the spec, what would OCI need to do in that scenario? Does that involve closing this PR and doing nothing? Would we leave #494 unresolved?

rchincha commented 2 months ago

I would focus only on making PUT /v2/...blob/?digest=sha512:... work first. Thoughts?

@rchincha since that's already in the spec, what would OCI need to do in that scenario? Does that involve closing this PR and doing nothing?

I think we still need to resolve the question with guidelines of mixing and matching digests. Here is a sequence (really https://github.com/opencontainers/distribution-spec/issues/494#issuecomment-1846116233 rephrased)

  1. Customer one builds an image (say A) with sha256 and signs and pushes to a registry
  2. Customer two builds another image (say B) derived from A but with sha512 and push to a registry 2.1 If the manifest is pushed as sha512, do we then convert everything to sha512? 2.2 Or do we not? coz signatures and sucks for registries, and I would like to keep the ability to trace back image lineage etc. 2.3 If registries federate and pull from other registries, do they do 2.1 or 2.2 above? 2.4 Are clients supposed to do some of the above instead?
  3. Nice that we now support multiple digests, but what if one of them is broken, what is the path forward. A bad hash algo will bite us maybe 10 years from now. etc, etc ...

We can lean on some lessons from other content-addressable projects: https://lwn.net/Articles/898522/

My concern (IMHO) is we have not set a high-level direction or rationale yet for above first.

Would we leave #494 unresolved?

No, will resolve shortly after.

sudo-bmitch commented 2 months ago

@rchincha

I think we still need to resolve the question with guidelines of mixing and matching digests. Here is a sequence (really #494 (comment) rephrased)

  1. Customer one builds an image (say A) with sha256 and signs and pushes to a registry

  2. Customer two builds another image (say B) derived from A but with sha512 and push to a registry 2.1 If the manifest is pushed as sha512, do we then convert everything to sha512?

Once the manifest is pushed to the registry, the registry cannot modify it. Doing so would break pinned references and violate the content addressable guarantee.

Before the push, it's a client decision of how they generate the manifest. I'd recommend against changing the digest of layers from a base image. Changing the digest of blobs implies foregoing cross-repository blob mounts, increases the number of copies of content on a registry and that need to be pulled to the client, and breaks tools that support rebasing images.

2.2 Or do we not? coz signatures and sucks for registries, and I would like to keep the ability to trace back image lineage etc.

This.

2.3 If registries federate and pull from other registries, do they do 2.1 or 2.2 above?

Mirrors and the likes should not be modifying the manifest. They need to retain the content addressable guarantee.

2.4 Are clients supposed to do some of the above instead?

Clients are free to do anything when building. If they modify content and change the digest, it's now their responsibility.

  1. Nice that we now support multiple digests, but what if one of them is broken, what is the path forward. A bad hash algo will bite us maybe 10 years from now.

What if the broken algorithm is sha256 and we don't have a way to push content with other digest algorithms? What if registries respond to a broken algorithm by changing their default, breaking the ability to sign images and other uses of referrers? Will clients have to be configured differently for each registry they talk to? My goal is to solve this so we aren't stuck in one of these bad scenarios.

We can lean on some lessons from other content-addressable projects: https://lwn.net/Articles/898522/

My concern (IMHO) is we have not set a high-level direction or rationale yet for above first.

My goal is just to make it possible to push content with other algorithms, focusing on the currently supported algorithms first. Once that is done, I think we have a path forward to add new algorithms (keeping in mind that content built with a new algorithm is not portable to older registries or usable by older downstream consumers).

I'm not focused on which algorithms we add in the future (that's a question for image-spec). And I think any security questions should get lumped into that discussion, which we can have later.

Would we leave #494 unresolved?

No, will resolve shortly after.

It can't be after. Without resolving #494 we'll never get to 2, because it's not possible to push a tagged manifest with anything other than the registry desired digest algorithm (hence the reason sudobmitch/demo:sha512 is not referenced with a sha512 digest itself). That also means it's not possible to push referrers to a tagged manifest unless the referrers happen to use that same algorithm that the registry defaults to.

rchincha commented 2 months ago

Can we capture all this (rationale) somewhere? "This is how OCI is thinking about this" ...

andaaron commented 2 months ago

@sudo-bmitch at least in "draft" state, maybe update zot to point to @andaaron 's tree/commit. andaaron/zot@6812b5c

@rchincha I need the changes for this:

https://github.com/opencontainers/distribution-spec/blob/11b8e3fba7d2d7329513d0cff53058243c334858/Makefile#L95-L105

Do you have a container image to point to?

Hi @sudo-bmitch, can you try out ghcr.io/andaaron/zot-minimal-linux-amd64:v2.1.0-manifest-digest?

sudo-bmitch commented 2 months ago

Can we capture all this (rationale) somewhere? "This is how OCI is thinking about this" ...

@rchincha I think most of this is covered by "The registry MUST store the manifest in the exact byte representation provided by the client" but feel free to open a PR and we'll discuss there.

rchincha commented 2 months ago

Can we capture all this (rationale) somewhere? "This is how OCI is thinking about this" ...

@rchincha I think most of this is covered by "The registry MUST store the manifest in the exact byte representation provided by the client" but feel free to open a PR and we'll discuss there.

Let me take a stab at the text/content. Not sure where it should land yet. Maybe in the commit msg of this PR itself. IMO, important that an uninformed reader understands the context of this change and also for our sake.

rchincha commented 2 months ago

https://github.com/opencontainers/distribution-spec/pull/547

sudo-bmitch commented 2 months ago

Hi @sudo-bmitch, can you try out ghcr.io/andaaron/zot-minimal-linux-amd64:v2.1.0-manifest-digest?

@andaaron thanks, I've got it working with that image (along with some other cleanups).

sudo-bmitch commented 2 months ago

Something pointed out by @thaJeztah is this text from image-spec:

Implementations MAY implement SHA-512 digest verification for use in descriptors.

Without a MUST, that means this conformance test needs to be optional, and any content generated with a different digest should be considered non-portable, not just in practice, but also according to the spec.

rchincha commented 2 months ago

Without a MUST, that means this conformance test needs to be optional

dist-spec v1.2.0?

sudo-bmitch commented 2 months ago

Without a MUST, that means this conformance test needs to be optional

dist-spec v1.2.0?

It would have to be image-spec v1.2 changing the MAY to MUST language, semi-independent from distribution-spec. Clients would still need to be prepared for registries and other clients on older releases of each.

For distribution-spec, I think the conformance test really needs to be a matrix style of test for the different types of content running across the various APIs. And an early failure to push the content should skip the remaining tests. I think we should also reorganize for a single path through the APIs, push, query, pull, delete, instead of the setup/teardown per test we have now. I'd want to do that as a separate PR from this, and then make this PR add a new entry to the types of content to test.