cue-labs / oci

Go modules related to OCI (Open Container Initiative) registries
Apache License 2.0
22 stars 3 forks source link

Add `knownDigest` to `ociclient`'s `Resolve*` methods #29

Closed tianon closed 3 months ago

tianon commented 4 months ago

In the OCI spec, having HEAD requests return the digest header is technically only a "SHOULD" (not a "MUST"), and it turns out that Docker Hub is one of the registries on the sad side of that "SHOULD" -- the code elsewhere was already accounting for this via the knownDigest parameter to descriptorFromResponse, but it was (accidentally?) set to "" explicitly for all the resolve methods (even though 2 out of 3 of them are by-digest):

https://github.com/opencontainers/distribution-spec/blob/v1.1.0/spec.md#checking-if-content-exists-in-the-registry

A HEAD request to an existing blob or manifest URL MUST return 200 OK. A successful response SHOULD contain the digest of the uploaded blob in the header Docker-Content-Digest. A successful response SHOULD contain the size in bytes of the uploaded blob in the header Content-Length.

As a result, my attempts at using ResolveBlob to check if a specific blob exists on Docker Hub were failing with "invalid descriptor in response: no digest found in response" even though the proper digest was in the request and the response was 200 OK:

curl on Docker Hub showing it being on the sad side of that SHOULD: ```console $ curl -H "$(crane auth token --header busybox)" --head -L https://registry-1.docker.io/v2/library/busybox/blobs/sha256:7b2699543f22d5b8dc8d66a5873eb246767bca37232dee1e7a3b8c9956bceb0c HTTP/1.1 307 Temporary Redirect content-type: application/octet-stream docker-distribution-api-version: registry/2.0 location: https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/7b/7b2699543f22d5b8dc8d66a5873eb246767bca37232dee1e7a3b8c9956bceb0c/data?verify=1709871346-z5HS81nN1Ov9OEB3RdN60eJ2MjA%3D date: Fri, 08 Mar 2024 03:25:46 GMT strict-transport-security: max-age=31536000 HTTP/2 200 date: Fri, 08 Mar 2024 03:25:46 GMT content-type: application/octet-stream content-length: 2152262 cf-ray: 860fb84bfd2f2b58-LAX cf-cache-status: HIT accept-ranges: bytes age: 170784 cache-control: public, max-age=14400 etag: "2aeecf61106ed1c6e3a774e3db220ca7" expires: Fri, 08 Mar 2024 07:25:46 GMT last-modified: Wed, 06 Mar 2024 00:07:43 GMT vary: Accept-Encoding x-amz-id-2: GIEh93BaQl88NHOZYTVLgwDtBjOzY96VsTcyP2qql4lrK98VY+vcj1wOxHAT47ChDa+wRCKv7Pk= x-amz-request-id: 8YB4XDY06KMSY2GN x-amz-server-side-encryption: AES256 x-amz-version-id: oxJ9x0zrl0ICqBGgSAIDsycrCQ9qVxw5 server: cloudflare ```

I have verified that this fixes the issue and allows the ResolveBlob call to complete successfully against Docker Hub.

FWIW, I implemented this to mimic other uses of the descriptorFromResponse function, such as: https://github.com/cue-labs/oci/blob/f3720d0e1bec6540a9b3c8783af010f51ad5cc95/ociregistry/ociclient/reader.go#L138

(but I'm happy to rebase, amend, adjust, etc :heart:)

rogpeppe commented 3 months ago

Brilliant, thanks for catching this! I'd suggest one other little thing to keep us honest: I added ociserver.Options.OmitDigestFromTagGetResponse so that we could have a local server that behaves in the same way as Docker Hub in this respect, but I forgot to make the corresponding change in HEAD as GET.

So I'd suggest changing ociserver.registry.handleManifestHead to mirror handleManifestGet in that respect (if !r.opts.OmitDigestFromTagGetResponse { etc), then temporarily backing out the change in this PR to verify that the conformance tests do indeed fail as expected before updating the PR with the fix in place.

Thanks again for the fix.

tianon commented 3 months ago

Yeah, absolutely! Happy to update.

I figured I'd just develop the test update on main before applying it to my PR, but I followed what you've explained and it doesn't seem to fail the tests -- are they invoking Resolve* already, or do they also need to be updated to exercise those endpoints?

$ git log -1 --oneline
f3720d0 (HEAD -> main, origin/main, origin/HEAD) ociclient,ociauth: breaking change: use standard http.Transport
$ git --no-pager diff
diff --git a/ociregistry/ociserver/reader.go b/ociregistry/ociserver/reader.go
index bd5952a..851e7bb 100644
--- a/ociregistry/ociserver/reader.go
+++ b/ociregistry/ociserver/reader.go
@@ -147,7 +147,9 @@ func (r *registry) handleManifestHead(ctx context.Context, resp http.ResponseWri
    if err != nil {
        return err
    }
-   resp.Header().Set("Docker-Content-Digest", string(desc.Digest))
+   if !r.opts.OmitDigestFromTagGetResponse {
+       resp.Header().Set("Docker-Content-Digest", string(desc.Digest))
+   }
    resp.Header().Set("Content-Type", desc.MediaType)
    resp.Header().Set("Content-Length", fmt.Sprint(desc.Size))
    resp.WriteHeader(http.StatusOK)
$ go test ./... && echo pass || echo fail
?       cuelabs.dev/go/oci/ociregistry  [no test files]
?       cuelabs.dev/go/oci/ociregistry/internal/exp/constraints [no test files]
?       cuelabs.dev/go/oci/ociregistry/internal/exp/maps    [no test files]
?       cuelabs.dev/go/oci/ociregistry/internal/exp/slices  [no test files]
ok      cuelabs.dev/go/oci/ociregistry/internal/ocirequest  0.004s
?       cuelabs.dev/go/oci/ociregistry/ocidebug [no test files]
?       cuelabs.dev/go/oci/ociregistry/ocitest  [no test files]
ok      cuelabs.dev/go/oci/ociregistry/ociauth  1.128s
ok      cuelabs.dev/go/oci/ociregistry/ociclient    0.008s
ok      cuelabs.dev/go/oci/ociregistry/ocifilter    0.004s
ok      cuelabs.dev/go/oci/ociregistry/ocimem   0.005s
ok      cuelabs.dev/go/oci/ociregistry/ociref   0.006s
ok      cuelabs.dev/go/oci/ociregistry/ociserver    0.028s
ok      cuelabs.dev/go/oci/ociregistry/ociunify 0.003s
pass
tianon commented 3 months ago

(Rebased and committed the OmitDigestFromTagGetResponse change for now :bow:)

tianon commented 3 months ago

Ohhhhh, the conformance tests are a separate module, so my go test was not testing them! Doh, now I see. I'll do some more testing. :bow: :heart:

Edit: there we go!

$ go test ./ociregistry/internal/conformance/...
--- FAIL: TestClientAsProxyWithOmittedTagGetDigest (0.51s)
    --- FAIL: TestClientAsProxyWithOmittedTagGetDigest/extra (0.07s)
        --- FAIL: TestClientAsProxyWithOmittedTagGetDigest/extra/largeManifest (0.03s)
            quicktest.go:12: 
                error:
                  values are not equal
                comment:
                  response body: "{\"errors\":[{\"code\":\"UNKNOWN\",\"message\":\"no digest header found in response\"}]}"
                got:
                  int(500)
                want:
                  int(200)
                stack:
                  /wd/ociregistry/internal/conformance/conformance_test.go:387
                    qt.Assert(t, qt.Equals(resp.StatusCode, http.StatusOK), qt.Commentf("response body: %q", data))
                  /wd/ociregistry/internal/conformance/conformance_test.go:231
                    test.run(t, client)

FAIL
FAIL    cuelabs.dev/go/oci/ociregistry/internal/conformance 4.217s
FAIL
tianon commented 3 months ago

Ok, that helped me find another line that needed the same fix, but it doesn't actually fix the conformance tests because they apparently do a HEAD on a tag, which per the spec is not required to return a digest (???? :sob:), but then the tests fail because they don't find a digest and don't have a digest. Not sure how to resolve that. :disappointed:

tianon commented 3 months ago

https://github.com/cue-labs/oci/blob/1096806a082ed36ddb9117c4bc5d7c7aac3ee08d/ociregistry/ociclient/reader.go#L153-L157

Perhaps we should update this code for if the HEAD fails to provide a sufficient digest to fall back a third time to doing a GET but being careful to stream the data straight into a digest calculation so it doesn't read the entire thing into memory? I guess that would mean we need to stream the data from the server twice, so maybe not ideal?

rogpeppe commented 3 months ago

it doesn't actually fix the conformance tests because they apparently do a HEAD on a tag, which per the spec is not required to return a digest

I think in this case we are probably justified in requiring the response to contain a digest, despite the wording of the spec. Otherwise there is no way to resolve a tag to a digest without reading the entire manifest, which is surely wrong.

By way of justification, at least one other client library fails if the response doesn't contain the digest.

With that in mind, perhaps the ociserver check inside handleManifestHead should be this:

if !r.opts.OmitDigestFromTagGetResponse && rreq.Tag == "" {
    // Note: when doing a HEAD of a tag, clients are entitled
    // to expect that the digest header is set on the response
    // even though the spec says it's only optional in this case.
    // TODO raise an issue on the spec about this.
    resp.Header().Set("Docker-Content-Digest", string(desc.Digest))
}

WDYT?

tianon commented 3 months ago

Sounds good to me! :+1: (but I think you meant if !r.opts.OmitDigestFromTagGetResponse || rreq.Tag != "" { for that conditional :smile:)

tianon commented 3 months ago

Updated with the new conditional that accounts for tag, your suggested comment, and a rebase on main for good measure. :rocket:

tianon commented 3 months ago

Hmm, on the less positive side, making just the ociserver change does not fail the tests on main, so the existing tests must not have coverage for the very specific case being tested here :disappointed:

tianon commented 3 months ago

@rogpeppe thoughts on this one? :bow:

It's necessary for using this API against Docker Hub, and I think the only hesitation is that the tests don't actually have good coverage on it. :disappointed:

(I'm currently relying on this PR's branch in my project, so that's why I've rebased above so I can pull in and benefit from your other changes :smile:)

rogpeppe commented 3 months ago

It's necessary for using this API against Docker Hub, and I think the only hesitation is that the tests don't actually have good coverage on it.

I looked into this a bit more. It seems the tests do cover this, but the client was being unwarrantedly lax about requiring digests in some cases. I've made some adjustments that demonstrate that this fix is covered by the tests, but I'll add that in another CL.

For now, this LGTM, thanks!

tianon commented 3 months ago

Very nice, thank you!! 🚀