containers / image

Work with containers' images
Apache License 2.0
845 stars 366 forks source link

Fix c/storage destination with partial pulls #2288

Closed mtrmac closed 4 months ago

mtrmac commented 5 months ago

This is #2218 rebased on top of #2287, + my WIP work fixing issues as they come up.

Absolutely untested (do we have relevant CI in any project?).

See individual commits for details.

@giuseppe FWIW .

giuseppe commented 5 months ago

I am manually testing it, and it seems to work fine so far, I've found only one issue (caused by a recent change I've done in c/storage where a converted image returns the UncompressedDigest since it was fully validated instead of the TOCDigest), or do you think we should address it in c/storage and return both the TOCDigest and the UncompressedDigest (and then use from c/image the one we prefer): https://github.com/containers/storage/blob/d1bf4f0cf1d648fdd2e92f8841229a2da0bc3249/pkg/chunked/storage_linux.go#L1669-L1673?

I've added this patch to fix a pull of a converted to zstd:chunked image:

diff --git a/storage/storage_dest.go b/storage/storage_dest.go
index f73a3080..25fe0686 100644
--- a/storage/storage_dest.go
+++ b/storage/storage_dest.go
@@ -289,8 +289,8 @@ func (s *storageImageDestination) PutBlobPartial(ctx context.Context, chunkAcces
        return private.UploadedBlob{}, err
    }

-   if out.TOCDigest == "" {
-       return private.UploadedBlob{}, errors.New("TOC digest is empty")
+   if out.TOCDigest == "" && out.UncompressedDigest == "" {
+       return private.UploadedBlob{}, errors.New("digest is empty")
    }

    blobDigest := srcInfo.Digest
@@ -299,9 +299,12 @@ func (s *storageImageDestination) PutBlobPartial(ctx context.Context, chunkAcces
    s.fileSizes[blobDigest] = 0
    s.filenames[blobDigest] = ""
    s.diffOutputs[blobDigest] = out
-   if index != nil {
+   if index != nil && out.TOCDigest != "" {
        s.indexToTocDigest[*index] = out.TOCDigest
    }
+   if out.UncompressedDigest != "" {
+       s.blobDiffIDs[blobDigest] = out.UncompressedDigest
+   }
    s.lock.Unlock()

    return private.UploadedBlob{
mtrmac commented 5 months ago

Thanks; that seems to be out of scope of this correctness cleanup, but mentioning it is immediately valuable in how it is affecting the design of this cleanup. I have also added a link to https://github.com/containers/image/issues/2189 .


do you think we should address it in c/storage and return both the TOCDigest and the UncompressedDigest (and then use from c/image the one we prefer):

There’s actually a deeper design question here: Assume

both with the same image (in this order, or maybe in some other order), in the same store: should the two pulls share the layer (and result in the same storage.Image? Preferably yes / definitely not / nobody really cares?

My first instinct is to prefer the identification by UncompressedDigest, so that we reuse images to the maximal extent —but then if we do that, we could reuse an “ordinary” layer and setting convert_images does not guarantee any particular behavior.

Either way, probably c/storage should return both values, so that c/image can do the right thing for such mixed-identification layers.

And possibly to reuse the just-created layer in the rare case that a single image contains the same layer twice — whether that reuse happens by DiffID or by TOC.

mtrmac commented 4 months ago

@giuseppe If you happened to have too much spare time, this is structurally close (but the last commits are ugly), and I’d appreciate an early review. But please see #2291 first.

giuseppe commented 4 months ago

I've manually tested it and it seems to work fine.

I've some zstd:chunked images ready under quay.io/giuseppe/zstd-chunked if you'd like to try it as well.

Just make sure to add the following snippet to your storage.conf file:

[storage.options]
pull_options = {convert_images = "true", enable_partial_images = "true", use_hard_links = "false", ostree_repos=""}
mtrmac commented 4 months ago

@giuseppe Now ready for review. PTAL.

The FIXME? comments could be improved with https://github.com/containers/storage/pull/1830 .