containers / buildah

A tool that facilitates building OCI images.
https://buildah.io
Apache License 2.0
7.15k stars 766 forks source link

Add compression_format = inherit #5586

Open cgwalters opened 3 weeks ago

cgwalters commented 3 weeks ago

Related to https://github.com/containers/common/pull/2048

How about adding compression_format = inherit? The semantics would be that we default compression for new layers to previous base image layer by default. IOW, we default to e.g. zstd:chunked only if we detect at least one usage of it in a parent layer.

That would greatly reduce the "flag day" effect of this and allow individual image maintainers to opt in as they desire.

mtrmac commented 3 weeks ago

That’s not really in the c/image “copy” model — if the layers inputs to a copy are compressed, the copy (unless instructed otherwise) preserves the format of each layer; if the inputs are uncompressed, we make a choice, and that’s what the “compression format” option chooses.

And, in particular, all buildah outputs written to c/storage are inherently uncompressed. By the time a “push” happens, the distinction between original and new layers is ~lost, so this would need to be a Buildah feature (recording the format used in (one specific layer of?) the FROM image during a build and specifying it as the format to use during a later during a push; and consider buildah build+podman push).

[We do try to preserve the original manifest during pulls, so for a pulled image, we typically know what the original compression was. AFAICS that information is lost during a build. We might plausibly try to preserve the MIME types on this path … but I think adding a new field to the image object would be more direct and reliable.]


I see the argument that “if we are building of a zstd:chunked base image, we can ~safely assume that consumers of derived images can also consume zstd:chunked”… but I struggle a bit with who would actually turn this new logic on.

It seems to me that “inherit” would help us keep containers-common the same across Fedora versions, but the cost of actually implementing “inherit” would be quite a bit larger than the effort saved.

cgwalters commented 3 weeks ago

but I struggle a bit with who would actually turn this new logic on.

I think this would in practice replace the other values of compression_format - it'd become the new upstream default. The use case for turning on a specific compression format explicitly seems notably weaker after it exists since most container builds are derived; anyone who really wants it can opt in today with podman push --compression-format etc.

It seems to me that “inherit” would help us keep containers-common the same across Fedora versions, but the cost of actually implementing “inherit” would be quite a bit larger than the effort saved.

We already hit a problem where someone built a container on a Fedora 40 system that temporarily had zstd:chunked on by default, and pushed it to OCP where the ostree-container doesn't yet handle this (which we'll probably back port the fix, but still). Come Fedora 41 time those types of issues will still reoccur, though it's more likely that fixes have landed by then.

So the cost is things like that - also maybe old registry implementations?

But yes, it sounds like the code wouldn't be trivial, so...dunno, we can just keep this idea in our back pocket I guess and pull it out if we find wider fallout than just older ostree-container versions.

mtrmac commented 3 weeks ago

I’ll transfer it to buildah, where we have the connection between the base image and the build product.