Add `zstd:chunked` compression to built images.

There's a long arc of changes in container tooling to support zstd:chunked compression that supports better compression and better download capabilities in clients. However, old versions of docker (pre-2023) don't support it, which has held up adoption.

The request here is to add zstd:chunked compression in addition to the normal gzip compression for images when we push an OCI image index - so that new clients can take advantage of it, and old clients don't break. To illustrate, for an ordinary multi-arch image index, you would have 4 entries, one for each arch. The idea here is to have 8 entries, one for each arch with gzip compression, and one for each arch with zstd:chunked compression.

Idea number 1: I looked, and buildah manifest push has a --add-compression zstd:chunked option which (I think) will automatically add the extra image manifests on push.

However, I think we might have issues with image provenance. In a Konflux multi-arch build pipeline, we build each arch separately in a buildah task. Those produce image manifests, and their digests are recorded and provenance information is associated with them. We then produce the image index, record its digest, and record stuff there too.

When we validate an image at release time, we ensure that we can find provenance for both the image index as well as all of the referenced image manifests.

If we "magically" add an extra four manifests when we create the image index, then our policy check will fail unless their digests and pullspecs are somehow also exposed, so that chains can attest to their origin.

Idea number 2: Instead, perhaps our buildah tasks should be modified to first push the image with gzip compression, and then call buildah push again with a --compression-format zstd:chunked option to generate the second image manifest then, earlier in the process. Our buildah task would need to expose both digests.. hm.

Idea number 3: Perhaps we should add a new intermediary task that accepts the digest of the image-manifest producing buildah task as a param, pulls it, and then re-pushes it with zstd:chunked. This new task would expose the pullspec and digest of the zstd:chunked manifest as a result, which will induce chains to record provenance for it. The task that builds the image index would then take 8 inputs instead of 4. All 8 would be explicitly added to the image index.

Idea 1: It should be possible to modify the image index generation to add this functionality (optionally?) if desired. I added functionality last week to expose all image manifests linked to Chains: https://github.com/konflux-ci/build-definitions/pull/1204/files. We will need to be sensitive to the Tekton results limitations. Some users were hitting the limitation for the buildah task when we referenced 3-4 base images in the BASE_IMAGE_DIGESTS. This isn't a direct comparison because the quota for results was split across multiple task steps for buildah.

Since this change has already been made to the image index generation, its behavior would be consistent if any other approach is taken instead/in addition to the first idea.

Idea 2: This could potentially work. If Chains is able to generate provenance for recursive images as well as sign all image manifests then we could even add these multiple images within a common Image Index, merging all Image Indexes in multi-arch situations. Exposing multiple digests from buildah tasks would be hard because we currently have an assumption of one artifact being produced per pipeline. We would likely need a solution for that assumption before this approach can be implemented.

Idea 3: This is likely more feasible than number 2 due to the fact that we wouldn't be returning multiple references from a single task. The output from this task would need to be represented in the Image Index generation, so skipping of the task would mean that there are no results and therefore the image index generation would be skipped since all inputs are not present. This may be possible to work around with some form of matrix build which runs for each compression output desired. Then you would always have some results to be consumed by the image index task.

What would this look like in the registry?

We want an Image Manifest for each platform and each layer compression type. For example:

Image Manifest for amd64 with layers compressed with gzip
Image Manifest for amd64 with layers compressed with zstd:chunked
Image Manifest for ppc64le with layers compressed with gzip
Image Manifest for ppc64le with layers compressed with zstd:chunked
and so on

image_manifest_count = platforms_count + compression_type_count

Then we need to bring those altogether with an Image Index (aka Manifest List). For that, I don't believe we can have multiple entries for the same platform. The registry might allow you to create it, but it would be ambiguous to clients which Image Manifest to pick. Needs testing. The safest approach IMO is to provide an Image Index per compression type. In this case, we would have one Image Index that references the Image Manifest using gzip compression layers, and another Image Index for zstd:chunked compression.

We may want to consider this to be configurable for users. Let them decide which compression they want to use, one or both.

One approach could be to modify the buildah Task to handle this. It may produce multiple Image Manifests depending on the values of the COMPRESSION parameter, for example.

The Task that builds the Image Index can operate in a similar fashion. Where it could have a parameter for the gzip Image Manifests, IMAGES, and another parameter for the zstd:chunked Image Manifests, ZSTD_IMAGES. (Or it could just one big list and sort them out accordingly.) (NOTE: Emitting all the Image Manifest digest from this Task was added as a workaround to make Chains work properly with the Matrix feature. It is absolutely not necessary otherwise.)

We need to think about how the Image Manifests and Image Indexes are tagged in the registry if both compression types are in use. Use a convention? Leave it up to the user?

There's also the question about which image ref becomes the one that gets promoted to the application snapshot.

Then we need to bring those altogether with an Image Index (aka Manifest List). For that, I don't believe we can have multiple entries for the same platform. The registry might allow you to create it, but it would be ambiguous to clients which Image Manifest to pick. Needs testing.

@rhatdan tells me that clients that do support zstd:chunked will prefer the zstd:chunked manifests if they can find them. The older pre-2023 docker client will select whatever manifest it finds first for its arch, so we need to ensure that the gzip compression option comes first in the list for each platform.

ack. I couldn't quite get buildah manifest push --add-compression zstd:chunked to work. Keep getting the error Error: preparing instances for copy: EnsureCompressionVariantsExist is not implemented for CopySpecificImages. But that's likely a me problem.

Let's operate under the assumption that a single Image Index is sufficient.

Idea 1

However, I think we might have issues with image provenance. In a Konflux multi-arch build pipeline, we build each arch separately in a buildah task. Those produce image manifests, and their digests are recorded and provenance information is associated with them. We then produce the image index, record its digest, and record stuff there too.

As long as the new digests appear somewhere in a type-hint result of a TaskRun or the PipelineRun, Chains should sign them and add them to the SLSA Provenance. To clarify, Chains only ever generates a single SLSA Provenance for a PipelineRun. If adds all the image references to the subject list of the SLSA Provenance and attaches it to each of the image references. This seems like a reasonable thing to do.

Idea 2

This seems similar to Idea 1 but the zstd:chunked digests are recorded by the buildah Task instead of the build-image-manifest.

I think between the two, I like this one better because it's more isolated. The conversion happens when the image is built. (Also, I couldn't get buildah manifest push to make the conversion so I don't know for sure if that will work. If that's not possible, I believe that would rule out Idea 1.)

I don't necessarily think that emitting multiple image references from a single buildah Task is a problem. AFAIK, Konflux only cares about the image reference on the PipelineRun result. As long as that is still a single ref, i.e. Image Index, then it doesn't really matter how many image refs the TaskRuns emit. NOTE: With any of these approaches there's likely some Pipeline changes required by the users. We should try to make things as backwards-compatible as possible.

Idea 3

A dedicated Task for this seems overkill. What if someone only wants images with zstd:chunked compression?

Idea 2

Ah, right. The single reference is for the pipeline result, not a task result. The previous issue that we encountered with generating an image index at build time was resolved by https://github.com/konflux-ci/build-definitions/pull/1147 (i.e. changing it to only returning the image manifest).

NOTE: With any of these approaches there's likely some Pipeline changes required by the users. We should try to make things as backwards-compatible as possible.

Making this change might only require the addition of the new image references to the IMAGE_REF result so that all Image Manifests can be added to the Image Index with a single parameter ensuring that matrix builds would still work well. The Image Index task will then ensure that all Image Manifest references are exposed to Chains.

konflux-ci / build-definitions

Add `zstd:chunked` compression to built images. #1264