containers / storage

Container Storage Library
Apache License 2.0

podman push --compression-format=zstd:chunked generates invalid OCI images #1771

Closed Romain-Geissler-1A closed 8 months ago

Romain-Geissler-1A commented 9 months ago

Issue Description

It seems that --compression-format=zstd:chunked generates invalid OCI images. When such an image is pushed to a registry and then pulled by docker, docker complains with "layers from manifest don't match image configuration".

Steps to reproduce the issue

Here is how to reproduce using the very latest podman upstream image:

> podman run -t -i --rm --privileged --pull=always quay.io/podman/upstream 
Trying to pull quay.io/podman/upstream:latest...
...
Writing manifest to image destination
[root@2cc5686c4046 /]# dnf install -y vim less zstd jq file golang-github-vbatts-tar-split

[root@2cc5686c4046 /]# jq ".rootfs" "image.gzip/blobs/$(jq -r ".config.digest" "image.gzip/blobs/$(jq -r ".manifests[0].digest" image.gzip/index.json | tr : /)" | tr : /)"
{
  "type": "layers",
  "diff_ids": [
    "sha256:8ff7ad910417a7b8a49019008335921d2aac0e3304a19ce258deabf431e59801"
  ]
}
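The nested `tr : /` calls in these commands turn a digest such as `sha256:<hex>` into a path below `blobs/` in the OCI layout directory. A minimal sketch of that mapping as a reusable function; the name `blob_path` is hypothetical and not part of any tool used in this issue:

```sh
# Hypothetical helper: map an OCI digest ("sha256:<hex>") to the blob path
# inside an OCI image layout directory ("<layout>/blobs/sha256/<hex>").
blob_path() {
    layout=$1
    digest=$2
    # Same trick as in the commands above: replace ':' with '/'.
    printf '%s/blobs/%s\n' "$layout" "$(printf '%s' "$digest" | tr : /)"
}

blob_path image.gzip sha256:8ff7ad910417a7b8a49019008335921d2aac0e3304a19ce258deabf431e59801
# prints image.gzip/blobs/sha256/8ff7ad910417a7b8a49019008335921d2aac0e3304a19ce258deabf431e59801
```

With such a helper, each lookup step reads as "take a digest from one JSON document, open the blob it names".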


 - Now let's do what docker does and compute the actual sha256 sum of the uncompressed layer tarball. For gzip and zstd the sha256 sums are equal and match the diff_ids entry in the config's rootfs. For zstd:chunked, however, the sum doesn't match.

[root@2cc5686c4046 /]# zcat "image.gzip/blobs/$(jq -r ".layers[0].digest" "image.gzip/blobs/$(jq -r ".manifests[0].digest" image.gzip/index.json | tr : /)" | tr : /)" | file -
/dev/stdin: POSIX tar archive (GNU)
[root@2cc5686c4046 /]# zcat "image.gzip/blobs/$(jq -r ".layers[0].digest" "image.gzip/blobs/$(jq -r ".manifests[0].digest" image.gzip/index.json | tr : /)" | tr : /)" | sha256sum
8ff7ad910417a7b8a49019008335921d2aac0e3304a19ce258deabf431e59801 -
[root@2cc5686c4046 /]#
[root@2cc5686c4046 /]# zstdcat "image.zstd/blobs/$(jq -r ".layers[0].digest" "image.zstd/blobs/$(jq -r ".manifests[0].digest" image.zstd/index.json | tr : /)" | tr : /)" | file -
/dev/stdin: POSIX tar archive (GNU)
[root@2cc5686c4046 /]# zstdcat "image.zstd/blobs/$(jq -r ".layers[0].digest" "image.zstd/blobs/$(jq -r ".manifests[0].digest" image.zstd/index.json | tr : /)" | tr : /)" | sha256sum
8ff7ad910417a7b8a49019008335921d2aac0e3304a19ce258deabf431e59801 -
[root@2cc5686c4046 /]# zstdcat "image.zstd-chunked/blobs/$(jq -r ".layers[0].digest" "image.zstd-chunked/blobs/$(jq -r ".manifests[0].digest" image.zstd-chunked/index.json | tr : /)" | tr : /)" | file -
/dev/stdin: POSIX tar archive (GNU)
[root@2cc5686c4046 /]# zstdcat "image.zstd-chunked/blobs/$(jq -r ".layers[0].digest" "image.zstd-chunked/blobs/$(jq -r ".manifests[0].digest" image.zstd-chunked/index.json | tr : /)" | tr : /)" | sha256sum
57b0ecf19f5d86d4002f7998b1f336f026d2c6301f74463234a76712d8d753a2 -
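The check above relies on the digest of the uncompressed tarball being independent of how the layer was compressed. A minimal sketch of that property, assuming only tar, gzip and sha256sum are available (two gzip levels stand in for two compression formats so the sketch does not depend on zstd); all file names are hypothetical:

```sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a tiny stand-in for a layer tarball.
mkdir "$work/rootfs"
printf 'hello\n' > "$work/rootfs/file.txt"
tar -C "$work/rootfs" -cf "$work/layer.tar" .

# The value recorded under rootfs.diff_ids is the sha256 of the
# uncompressed tarball.
diff_id=$(sha256sum < "$work/layer.tar" | cut -d' ' -f1)

# Compress the same tarball in two different ways.
gzip -1 -c "$work/layer.tar" > "$work/layer.fast.gz"
gzip -9 -c "$work/layer.tar" > "$work/layer.best.gz"

# Decompressing either blob must give back the same bytes, hence the
# same digest, whatever the compressed blobs look like.
unpacked_fast=$(gzip -dc "$work/layer.fast.gz" | sha256sum | cut -d' ' -f1)
unpacked_best=$(gzip -dc "$work/layer.best.gz" | sha256sum | cut -d' ' -f1)

echo "diff_id:        $diff_id"
echo "from fast blob: $unpacked_fast"
echo "from best blob: $unpacked_best"
```

All three printed digests should be identical. The zstd:chunked result above breaks this expectation: the decompressed stream no longer hashes to the recorded value.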


From what I understood of zstd:chunked, the idea was that the metadata would be added as annotations in the OCI image config, rather than as hidden metadata inside the tar stream (i.e. unlike eStargz). But it seems the generated tarball is not the same in the end.

 - A basic diff of the tar listings (file metadata level) between zstd and zstd:chunked shows no differences:

[root@2cc5686c4046 /]# diff -u <(zstdcat "image.zstd/blobs/$(jq -r ".layers[0].digest" "image.zstd/blobs/$(jq -r ".manifests[0].digest" image.zstd/index.json | tr : /)" | tr : /)" | tar -v -t) <(zstdcat "image.zstd-chunked/blobs/$(jq -r ".layers[0].digest" "image.zstd-chunked/blobs/$(jq -r ".manifests[0].digest" image.zstd-chunked/index.json | tr : /)" | tr : /)" | tar -v -t)


 - Now let's use tar-split (https://github.com/vbatts/tar-split from your Docker friends ;) ) to investigate a bit more the diffs at tar level:

[root@2cc5686c4046 /]# tar-split disasm --output tar-data.zstd.json.gz <(zstdcat "image.zstd/blobs/$(jq -r ".layers[0].digest" "image.zstd/blobs/$(jq -r ".manifests[0].digest" image.zstd/index.json | tr : /)" | tr : /)") | sha256sum
INFO[0000] created tar-data.zstd.json.gz from /dev/fd/63 (read 182886400 bytes)
8ff7ad910417a7b8a49019008335921d2aac0e3304a19ce258deabf431e59801 -
[root@2cc5686c4046 /]# tar-split disasm --output tar-data.zstd-chunked.json.gz <(zstdcat "image.zstd-chunked/blobs/$(jq -r ".layers[0].digest" "image.zstd-chunked/blobs/$(jq -r ".manifests[0].digest" image.zstd-chunked/index.json | tr : /)" | tr : /)") | sha256sum
INFO[0000] created tar-data.zstd-chunked.json.gz from /dev/fd/63 (read 182880256 bytes)
57b0ecf19f5d86d4002f7998b1f336f026d2c6301f74463234a76712d8d753a2 -

[root@2cc5686c4046 /]# diff -u <(zcat tar-data.zstd.json.gz) <(zcat tar-data.zstd-chunked.json.gz)
--- /dev/fd/63 2023-12-05 23:44:13.901880393 +0000
+++ /dev/fd/62 2023-12-05 23:44:13.901880393 +0000
@@ -16803,5 +16803,4 @@
 {"type":2,"payload":"Li9saWI2NAAAAAA (many AAAA omitted) AAAAADAwMDA3NzcAMDAwMDAwMAAwMDAwMDAwA DAwMDAwMDAwMDAwADE0NDU2MzQ1MjAwADAxMjIzMAAgMnVzci9saWI2NAAAAAAAA (many AAAA omitted) AAAAAB1c3RhciAg AHJvb3QAAAAAAAAAAA(many AAAA omitted) AAAAAAAAAAcm9vdAAAAAAAAAA (many AAAA omitted) AAAAAAAAAA=","position":16802}
 {"type":1,"name":"./lib64","payload":null,"position":16803}
 {"type":2,"payload":"AAAAAAAAAAAAAAAA (many AAAA omitted) AAAAAAAAAAAAA==","position":16804}
-{"type":2,"payload":"AAAAAAAAAAAA (many AAAA omitted) AAAAAAAAAAAAAAAAA","position":16805}
-{"type":2,"payload":"","position":16806}
+{"type":2,"payload":"","position":16805}
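The two byte counts reported by tar-split differ by 182886400 - 182880256 = 6144 bytes (12 blocks of 512 bytes), and the segment missing from the zstd:chunked stream appears to consist only of zero bytes at the end of the archive. A minimal sketch of why such a difference leaves the `tar -t -v` listing unchanged but still changes the sha256 sum, assuming tar, sha256sum and `head -c` are available. It goes in the mirror direction (appending 6144 zero bytes rather than dropping them) because that is the simpler case to reproduce, and all file names are hypothetical:

```sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a tiny tarball.
mkdir "$work/rootfs"
printf 'hello\n' > "$work/rootfs/file.txt"
tar -C "$work/rootfs" -cf "$work/short.tar" .

# Same archive followed by 6144 extra zero bytes (12 x 512-byte blocks).
{ cat "$work/short.tar"; head -c 6144 /dev/zero; } > "$work/long.tar"

# The listings are identical: tar stops at the end-of-archive marker
# and never looks at the extra trailing zeros.
list_short=$(tar -tvf "$work/short.tar")
list_long=$(tar -tvf "$work/long.tar")

# The digests differ, because the byte streams differ.
sum_short=$(sha256sum < "$work/short.tar" | cut -d' ' -f1)
sum_long=$(sha256sum < "$work/long.tar" | cut -d' ' -f1)

size_diff=$(( $(wc -c < "$work/long.tar") - $(wc -c < "$work/short.tar") ))

echo "size difference: $size_diff bytes"
echo "short: $sum_short"
echo "long:  $sum_long"
```

This matches the observations above: an empty diff of the listings, yet a digest that no longer agrees with the diff_ids entry.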


I am not sure whether I should continue the investigation myself from here or whether you want to continue on your side ;) But it seems there is some unexpected behavior with zstd:chunked compression.

Describe the results you received

With zstd:chunked, the uncompressed layer tarball does not match the diff_ids entry in the image config, while it does for gzip and zstd.

Describe the results you expected

The generated layer tarball should be the same for all compression formats, including zstd:chunked.

podman info output

Tried with the latest `quay.io/podman/upstream`.

Podman in a container

No

Privileged Or Rootless

None

Upstream Latest Release

Yes

Additional environment details

No response

Additional information

No response

Romain-Geissler-1A commented 9 months ago

Note that only after I finished writing up this investigation did I find this issue, which seems to deal with similar concerns: https://github.com/containers/podman/issues/20611 So maybe my issue is just a duplicate of that known issue.

rhatdan commented 9 months ago

@giuseppe PTAL

giuseppe commented 9 months ago

I'll give it a try but this should be fixed with: https://github.com/containers/image/pull/1980

Romain-Geissler-1A commented 9 months ago

Ok thanks. As soon as this is vendored into podman and released in a quay.io/podman/upstream image, I can also test the full flow on my side, pushing to a registry and pulling back with docker.

Note that there is another scenario where I would like to adopt zstd:chunked: building images via buildx (connected to a real docker daemon, as this is the only thing I have in my Jenkins environment for now), exporting them to an oci-archive via buildx, then pushing them to a registry while re-compressing on the fly to zstd:chunked via skopeo. So far my attempts to create zstd:chunked images this way, via skopeo instead of podman, have failed similarly, so I hope your fix will apply to skopeo too ;)

giuseppe commented 8 months ago

It still seems broken with the current version, taking a look now.

giuseppe commented 8 months ago

opened a PR: https://github.com/containers/storage/pull/1772