containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Podman digest SHA locally does not match remote digest after upload to Amazon ECR #14779

Open nathanpeck opened 2 years ago

nathanpeck commented 2 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

When a container image is built, its local digest SHA as shown by podman image list <image-url> --digests does not match the image digest SHA that Amazon ECR reports after the image is pushed.

Steps to reproduce the issue:

  1. Build a container image locally and tag it with an Amazon ECR registry URL:

    podman build -t 209640446841.dkr.ecr.us-east-2.amazonaws.com/node/node-demo:latest .
  2. Push the image up to Amazon ECR:

    podman push 209640446841.dkr.ecr.us-east-2.amazonaws.com/node/node-demo:latest
  3. View the image digest locally:

    podman image list 209640446841.dkr.ecr.us-east-2.amazonaws.com/node/node-demo:latest --digests
    
    REPOSITORY                                                   TAG         DIGEST                                                                   IMAGE ID      CREATED         SIZE
    209640446841.dkr.ecr.us-east-2.amazonaws.com/node/node-demo  latest      sha256:b578f272c7a9acef26188559c7c98af971d8724f320600e0a87042d01688f3fa  98aee5b857a9  47 minutes ago  262 MB
    podman image inspect 209640446841.dkr.ecr.us-east-2.amazonaws.com/node/node-demo:latest 
    [
         {
              "Id": "98aee5b857a91e9e89971cfbd34a5441cbd291f3245c96e807185ba1d0b62201",
              "Digest": "sha256:b578f272c7a9acef26188559c7c98af971d8724f320600e0a87042d01688f3fa",
     [TRIMMED]
  4. Get the digest of the uploaded image according to Amazon ECR:

    aws ecr describe-images --repository-name node/node-demo 
    {
        "imageDetails": [
            {
                "registryId": "209640446841",
                "repositoryName": "node/node-demo",
                "imageDigest": "sha256:74ea9e8924245a010fd254694b6c7667daab06fee986ac7b29960b2110cdaa6d",
                "imageTags": [
                    "latest"
                ],
                "imageSizeInBytes": 86209921,
                "imagePushedAt": "2022-06-29T17:06:39-04:00",
                "imageManifestMediaType": "application/vnd.oci.image.manifest.v1+json",
                "artifactMediaType": "application/vnd.oci.image.config.v1+json"
            }
        ]
    }
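
For context on why the two sides can disagree: the digest each side reports is the SHA-256 of the raw manifest bytes it holds, so if the manifest changes in any way during push (for example, because a layer was recompressed and its layer digest changed), the image digest changes too. A minimal stdlib-only sketch of that computation (the manifest contents and digest values below are invented for illustration):

```python
import hashlib
import json

def manifest_digest(manifest: dict) -> str:
    """Return the sha256 digest of a manifest's serialized bytes.

    Real tools hash the exact manifest bytes sent over the wire;
    json.dumps here merely stands in for that serialization.
    """
    raw = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return "sha256:" + hashlib.sha256(raw).hexdigest()

# Two manifests that differ only in one layer digest (values are made up),
# as would happen if a layer were recompressed during push:
local = {"schemaVersion": 2, "layers": [{"digest": "sha256:aaa", "size": 100}]}
pushed = {"schemaVersion": 2, "layers": [{"digest": "sha256:bbb", "size": 90}]}

print(manifest_digest(local) == manifest_digest(pushed))  # False: any byte change shifts the digest
```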

Describe the results you received:

According to Podman, the local image digest is: sha256:b578f272c7a9acef26188559c7c98af971d8724f320600e0a87042d01688f3fa

After pushing this image, according to Amazon ECR, the image digest SHA is: sha256:74ea9e8924245a010fd254694b6c7667daab06fee986ac7b29960b2110cdaa6d

Describe the results you expected:

I would expect the SHA digest to be the same locally as it is on Amazon ECR. Note that when using out-of-the-box Docker Desktop as the image builder, both the local image's digest and Amazon ECR's end up with the same SHA digest.

Additional information you deem important (e.g. issue happens only occasionally):

After running podman pull and then podman inspect 209640446841.dkr.ecr.us-east-2.amazonaws.com/node/node-demo, the .RepoDigests[0] entry does match Amazon ECR's digest. Perhaps Docker Desktop takes this remote digest value and applies it to the local image's digest value upon push?
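
After such a pull, the registry-side digest is what shows up in .RepoDigests. A small sketch of extracting it from podman inspect JSON (the structure is mirrored from the inspect excerpt above; the digest values are the ones from this report):

```python
import json

# Illustrative inspect output, shaped like the excerpts earlier in this report.
inspect_json = json.dumps([{
    "Id": "98aee5b857a91e9e89971cfbd34a5441cbd291f3245c96e807185ba1d0b62201",
    "Digest": "sha256:b578f272c7a9acef26188559c7c98af971d8724f320600e0a87042d01688f3fa",
    "RepoDigests": [
        "209640446841.dkr.ecr.us-east-2.amazonaws.com/node/node-demo@sha256:74ea9e8924245a010fd254694b6c7667daab06fee986ac7b29960b2110cdaa6d"
    ],
}])

# Take everything after the "@" of the first repo digest entry.
image = json.loads(inspect_json)[0]
repo_digest = image["RepoDigests"][0].split("@", 1)[1]
print(repo_digest)  # the registry-side digest, matching what ECR reports
```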

Output of podman version:

Client:       Podman Engine
Version:      4.1.0
API Version:  4.1.0
Go Version:   go1.18.1
Built:        Thu May  5 16:07:47 2022
OS/Arch:      darwin/amd64

Server:       Podman Engine
Version:      4.1.0
API Version:  4.1.0
Go Version:   go1.18
Built:        Fri May  6 12:15:54 2022
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.26.1
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.0-2.fc36.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.0, commit: '
  cpuUtilization:
    idlePercent: 99.81
    systemPercent: 0.17
    userPercent: 0.02
  cpus: 1
  distribution:
    distribution: fedora
    variant: coreos
    version: "36"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 664058116
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 5.17.5-300.fc36.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 404905984
  memTotal: 2066817024
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.4.4-1.fc36.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.4.4
      commit: 6521fcc5806f20f6187eb933f9f45130c86da230
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/664058116/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.x86_64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 361h 37m 20.28s (Approximately 15.04 days)
plugins:
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 5
    paused: 0
    running: 3
    stopped: 2
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 106825756672
  graphRootUsed: 5120102400
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 33
  runRoot: /run/user/664058116/containers
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 4.1.0
  Built: 1651853754
  BuiltTime: Fri May  6 12:15:54 2022
  GitCommit: ""
  GoVersion: go1.18
  Os: linux
  OsArch: linux/amd64
  Version: 4.1.0

Package info (e.g. output of rpm -q podman or apt list podman):

brew info podman
podman: stable 4.1.1 (bottled), HEAD
Tool for managing OCI containers and pods
https://podman.io/
/usr/local/Cellar/podman/4.1.0 (174 files, 47.7MB) *
  Poured from bottle on 2022-06-14 at 15:16:17
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/podman.rb
License: Apache-2.0

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

Mac OS X, Podman installed via brew, using an AWS account to test upload to Amazon ECR

flouthoc commented 2 years ago

Hi @nathanpeck, thanks for creating the issue.

To me this issue looks like a duplicate of https://github.com/containers/buildah/issues/3866, and podman push --digestfile should be useful for such use cases.

nathanpeck commented 2 years ago

Thanks @flouthoc. For more context (which I should have put in the first post) I am helping the team building AWS Copilot which is our official command line tool for building, pushing, and deploying images on AWS.

We have some customer requests to support Podman. Unfortunately right now our tool does a docker build then docker push and then uses the docker inspect command to grab the image digest of the locally built image. It then uses that image digest as the "canonical" reference to the image when launching it in AWS App Runner or AWS Elastic Container Service. This works great with Docker Desktop because the local digest is the same as the remote digest, but the flow breaks when switching to Podman because the digests don't match up between local and the remote registry.

Our goal is to figure out a workaround where Podman can function as a drop-in replacement for Docker within our tooling, and right now this digest behavior difference is a blocker. It looks like right now the only way to fix this is to have our tooling build and push, then delete the local Docker image that was built by Podman and then repull off the registry to get the "real" digest. Or stop using the local digest value entirely and only rely on the remote digest from ECR.
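
The second option, relying only on the remote digest, can be sketched by parsing aws ecr describe-images output. This is an illustrative sketch: the JSON shape mirrors the sample earlier in this report, and find_digest_for_tag is a hypothetical helper, not part of any AWS SDK:

```python
import json

def find_digest_for_tag(describe_images_output: str, tag: str) -> str:
    """Return the imageDigest of the image carrying the given tag."""
    details = json.loads(describe_images_output)["imageDetails"]
    for image in details:
        if tag in image.get("imageTags", []):
            return image["imageDigest"]
    raise KeyError(f"no image tagged {tag!r}")

# Sample shaped like the describe-images output shown above:
sample = json.dumps({"imageDetails": [{
    "repositoryName": "node/node-demo",
    "imageDigest": "sha256:74ea9e8924245a010fd254694b6c7667daab06fee986ac7b29960b2110cdaa6d",
    "imageTags": ["latest"],
}]})

print(find_digest_for_tag(sample, "latest"))
```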

From the linked issue it looks like --digestfile won't help, because they say "the --digestfile argument, prints out the 'unuseable' digest, so is not very helpful in the instance".

flouthoc commented 2 years ago

@nathanpeck Thanks for the detailed explanation. I think the solution decided on in https://github.com/containers/buildah/issues/3866 could be useful for your use case: the proposal there is that, on push, the remote digest is added to local storage so it can be used by regular podman commands.

izderadicka commented 2 years ago

Same problem with podman 4.1.1 and Docker Hub, and a similar use case: we need a unique canonical reference to the image after it's pushed to the registry.

rhatdan commented 2 years ago

@mtrmac @vrothberg PTAL

mtrmac commented 2 years ago

Everything @flouthoc said is accurate.

izderadicka commented 2 years ago

Actually, --digestfile finally did work for me. But this was just a learning scenario, and it really feels weird that the calculation of the "digest" is so unstable. If I need a unique reference across several different registries, it can get pretty messy.

mtrmac commented 2 years ago

That’s the trade-off: the digest is much more useful for security / attribution when it includes the compression representation, but that requires care while designing a workflow to preserve that value.

If there are multiple registries, it would almost always be better to podman push only once (compressing the data only once, which is quite time-consuming), and then use skopeo copy --preserve-digests to distribute the created image.
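
That push-once-then-mirror flow can be sketched as the commands it would run. Building argv lists keeps the quoting explicit; skopeo copy --preserve-digests and podman push --digestfile are real, while the registry names, the digest-file path, and the distribution_commands helper itself are placeholders for illustration:

```python
def distribution_commands(local_image: str, first_ref: str,
                          mirrors: list[str]) -> list[list[str]]:
    """Commands for pushing once, then mirroring with digests preserved."""
    cmds = [["podman", "push", "--digestfile", "digest.txt",
             local_image, first_ref]]
    for mirror in mirrors:
        # Registry-to-registry copy; no recompression, so the digest survives.
        cmds.append(["skopeo", "copy", "--preserve-digests",
                     f"docker://{first_ref}", f"docker://{mirror}"])
    return cmds

for cmd in distribution_commands("localhost/node-demo:latest",
                                 "registry1.example.com/repo:latest",
                                 ["registry2.example.com/repo:latest"]):
    print(" ".join(cmd))
```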

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

ergleb78 commented 1 year ago

I'm facing the same problem. The only workaround I found (other than going to the registry and figuring it out there) is to pull the image from the registry and then check its SHA256. Interestingly, after pulling, the image has two different SHAs when I run inspect.

vrothberg commented 1 year ago

@mtrmac should we think of a podman push --preserve-digests?

mtrmac commented 1 year ago

No, that would force us to push an uncompressed image, which users typically don’t want.


podman push --digestfile is the right approach, and I think the only right approach. An image does not have a single digest, it may have an unlimited number of different representations, so podman push && podman inspect is fundamentally incorrect.

Consider that the image may be pushed to registries with different capabilities (e.g. an OCI image might be pushed to a non-OCI registry and might need to be converted), or it may be compressed differently (to one registry with zstd, to another with gzip), or there may be different gzip-compressed versions of the same layer (yes, that happens in practice). When pushing to two registries, registry A might already contain one version and registry B another; in that case podman push, if it knows about those pre-existing layers, uses them instead of an expensive compress-and-upload.
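
The "different gzip-compressed versions of the same layer" point is easy to demonstrate with the stdlib alone: identical input bytes compressed at different settings produce different compressed blobs, hence different blob digests, while still decompressing to the same content:

```python
import gzip
import hashlib

layer = b"identical layer content " * 1024

# The same bytes, two gzip settings -> two different blobs.
blob_stored = gzip.compress(layer, compresslevel=0, mtime=0)  # stored, uncompressed
blob_best = gzip.compress(layer, compresslevel=9, mtime=0)    # maximum compression

digest_stored = hashlib.sha256(blob_stored).hexdigest()
digest_best = hashlib.sha256(blob_best).hexdigest()

print(digest_stored != digest_best)           # True: the blob digests differ...
print(gzip.decompress(blob_stored) == layer)  # ...yet both decode to
print(gzip.decompress(blob_best) == layer)    # exactly the same layer bytes
```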

So it is perfectly possible that

podman push --digestfile digest1 registry1.example.com/repo
podman push --digestfile digest2 registry2.example.com/repo

results in two different digests in digest1 and digest2 (even without any extra options, but also with extra options like --format). Which one of these digests should podman inspect show in the “digest” column? Would that potentially change after each push? How does any of this work reliably with multiple concurrent users / scripts creating images on the same computer?

Just use --digestfile. Unfortunately that might result in a digest that is not resolvable locally (as tracked in https://github.com/containers/buildah/issues/3866 ) but it does reliably work for finding the image on the registry where it was pushed to.
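
Consuming that digest file then amounts to reading one line and pinning the reference by digest. A sketch, with a placeholder repo name and a simulated digest file (the digest value is the one ECR reported in this issue):

```python
from pathlib import Path

def pinned_reference(repo: str, digestfile: Path) -> str:
    """Build a by-digest reference from a `podman push --digestfile` output file."""
    digest = digestfile.read_text().strip()
    return f"{repo}@{digest}"

# Simulate the file `podman push --digestfile digest.txt ...` would leave behind:
path = Path("digest.txt")
path.write_text("sha256:74ea9e8924245a010fd254694b6c7667daab06fee986ac7b29960b2110cdaa6d\n")

print(pinned_reference("registry1.example.com/repo", path))
```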

nathanpeck commented 1 year ago

I don’t see this as a case of whether it’s “fundamentally incorrect” but rather a case of whether you want to be drop-in compatible with Docker. If you want to be drop-in compatible with Docker, then you have to fix this behavior to match Docker. If you say that we need to add a flag and use a digest-file workaround, then that’s not Docker compatible, which is okay as long as that’s what you are going for with Podman.

Fundamentally incorrect or not, this issue is asking for the default behavior to match the Docker CLI.


mtrmac commented 1 year ago

(It’s ambiguous with Docker as well, unless the whole machine (a VM, not just a container, except for Docker-in-Docker nesting) is dedicated to that push and can guarantee “the same” image won’t be pushed, or pulled, by any other user or process. docker image list can easily list multiple entries with multiple digests, possibly for the same repo, with no way to tell which one was just pushed. See below for how there can be several digest values for “the same” image after a push, with no way to disambiguate.)


I guess that’s a fair point. Actually the issue seems to be larger.

Currently (https://github.com/containers/podman/blob/4c399fc6fb2de70e0a197cd81fc3978d586e23ee/cmd/podman/images/list.go#L221-L248), Podman lists one line per tag, plus one line for all untagged references, with .Digest set to c/storage.Image.Digest, which is set at image-creation time, is ~never modified, and is the same for all lines.

Whereas Docker (https://github.com/docker/cli/blob/935df5a59f59d5ee9291627be1b80b3cc7ad2b7e/cli/command/formatter/image.go#L113 ) lists one line per each RepoDigest (possibly listing the same digest for multiple tags in the same repo).

So one part is just that the behavior is clearly different here. I don’t know whether we can / want to change that.

Another is that RepoDigests is not tracked accurately in Podman, IIRC. And yet another, a specific subset of that, is https://github.com/containers/buildah/issues/3866, where setting RepoDigests to include the just-pushed image is a bit of an implementation effort.

rhatdan commented 1 year ago

@mtrmac is this issue still active?

mtrmac commented 1 year ago

@rhatdan The comment just above points at code. Did that code change?

rhatdan commented 1 year ago

Not that I know of, but if this issue cannot be fixed, should we close it?

mtrmac commented 1 year ago

Nothing in the above indicates it “can not be fixed”. A lot of that is just work that has not happened.

Also, Podman maintainers have a decision to make, whether Podman should change how images with multiple digests are presented in the UI.