aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[ECR] [console multi-arch support]: Support viewing multi-architecture images in the ECR Console UI correctly #1596

Open bencooper222 opened 2 years ago

bencooper222 commented 2 years ago

Tell us about your request

Currently, if I push a multi-architecture build to ECR, it shows up in the console as multiple images. Say I pushed an image built for linux/amd64 and linux/arm64 and tagged it as "latest"; three different images would show up in ECR:

  1. the digest for linux/amd64 showing as <untagged>
  2. the digest for linux/arm64 showing as <untagged>
  3. The digest that points to the manifest that points to the images in 1/2 showing as latest

This is the correct behavior, and pulling and interacting with these images via the docker CLI works exactly as expected. However, the ECR UI is extremely confusing because it shows 3 images. Displaying it as one image with multiple architectures would be less confusing and would bring ECR to feature parity with the official Docker repository.

Which service(s) is this request for? ECR's Console

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? It's extremely confusing and looks like we accidentally published 3 images (or worse, pushed images but didn't tag them). It shouldn't take knowledge of Docker's implementation of multi-arch images to understand the ECR UI.

Note that this is more important than it used to be because of the Apple Silicon MBPs that were recently released - we're transitioning to publishing more images with multiple architectures and I imagine others are as well.

Are you currently working around this issue? It's mostly a messaging problem - we have to explain this behavioral quirk as an issue with the UI. It'd be nice if we didn't have to.

clouddev-code commented 2 years ago

I would also like to see image architecture info in the console. Right now, that info is only visible via the following CLI command:

% aws ecr batch-get-image --repository-name aws-for-fluent-bit --registry-id 906394416424 --image-ids imageTag=stable --query 'images[].imageManifest'
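With the default JSON output format, the `--query 'images[].imageManifest'` result above should be a JSON array whose elements are manifests encoded as strings. A minimal sketch of turning that output into a readable platform list, assuming it has been captured into a string; the sample payload, the shortened digests, and the helper name `list_platforms` are illustrative and not taken from a real repository:

```python
import json


def list_platforms(cli_output):
    """Return 'os/architecture' strings for every entry in the printed manifests."""
    platforms = []
    # The CLI prints a JSON array; each element is one manifest encoded as a string.
    for manifest_text in json.loads(cli_output):
        manifest = json.loads(manifest_text)
        # Only an image index / manifest list has a "manifests" array.
        for entry in manifest.get("manifests", []):
            platform = entry.get("platform", {})
            os_name = platform.get("os", "unknown")
            arch = platform.get("architecture", "unknown")
            platforms.append(f"{os_name}/{arch}")
    return platforms


# Made-up index with two platforms, wrapped the way the CLI would print it.
sample_index = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {"digest": "sha256:aaa", "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:bbb", "platform": {"architecture": "arm64", "os": "linux"}},
    ],
}
sample_output = json.dumps([json.dumps(sample_index)])

print(list_platforms(sample_output))
```

In practice the string would come from the command's stdout rather than from a hand-built sample.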

Fydon commented 1 year ago

The console makes it seem that the "untagged" images will be subject to the untagged lifecycle policy, even though they belong to a tagged image. I hope this is not the case.

jledoux-sonergia commented 1 year ago

Same issue here, and it does cause issues with lifecycle policies. For instance, I had set a policy to keep only the 5 latest images in a repository where I frequently push 4 different versions (node 12/14/16/18), but since each pushed image gets split into 3 in ECR (one image index plus 2 images, one of them showing 0MB), the policy removed 2 versions during the night 😢.

To work around this, I had to rebuild and re-push the deleted images and increase the rule's imageCountMoreThan to 12 in every repository. I also can't set rules to remove untagged images anymore, since I'm afraid they might delete one of these untagged split images.

Is there any way to fix this behavior so that only one image is shown again?

lockwobr commented 1 year ago

This post makes me think that it's a UI issue. After doing a little testing myself, I really don't want to have untagged images, but since we started using a multi-arch buildx build process, how to manage images has become an open question for me.

rlaiola commented 1 year ago

The very same issue happens when building and pushing multi-arch images with GitHub Actions to ghcr.io (GitHub Container Registry). The problem is compounded because untagged images (for different platforms) are not removed or replaced if I rebuild the image with the same tag.

bencooper222 commented 1 year ago

> This post makes me think that it's a UI issue. After doing a little testing myself, I really don't want to have untagged images, but since we started using a multi-arch buildx build process, how to manage images has become an open question for me.

Yes, as specified in the original post, it is a UI issue and one AWS should fix.

> The very same issue happens when building and pushing multi-arch images with GitHub Actions to ghcr.io (GitHub Container Registry). The problem is compounded because untagged images (for different platforms) are not removed or replaced if I rebuild the image with the same tag.

The same issue does not occur on GHCR; notice the OS/arch tab on this page.

> The problem is compounded because untagged images (for different platforms) are not removed or replaced if I rebuild the image with the same tag.

I'm not sure how GHCR handles this, but in ECR, if you have a single-arch image tagged as latest and then push a different image with the tag latest, the old image will still be in your ECR as <untagged> (unless you have a lifecycle policy or some sort of cron to delete them). Multi-arch does not change that behavior.

rlaiola commented 1 year ago

> > The very same issue happens when building and pushing multi-arch images with GitHub Actions to ghcr.io (GitHub Container Registry). The problem is compounded because untagged images (for different platforms) are not removed or replaced if I rebuild the image with the same tag.
>
> The same issue does not occur on GHCR; notice the OS/arch tab on this page.

@bencooper222 I might have misunderstood the issue then. Looking at the OS/arch tab on this page, the created images are indeed listed for each architecture.

Screen Shot 2023-05-11 at 14 50 13

However, the image highlighted above (and many others, if not all of the multi-arch images but one) is still considered untagged. To check, view the untagged tab on the All versions page.

Screen Shot 2023-05-11 at 15 03 14

> > The problem is compounded because untagged images (for different platforms) are not removed or replaced if I rebuild the image with the same tag.
>
> I'm not sure how GHCR handles this, but in ECR, if you have a single-arch image tagged as latest and then push a different image with the tag latest, the old image will still be in your ECR as <untagged> (unless you have a lifecycle policy or some sort of cron to delete them). Multi-arch does not change that behavior.

That is the very same behavior on GHCR. To showcase that, you can take a look here. Every time I run the multi-arch build workflow (for instance, when debugging branch changes), new images are created and the old ones are not removed automatically. Even though it is a multi-arch build, the tag (e.g., latest) seems to be directly linked to only one of the images (not all of the multi-arch ones).

Hope this helps clarify things, and please let me know if it is related to the reported issue.

Cheers!

bencooper222 commented 1 year ago

Multi-arch images work like this: a manifest, tagged with latest or whatever your tag is, contains JSON that points to an untagged image for each architecture. That means uploading an image for 2 architectures will create three images:

  1. tagged with latest and containing a manifest (but no actual image layers) that points to digest A and digest B
  2. an untagged image with digest A for architecture A
  3. an untagged image with digest B for architecture B

The point of image 2 and 3 is to be unfindable except via the manifest from image 1.

You can see the manifest if you have docker on your machine by running docker buildx imagetools inspect ghcr.io/runatlantis/atlantis:latest
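The relationship described above is also what a console view would need in order to show one row per tag. A minimal sketch of that grouping, assuming each image is available as a record with `digest`, `tags`, and `manifest` (the manifest as JSON text); the record shape, the sample digests, and the helper name `group_by_index` are assumptions made for illustration and do not describe how the ECR console is implemented:

```python
import json


def group_by_index(images):
    """Nest per-architecture images under the index that references them."""
    by_digest = {image["digest"]: image for image in images}
    rows = []
    claimed = set()
    # First pass: every index becomes one row with its referenced images nested.
    for image in images:
        entries = json.loads(image["manifest"]).get("manifests")
        if entries is None:
            continue  # a plain single-platform image, handled in the second pass
        children = []
        for entry in entries:
            if entry["digest"] in by_digest:
                platform = entry.get("platform", {})
                label = f"{platform.get('os', 'unknown')}/{platform.get('architecture', 'unknown')}"
                children.append({"digest": entry["digest"], "platform": label})
                claimed.add(entry["digest"])
        rows.append({"tags": image["tags"], "digest": image["digest"], "children": children})
    # Second pass: images not referenced by any index stay as their own rows.
    for image in images:
        is_index = "manifests" in json.loads(image["manifest"])
        if not is_index and image["digest"] not in claimed:
            rows.append({"tags": image["tags"], "digest": image["digest"], "children": []})
    return rows


# Made-up repository contents: one tagged index plus two untagged images.
index_manifest = json.dumps({
    "manifests": [
        {"digest": "sha256:aaa", "platform": {"os": "linux", "architecture": "amd64"}},
        {"digest": "sha256:bbb", "platform": {"os": "linux", "architecture": "arm64"}},
    ]
})
plain_manifest = json.dumps({"config": {}, "layers": []})
images = [
    {"digest": "sha256:idx", "tags": ["latest"], "manifest": index_manifest},
    {"digest": "sha256:aaa", "tags": [], "manifest": plain_manifest},
    {"digest": "sha256:bbb", "tags": [], "manifest": plain_manifest},
]

rows = group_by_index(images)
for row in rows:
    print(row["tags"], [child["platform"] for child in row["children"]])
```

With this sample, the three stored images collapse into a single row for `latest` with two nested platforms.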

hobti01 commented 1 year ago

> Same issue here, and it does cause issues with lifecycle policies. For instance, I had set a policy to keep only the 5 latest images in a repository where I frequently push 4 different versions (node 12/14/16/18), but since each pushed image gets split into 3 in ECR (one image index plus 2 images, one of them showing 0MB), the policy removed 2 versions during the night 😢.

Could someone from AWS comment on whether there is a need for another bug report about the Lifecycle Policy treating multi-arch images as untagged? This seems like a more significant bug than displaying correctly in the UI :)

jlbutler commented 1 year ago

Hi all. This issue came to my attention today, apologies for not getting a response to you from the service team sooner.

I think the thread here is speaking to a number of concerns. I'll share some thoughts, which hopefully will respond to the initial suggestion, and allay some concerns that came up in the subsequent thread.

  1. The way multiplatform images are visualized in the ECR console could be improved

This is the initial issue, and I agree. We are working on some enhancements in the console to improve overall visualization workflows; this would be a good one to consider once those changes land. Going to set this under consideration, apologies for it taking so long.

  2. Concerns that LCP rules for untagged content will delete images underlying a tagged multiplatform image

This should not be a concern. As you can see with the eventual response on the linked issue above, if untagged content is referred to in a manifest list (OCI index), LCP will not delete that content.

@jledoux-sonergia apologies I didn't notice your comment sooner - a workable solution would be to set up a rule for tagged content. The untagged content (underlying images) would not be deleted.
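A rule for tagged content, as suggested above, could be sketched as follows. The snippet only builds the lifecycle policy document and prints it as JSON; the tag prefix `node` and the count of 5 are placeholder assumptions, the helper name `keep_latest_tagged` is made up, and the resulting text would still have to be applied to the repository (for example through the console or `aws ecr put-lifecycle-policy`):

```python
import json


def keep_latest_tagged(tag_prefixes, count):
    """Build a lifecycle policy that expires tagged images beyond `count`."""
    return {
        "rules": [
            {
                "rulePriority": 1,
                "description": f"Keep only the {count} most recent tagged images",
                "selection": {
                    # Select tagged content only; untagged images are not matched.
                    "tagStatus": "tagged",
                    "tagPrefixList": list(tag_prefixes),
                    "countType": "imageCountMoreThan",
                    "countNumber": count,
                },
                "action": {"type": "expire"},
            }
        ]
    }


# Placeholder values; adjust the prefix and count to the repository's tagging scheme.
policy = keep_latest_tagged(["node"], 5)
print(json.dumps(policy, indent=2))
```

Because the selection is restricted to tagged images, the untagged per-architecture images are not counted toward the limit, which fits the behavior described in the comment above.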

All this said, there's no rule that the underlying images cannot be tagged themselves. It's just the default behavior of some clients to push them untagged.

Hopefully this is helpful. If anyone has further concerns about LCP, please loop back here. Thanks!

lorenzwalthert commented 1 year ago

I don't see <untagged> images in my ECR; instead, they show - as the tag. I created a reproducible example:

Dockerfile

FROM python:3.11.4

RUN echo 'hi there'

build script


aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/g0m1u8n2

docker buildx build --push --platform="linux/amd64,linux/arm64" -t public.ecr.aws/g0m1u8n2/python:3.11.4 .

Interestingly enough, in the public gallery it is displayed fine, i.e. just one tag: https://gallery.ecr.aws/g0m1u8n2/python. But in my console, I see - as tags:

Screenshot 2023-08-22 at 15 03 41

jlbutler commented 1 year ago

> I don't see <untagged> images in my ECR; instead, they show - as the tag.

This is just how we represent untagged content in the console. The column shows the value of the tag. We used to show the untagged label on all of these, which many customers took to mean there was an issue or something they needed to address. So the simple - indicates that there is no tag to show.

> Interestingly enough, in the public gallery it is displayed fine

This content is a multiplatform image, and using buildx to automatically build and push such an image results in your desired tag landing on the manifest list (OCI Image Index), and the underlying referred images being untagged, and referred to by digest.

I'd say this is displayed correctly in both places, it's just a different visualization. The Public Gallery only shows you the tagged index, whereas in ECR we show you all content, including the untagged images.

johnjeffers commented 1 year ago

@jlbutler I agree that it's good to be able to see the untagged images but it's a lot of visual clutter when most of the time you just want the tag. It'd be nicer imo to have the default view show the tags, and then allow you to expand for more detail if you want to see the related untagged images.

This is a pretty low signal:noise for a single multi-arch image, no?

Something like a disclosure triangle, a show related images button, or similar would be great.

robwilkerson commented 1 year ago

I have all the same signal-to-noise concerns as others, but I'm also seeing that, although I only build for 2 platforms, I get shown 4 untagged images, 2 of which indicate failures.

Screenshot 2023-09-12 at 4 10 27 PM

I see corresponding info when I inspect the manifest:

docker manifest inspect REDACTED.dkr.ecr.us-east-1.amazonaws.com/platinum-image-data
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.oci.image.index.v1+json",
   "manifests": [
      {
         "mediaType": "application/vnd.oci.image.manifest.v1+json",
         "size": 5224,
         "digest": "sha256:421b425f198c0aa2ef2cbaf40ccd3f...",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.oci.image.manifest.v1+json",
         "size": 5224,
         "digest": "sha256:db1c1de4dbed418b427a9b8d1dfeaec...",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.oci.image.manifest.v1+json",
         "size": 566,
         "digest": "sha256:d4c4c8ebfa8b7b0902aacb043...",
         "platform": {
            "architecture": "unknown",
            "os": "unknown"
         }
      },
      {
         "mediaType": "application/vnd.oci.image.manifest.v1+json",
         "size": 566,
         "digest": "sha256:05b70deec55687b92d52f45f4c...",
         "platform": {
            "architecture": "unknown",
            "os": "unknown"
         }
      }
   ]
}

Note that the last 2 items indicate an unknown architecture and OS, presumably aligning with the failed images in the UI list?
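The entries of an index like the one above can be separated into real platform images and `unknown/unknown` entries with a few lines of Python. This is only a sketch: it assumes the `docker manifest inspect` output has been saved as JSON text, the digests in the sample are shortened placeholders, and `split_entries` is an illustrative helper name:

```python
import json


def split_entries(index_json):
    """Split index entries into (platform, digest) pairs and unknown-platform digests."""
    index = json.loads(index_json)
    runnable = []
    unknown = []
    for entry in index.get("manifests", []):
        platform = entry.get("platform", {})
        os_name = platform.get("os", "unknown")
        arch = platform.get("architecture", "unknown")
        if os_name == "unknown" and arch == "unknown":
            # No usable platform information: not an image a client would pull and run.
            unknown.append(entry["digest"])
        else:
            runnable.append((f"{os_name}/{arch}", entry["digest"]))
    return runnable, unknown


# Placeholder index shaped like the output above, with shortened digests.
sample = json.dumps({
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {"size": 5224, "digest": "sha256:aaa", "platform": {"architecture": "amd64", "os": "linux"}},
        {"size": 5224, "digest": "sha256:bbb", "platform": {"architecture": "arm64", "os": "linux"}},
        {"size": 566, "digest": "sha256:ccc", "platform": {"architecture": "unknown", "os": "unknown"}},
        {"size": 566, "digest": "sha256:ddd", "platform": {"architecture": "unknown", "os": "unknown"}},
    ],
})

runnable, unknown = split_entries(sample)
print("platform images:", runnable)
print("unknown entries:", unknown)
```

Comparing the digests in the second list with the rows shown in the console would confirm whether they are the same items.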

sylwit commented 1 year ago

@robwilkerson are you building your images with GitHub Actions (https://github.com/docker/build-push-action)? I had the same issue, and it turns out you need to specify provenance = 'false' to remove those manifests.

robwilkerson commented 1 year ago

@sylwit - We use TravisCI and basically use buildx to push+tag. I'll look into the provenance value a bit to see whether there's anything there that I can leverage in my mechanism.

jledoux-sonergia commented 1 year ago

> @robwilkerson are you building your images with GitHub Actions (https://github.com/docker/build-push-action)? I had the same issue, and it turns out you need to specify provenance = 'false' to remove those manifests.

I've tried adding this option in a GitHub workflow using docker/build-push-action, but I don't see any difference on the ECR side; untagged images still show up.

sylwit commented 1 year ago

@jledoux-sonergia assuming you're building for 2 platforms, you'll see 2 untagged images instead of 4 because this won't push the provenance manifests. You won't see images with a size of 0.

jledoux-sonergia commented 1 year ago

@sylwit I'm actually building for one platform, linux/amd64, and I still get one 0-size image with the provenance=false option.

Here is the relevant part of the workflow step:

      - name: Build and push Docker images
        id: build-image
        uses: docker/build-push-action@v2
        with:
          provenance: false
          context: .
          file: Dockerfile
          platforms: linux/amd64

Should I quote the option value provenance: 'false'?

jledoux-sonergia commented 1 year ago

@sylwit just so you and other people know, this option does not remove 0-size images on ECR (even if its value is quoted as a string).

Provenance is not just a GitHub action option, but a native Docker build option https://docs.docker.com/build/attestations/slsa-provenance/

robwilkerson commented 1 year ago

Adding provenance=false to my buildx command did exactly what I hoped! No more zero-byte images, only the 2 images for the platforms that I'm explicitly building. I'd still prefer if it were visualized differently, but this is better than what I had:

docker buildx build -f "${script_dir}/Dockerfile.${project_slug}" \
                    -t "${aws_repo_uri}:${new_tag}" \
                    -t "${aws_repo_uri}:latest" \
                    --no-cache \
                    --platform="${platforms}" \
                    --provenance=false \
                    --push \
                    .

danielcompton commented 6 months ago

> This should not be a concern. As you can see with the eventual response on the linked issue above, if untagged content is referred to in a manifest list (OCI index), LCP will not delete that content.

@jlbutler I'm seeing the opposite problem. I want to expire a tag prefix (which was set on a manifest) and expected that all of the images referenced by the manifest (untagged) would also be deleted. I checked today and saw that the manifest had been deleted, but all of the untagged images have not been deleted.
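One way to identify the leftovers described here is to look for untagged images that no remaining index references. A minimal sketch, assuming each image is available as a record with `digest`, `tags`, and `manifest` (JSON text); the record shape, the sample digests, and the helper name `find_orphans` are illustrative assumptions, and in practice the records would have to be assembled from ECR API calls:

```python
import json


def find_orphans(images):
    """Return digests of untagged images that no index in the list references."""
    referenced = set()
    for image in images:
        manifest = json.loads(image["manifest"])
        # Only an index / manifest list carries a "manifests" array.
        for entry in manifest.get("manifests", []):
            referenced.add(entry["digest"])
    return sorted(
        image["digest"]
        for image in images
        if not image["tags"] and image["digest"] not in referenced
    )


# Made-up repository state: one surviving index, plus two images whose index was expired.
surviving_index = json.dumps({
    "manifests": [{"digest": "sha256:aaa"}, {"digest": "sha256:bbb"}]
})
plain = json.dumps({"config": {}, "layers": []})
images = [
    {"digest": "sha256:idx", "tags": ["release-2"], "manifest": surviving_index},
    {"digest": "sha256:aaa", "tags": [], "manifest": plain},
    {"digest": "sha256:bbb", "tags": [], "manifest": plain},
    {"digest": "sha256:ccc", "tags": [], "manifest": plain},
    {"digest": "sha256:ddd", "tags": [], "manifest": plain},
]

orphans = find_orphans(images)
print(orphans)
```

Whether the resulting digests are safe to delete depends on the repository; an untagged single-platform image that is still pulled by digest would show up in the same list.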

efenderbosch-atg commented 4 months ago

Just had a repo reach quota because lifecycle rules don't work as expected for multi-platform images. Adding an "untagged" rule now, but this shouldn't be necessary, or at least called out as required in documentation.

bruno-lanconi commented 4 months ago

> Just had a repo reach quota because lifecycle rules don't work as expected for multi-platform images. Adding an "untagged" rule now, but this shouldn't be necessary, or at least called out as required in documentation.

We're facing the same issue right now. How can we apply LCP to multi-platform images without compromising other, non-multi-platform images? If we have a scenario where teams may use the same ECR repository to push both kinds of untagged images, should we avoid LCP and instead run some sort of recurring alternative lifecycle process?

E.g.:


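import json

import boto3

# Configuration
repository_name = 'your-repo-name'
image_index_tag = 'your-image-index-tag'
region = 'your-region'

# Multi-platform manifests come in two media types: Docker manifest list and OCI image index
INDEX_MEDIA_TYPES = [
    'application/vnd.docker.distribution.manifest.list.v2+json',
    'application/vnd.oci.image.index.v1+json',
]

# Initialize ECR client
client = boto3.client('ecr', region_name=region)

def get_image_index_manifest():
    """Fetch the manifest of the tagged image index as a JSON string."""
    response = client.batch_get_image(
        repositoryName=repository_name,
        imageIds=[{'imageTag': image_index_tag}],
        acceptedMediaTypes=INDEX_MEDIA_TYPES
    )
    if not response['images']:
        raise LookupError(f"Image index not found: {response.get('failures')}")
    return response['images'][0]['imageManifest']

def extract_image_digests(manifest):
    """Return the digests of the images referenced by the index."""
    manifest_json = json.loads(manifest)
    # A single-platform manifest has no 'manifests' list, so default to empty
    return [entry['digest'] for entry in manifest_json.get('manifests', [])]

def delete_images(digests):
    """Request deletion of the given digests and return the raw API response."""
    image_ids = [{'imageDigest': digest} for digest in digests]
    return client.batch_delete_image(repositoryName=repository_name, imageIds=image_ids)

def main():
    # Retrieve the Image Index Manifest
    manifest = get_image_index_manifest()

    # Extract Image Digests
    digests = extract_image_digests(manifest)

    # Delete the Images
    if digests:
        response = delete_images(digests)
        # Per-image errors are returned in 'failures' rather than raised
        print(f"Deleted images: {response.get('imageIds', [])}")
        print(f"Failures: {response.get('failures', [])}")
    else:
        print("No images found to delete.")

if __name__ == '__main__':
    main()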

thomasvjoseph commented 3 months ago

- name: Build, tag, and push image to Amazon ECR
  id: build-and-push
  uses: docker/build-push-action@v6
  with:
    context: .
    file: ./Dockerfile
    push: true
    platforms: linux/amd64
    provenance: false
    tags: ${{ steps.login-ecr.outputs.registry }}/test-cache:latest
    cache-from: type=registry,ref=${{ steps.login-ecr.outputs.registry }}/test-cache:cache
    cache-to: type=registry,mode=max,image-manifest=true,oci-mediatypes=true,ref=${{ steps.login-ecr.outputs.registry }}/test-cache:cache

Adding provenance: false resolved the issue for me.

nicl-dev commented 2 months ago

Not only is this visual clutter, but it also makes identifying vulnerabilities in image scans extremely annoying. We are using pull through cache rules to make use of AWS Inspector, and our findings look like this:

CleanShot 2024-08-22 at 10 21 14@2x

As you can see, the impacted resources are all untagged because we pretty much only have multi-arch images in our caches. In its current state, using the AWS Inspector with ECR caches is not very user-friendly. I would love to see a fix for this.

laertispappas commented 1 day ago

This creates issues with AWS Inspector, as @nicl-dev pointed out. Is there a recommended fix for this? We had to explicitly set provenance: false to make it work in docker/build-push-action, as suggested above:

      - name: Build Docker image
        uses: docker/build-push-action@v6
        with:
          context: "."
          file: ${{ inputs.dockerfile }}
          push: ${{ inputs.docker_push }}
          load: ${{ inputs.docker_load }}
          target: ${{ inputs.target }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          provenance: false
         .....