aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[ECR] [Tags]: Exceptions to Tag Immutability (e.g. latest) #878

Open mike-stewart opened 4 years ago

mike-stewart commented 4 years ago


Tell us about your request

Add the ability to specify a whitelist (or regex/pattern) of tags to specifically exclude from the tag immutability feature introduced in #169.

Which service(s) is this request for? ECR

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

Tag immutability has a lot of benefits for reliability, security, and compliance; however, it breaks certain workflows, for example the common use of the latest tag.

Although using latest in production is an anti-pattern, it has multiple valid uses, such as serving as a Docker cache in CI/CD builds, or supporting dev/test workflows such as local development.

Ideally it would be possible to get the benefits of tag immutability in production without needing to sacrifice the flexibility of having some tags remain mutable.

Are you currently working around this issue?

As far as I know there is not a satisfactory workaround to this issue. One solution might be to maintain two registries (one with immutability, one without), but that would double storage costs and other overhead.

Another possibility discussed in #169 is to programmatically untag prior to pushing a new tag, but that would introduce a race condition whereby the tag is not available between untagging and pushing. If latest is being used in builds, for example, this would be a problem.

Additional context

There is some further discussion in #169.

whereisaaron commented 4 years ago

This would be handy. Right now we disable immutability if any tag at all might need to be updated.

While we dropped all use of latest, we do often use 'series versions', which is a common pattern for shared or public images. So while each build has a unique tag, the newest version of 1.2.x is also tagged 1.2. This gives the consumer the option to always pull/check 1.2, or to lock to 1.2.11 and cache that image.

So if we could set a mutable-tag mask regex such as \d+\.\d+, we could support this use case.
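
As a rough illustration of how such a mask could separate series tags from fully qualified versions, here is a minimal Python sketch. It is not an ECR feature: the function name `is_mutable_tag` and the idea of passing a list of patterns are assumptions made for this example. The match is anchored with `fullmatch`, so that `1.2.11` is not treated as mutable just because it contains `1.2`.

```python
import re

def is_mutable_tag(tag: str, patterns: list[str]) -> bool:
    """Return True if the whole tag matches any of the mutable-tag patterns.

    Hypothetical helper: it only illustrates the rule proposed in this
    thread and does not call ECR.
    """
    return any(re.fullmatch(pattern, tag) for pattern in patterns)

# Series tags such as 1.2 stay mutable; unique build tags do not.
series_mask = [r"\d+\.\d+"]
for tag in ["1.2", "1.2.11", "latest"]:
    print(tag, is_mutable_tag(tag, series_mask))
```

A fixed set of names (for example `latest`) could be expressed with the same helper by adding literal patterns to the list.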

spyoungtech commented 4 years ago

This would be amazing for us, especially for use cases beyond latest.

For example, our CI/CD system tags our images in a predefined manner and we'd like only those tags to be strictly immutable, while allowing users to add/mutate any other tags they'd like.

frctnlss commented 3 years ago

As others have said, the ability to retag some versions and not others would be amazing. It would also be amazing to be able to set the number of tags required when creating a new image. Using the example above, where 1.2 is also applied to the newest 1.2.x, we would want to require a minimum of 2 tags. Other examples, such as official language images, also include the base image and base version, so they would set something close to 7 tags on every new image, with only the most specific versions being restricted. But then you could say that having a regex map would allow for a mix of rigidity and flexibility.

billinghamj commented 3 years ago

Tag immutability is also an issue with multi-arch image manifests. If you push a multi-arch image with Docker buildx, it will fail when immutability is enabled :(

shomeprasanjit commented 3 years ago

Any updates on this? We are facing the same issue: we use a Concourse pipeline with the semver resource. Behind the scenes, semver implements the "latest" tag by default, along with the version bumps.

The bottom line is we want to make the ECR repo immutable except for the latest tag.

spyoungtech commented 3 years ago

Some thoughts on workarounds:

You might be able to subscribe a Lambda function to ECR push events to keep track of preexisting tags and undo tag pushes selectively.

As a pseudo example:

def on_event(event, context):
    # get_existing_tags, tag_image, remove_if_untagged and update_existing_tags
    # are placeholder helpers, not real library calls; you would implement them.
    tag = event['detail']['image-tag']
    repository = event['detail']['repository-name']
    digest = event['detail']['image-digest']

    existing_tags = get_existing_tags(repository)
    # check if a tag has been overwritten by this push event
    if tag != 'latest' and tag in existing_tags:
        # revert the change using our existing records
        previous_image_digest_for_tag = existing_tags[tag].digest
        tag_image(repository, previous_image_digest_for_tag, tag)
        remove_if_untagged(repository, digest)  # optional; remove image if no other tags were pushed
    else:  # the tag is new or 'latest'
        # just record this for future enforcement
        update_existing_tags(repository, tag, digest)

However, this still has a race condition similar to the other workaround described, except that it only occurs when a non-compliant tag is pushed: an image pull between the non-compliant push completing and the Lambda putting the tag back where it 'belongs' could return the non-compliant image. In principle, however, pulls of a non-compliant digest could be audited.

The untag/push workaround could probably work better if, for example, you push the layers under a different unique tag (or no tag?), then untag, then push the tag for the new image. Because the layers will exist, the push should happen quickly, reducing the timeframe in which the tag does not exist. However, this workaround requires clients to follow this workflow, whereas the lambda event response workaround shouldn't require any change to client-side behavior.

efenderbosch commented 2 years ago

Another possible workaround is sorting by imagePushedAt and treating that as latest.

latestTag=$(aws ecr describe-images --repository-name myrepo --query "imageDetails[*].{imageTag: imageTags[0], imagePushedAt: imagePushedAt}" | jq --raw-output 'sort_by(.imagePushedAt)[-1].imageTag')
# and use it later
latestFooBarLabel=$(aws ecr batch-get-image --repository-name myrepo --image-id imageTag="${latestTag}" --accepted-media-types "application/vnd.docker.distribution.manifest.v1+json" --query 'images[0].imageManifest'  | jq -r | jq '.history[0].v1Compatibility' | jq -r | jq '.config.Labels["foo.bar"]')

NB: This will only work if the timezone of imagePushedAt is the same for each image. I don't know how that is formatted, maybe someone can clear that up? Uses the offset of the region, maybe?
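
The same selection can be sketched in Python. This assumes entries shaped like the `imageDetails` items returned by `describe-images`, with `imagePushedAt` already parsed into timezone-aware `datetime` objects; the sample data and the function name `newest_tag` are invented for illustration. Timezone-aware datetimes compare by instant, so differing offsets would not change the ordering. Unlike the shell version, this sketch skips untagged images.

```python
from datetime import datetime, timedelta, timezone

def newest_tag(image_details):
    """Return the first tag of the most recently pushed tagged image, or None."""
    tagged = [d for d in image_details if d.get("imageTags")]
    if not tagged:
        return None
    newest = max(tagged, key=lambda d: d["imagePushedAt"])
    return newest["imageTags"][0]

# Invented sample data; the UTC offsets differ on purpose.
sample = [
    {"imageTags": ["1.2.10"],
     "imagePushedAt": datetime(2023, 5, 1, 12, 0, tzinfo=timezone.utc)},
    {"imageTags": ["1.2.11"],
     "imagePushedAt": datetime(2023, 5, 1, 9, 30,
                               tzinfo=timezone(timedelta(hours=-5)))},
    {"imagePushedAt": datetime(2023, 6, 1, tzinfo=timezone.utc)},  # untagged
]
print(newest_tag(sample))
```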

morepe commented 2 years ago

Would also find this quite useful. Is there any update on this issue?

taylorsmithgg commented 2 years ago

+1 This is a quintessential part of the development process with CD tools. We need the ability to have exceptions for various branch tags, PRs, etc. so that we have reliable automatic deployments.

jrista commented 2 years ago

Any word on this feature? There are many exceptions to the rule, such as latest or vM and vM.m versions, which often need to be bumped to match the latest vM.m.r whenever a new image is created. It is great to lock down the vM.m.r tags with immutability, but some tags just need to be dynamic....

joshuabalduff commented 2 years ago

Any word on this feature? (v2)

DeadlyChambers commented 1 year ago

This would be really nice to have. If I can't change a tag, can I just delete the latest tag every time I'm creating a new latest?

taylorsmithgg commented 1 year ago

> This would be really nice to have. If I can't change a tag, can I just delete the latest tag every time I'm creating a new latest?

You can, but you'd have to script it into your automation.

BroderPeters commented 1 year ago

> > This would be really nice to have. If I can't change a tag, can I just delete the latest tag every time I'm creating a new latest?
>
> You can, but you'd have to script it into your automation

Just leaving my solution here. In terms of CodeBuild it feels like the cleanest way to go (from my Terraform YAML buildspec).

  post_build:
    commands:
      - echo Removing latest tag from previous image
      - aws ecr batch-delete-image --repository-name ${image_name} --image-ids imageTag=latest
      - echo Pushing the Docker images...
      - docker push $REPOSITORY_URI:latest

jdrydn commented 1 year ago

@BroderPeters Excellent snippet, thank you!

Just putting my workaround here too:

- docker build -t build:dev .
- docker tag build:dev $ECR_REPO_URL:$COMMIT && docker tag build:dev $ECR_REPO_URL:latest
- docker push $ECR_REPO_URL:$COMMIT
- aws ecr batch-delete-image --repository-name $ECR_REPO_NAME --image-ids imageTag=latest
- docker push $ECR_REPO_URL:latest

This way, if there's an unexpected issue (typically permissions), the script will fail at one of these steps:

  1. Pushing the commit tag to ECR.
  2. Deleting the latest tag in ECR.
  3. Pushing the latest tag to ECR.
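
The fail-fast ordering above can be sketched in Python as well. This is only an illustration under assumptions: the function names `publish_commands` and `publish` are made up here, and the commands are the same `docker` and `aws` invocations from the snippet, run through `subprocess.run` with `check=True` so that the first failing step stops the sequence.

```python
import subprocess

def publish_commands(repo_url: str, repo_name: str, commit: str):
    """Return the argv lists for the workaround, in execution order."""
    return [
        ["docker", "push", f"{repo_url}:{commit}"],
        ["aws", "ecr", "batch-delete-image", "--repository-name", repo_name,
         "--image-ids", "imageTag=latest"],
        ["docker", "push", f"{repo_url}:latest"],
    ]

def publish(repo_url: str, repo_name: str, commit: str, run=subprocess.run):
    """Run each step in order; check=True raises on the first failing command."""
    for argv in publish_commands(repo_url, repo_name, commit):
        run(argv, check=True)
```

Because the commit tag is pushed first, the latest tag is only deleted once the image layers are already in the repository.
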

jlbutler commented 1 year ago

Hi all, I don't see a response on this issue from ECR. Apologies for that, retroactively.

I'm inclined to think that @jdrydn 's solution is a good approach. It doesn't solve for the ask here, but it does embrace the intent of the feature. I'm generally concerned about working around immutability in a repo ad hoc, but don't have any specific concern beyond that.

That said, there are a lot of upvotes on this issue, so obviously it's interesting to customers. Reading back, there are some use cases where I would think using secondary repositories would suffice. But the automation integrations feel like the main use case where this feature is really needed. It seems wasteful to do all of the build workflows in one mutable repo, and then copy final images to an immutable repo. I'll bring this into the team for discussion, focused on this use case.

If anyone has further thoughts, please share here. Otherwise there is plenty to work with. And again apologies for being late to the party. Thanks all!

spyoungtech commented 1 year ago

A problem we ran into with deleting the latest tag in order to push again is that, if your production services pull from the :latest tag, you can get scaling or other failures in ECS/EKS in the time period between deleting the image and pushing the new image, particularly for large images, or cases where the push fails in CI and the image has to be rebuilt and pushed again.

But I suppose that is largely mitigated by pushing with a different tag first (something we didn't do).

jdrydn commented 1 year ago

@spyoungtech I've updated my reply to name it a "workaround" instead of a solution - and yes, pushing an alternative tag (e.g. 1.0.0) mitigates some of the problem you faced, especially around larger images.

The "correct" solution here IMO would be built-in ECR support to allow repeat pushes to a set of allowed tag names (e.g. ["latest", "beta"]) when immutability is enabled, to avoid the missing image errors you described. ECS/EKS have retries and backoffs, but with AWS Lambda it's a straight-up error served as the response, which is less than ideal.

duxing commented 1 year ago

would be nice to have ECR natively support a collection of mutability exclusion tags.

irl-segfault commented 1 year ago

+1, latest should always be mutable

cs-dww commented 11 months ago

@jlbutler --> Best practice would be to allow a set of mutable names OR at least allow latest to be mutable. Latest should always be mutable.

Nice workaround @jdrydn . For our case, we need latest mutable as well.

NeelDigonto commented 9 months ago

@jlbutler

Another way to do it, as mentioned in https://repost.aws/questions/QUN4A5R47CTcGifcCr6mWCnQ/ecr-image-tagging:

Temporarily disable tag immutability, build & upload the container, then re-enable tag immutability.

aws ecr put-image-tag-mutability --repository-name name --image-tag-mutability MUTABLE --region region
docker build -t $ECR_REPO_URL:latest .
docker push $ECR_REPO_URL:latest
aws ecr put-image-tag-mutability --repository-name name --image-tag-mutability IMMUTABLE --region region
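
One risk with the sequence above is that a failed build or push leaves the repository mutable. A minimal Python sketch of a guard against that, assuming an ECR client object that exposes `put_image_tag_mutability` with `repositoryName` and `imageTagMutability` keyword arguments (as the boto3 ECR client does); the name `temporarily_mutable` is made up for this example:

```python
from contextlib import contextmanager

@contextmanager
def temporarily_mutable(ecr_client, repository_name: str):
    """Set the repository to MUTABLE, then always restore IMMUTABLE."""
    ecr_client.put_image_tag_mutability(
        repositoryName=repository_name, imageTagMutability="MUTABLE"
    )
    try:
        yield
    finally:
        # Runs even if the build or push inside the block raises.
        ecr_client.put_image_tag_mutability(
            repositoryName=repository_name, imageTagMutability="IMMUTABLE"
        )
```

With boto3 this might be used as `with temporarily_mutable(boto3.client("ecr"), "name"):` around the build and push steps. It narrows the failure case but does not remove the window in which the repository is mutable.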

This should cause fewer failures while pulling images compared to the workaround that deletes the image tag. On the other hand, it leaves the repo vulnerable for a short time, and it introduces this vulnerability when it's most critical.

Until the requested feature drops, which approach do you prefer?

muellerc commented 5 months ago

+1, latest should always be mutable

kamzil commented 4 months ago

Not everyone uses only "latest" as a reusable tag. We're using environment names, e.g. dev and test, to label images based on the environment they're currently deployed to. So it would be nice to be able to define a set of mutable tags as exceptions to the immutability rule.

mi-laf commented 2 months ago

We'd also love to enable immutability, also because it otherwise pops up in Security Hub (https://docs.aws.amazon.com/securityhub/latest/userguide/ecr-controls.html#ecr-2).

There have been multiple suggestions from checking specifically for latest to providing a configuration option to specify a set of mutable tags or to specify a regex.

Always (automatically) excluding latest from immutability may seem intuitive, but it's not flexible enough and introducing that change now would be a breaking change.

I'd prefer the regular expression. It supports mutable version tags such as 1.2 pointing to 1.2.x (@whereisaaron explained this) or other dynamic mutable tags that are derived from e.g. branch names (mentioned by @taylorsmithgg).

Furthermore, with a regular expression one can easily implement the other, simpler use cases like a fixed set of mutable tags or just matching any tag with "latest" in the name or really just latest.

Any news on this?

userhas404d commented 1 month ago

this would be super useful for implementing things like remote cache support in Amazon ECR for BuildKit clients