aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[ECR] [request]: Support regular expression matching for tags in lifecycle policies #1213

Open ingshtrom opened 3 years ago

ingshtrom commented 3 years ago


Tell us about your request What do you want us to build? I want to use regular expressions to match on tags in my lifecycle policies. For example, we have an image tagged <account_id>.dkr.ecr.us-east-1.amazonaws.com/<image_name>:v1.1.0_test-4db22261f6a2ca5de2cb7eae3382dba32b3676da.

Right now, we need to tag the image as <account_id>.dkr.ecr.us-east-1.amazonaws.com/<image_name>:test-4db22261f6a2ca5de2cb7eae3382dba32b3676da_v1.1.0 so that we can use prefix matching in the lifecycle policy.

Which service(s) is this request for? ECR

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? We want to be able to have lifecycle policies that are more dynamic. Image tags often contain dynamic values such as Git SHAs and SemVer versions, and we cannot match on dynamic strings in lifecycle policies.

Are you currently working around this issue? For existing images, we cannot change the tags. For new images, we can start tagging with a prefix we can filter on, so that we can match them in our lifecycle policies.
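For readers unfamiliar with the prefix-only matching this request is about, here is a minimal sketch of such a lifecycle policy, built in Python and serialized to the JSON that ECR expects. The `test-` prefix and the count of 10 are illustrative assumptions, not values from the original post.

```python
import json


def prefix_expiry_policy(prefix: str, keep: int) -> str:
    """Return lifecycle policy JSON that expires all but the newest `keep`
    images whose tags start with `prefix`."""
    policy = {
        "rules": [
            {
                "rulePriority": 1,
                "description": f"Keep the {keep} newest images tagged {prefix}*",
                "selection": {
                    "tagStatus": "tagged",
                    # Prefix matching only: the dynamic part must come last.
                    "tagPrefixList": [prefix],
                    "countType": "imageCountMoreThan",
                    "countNumber": keep,
                },
                "action": {"type": "expire"},
            }
        ]
    }
    return json.dumps(policy, indent=2)


# "test-" and 10 are assumed example values.
print(prefix_expiry_policy("test-", 10))
```

A tag such as `v1.1.0_test-<sha>` cannot be selected this way because the fixed part (`test-`) is not at the start, which is the limitation described above.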

Additional context Anything else we should know?

Attachments If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)

Malokingi commented 3 years ago

This would be nice. The ability to use regular expressions instead of being restricted to just a "prefix" string for the tag filter on lifecycle policies, that is. I'm currently trying to maintain thousands of images and the way they're currently named, I need to filter by a prefix and a suffix, and a regex option would solve that problem and, probably, countless others.

In the meantime, I guess I'll just have to use lots and lots of ... uhhh, other commands. I'll figure something else out, I'm sure.
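One possible shape for those "other commands" is a client-side filter: list the tags, select them with a regular expression, and delete the matches yourself. The sketch below shows only the selection step; the tag names and the pattern are hypothetical.

```python
import re


def tags_matching(tags: list[str], pattern: str) -> list[str]:
    """Return the tags that fully match `pattern` (a Python regular expression)."""
    compiled = re.compile(pattern)
    return [tag for tag in tags if compiled.fullmatch(tag)]


# Hypothetical tags: only those with the "build-" prefix AND the "-tmp" suffix match.
tags = ["build-101-tmp", "build-102", "release-7-tmp", "build-103-tmp"]
print(tags_matching(tags, r"build-.*-tmp"))  # ['build-101-tmp', 'build-103-tmp']
```

The selected tags could then be passed to ECR's BatchDeleteImage API (for example via boto3's `batch_delete_image`), which is left out here so the sketch stays runnable without AWS credentials.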

davidlukac commented 3 years ago

This would be great. We're using Java-Spring-style versioning of releases (1.0.0.RELEASE) and are currently migrating our Docker registry to ECR, which atm forces me to change the versioning to RELEASE-1.0.0 or something like that, so that we can create lifecycle policies for how many versions of a certain release type we want to keep in the registry. Such a version is not SemVer compatible (and looks terrible), which breaks other tooling in our pipeline.

caladev commented 3 years ago

I wish I could give this 10 thumbs up. When using AWS SAM with the image type for lambda functions, by default it creates an image tag with a combination of the name of the lambda + the dockertag metadata of the image. If your lambda is something like myfunction-5c9aba82d6c9-mydockertag then there's no way to have a lifecycle policy based on the mydockertag at the end unless we had something like regex to do that.

morepe commented 2 years ago

Another thumbs up from me. We have an on-commit deployment with Argo CD. Each commit triggers a build, which produces a new image tagged with the commit hash. These images are only used very briefly, while the tests are executed against them, but there are now tons of images in ECR.

artem-kosenko commented 2 years ago

It would be great to have some "not equal" and/or "not contains" patterns, and to parse all existing tags on the image. I'm OK with adding an extra tag to the images I need and deleting all others without this extra tag.

image [1.0.0-dev.123, 1.0.0-dev, dev] <-- keep images with "dev" tag
image [1.0.0-dev.122] <-- delete all images without "dev" tag
etc.
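The "keep every image that carries a marker tag" idea can be sketched as a check over all tags of each image. This is only an illustration of the requested behaviour, not something ECR lifecycle policies offer; the digests are hypothetical.

```python
def split_by_marker_tag(images: dict[str, list[str]], marker: str = "dev"):
    """Split images (digest -> list of tags) into those that carry `marker`
    as an exact tag and those that do not."""
    keep = {digest: tags for digest, tags in images.items() if marker in tags}
    expire = {digest: tags for digest, tags in images.items() if marker not in tags}
    return keep, expire


# Hypothetical digests; the tag lists mirror the example above.
images = {
    "sha256:aaa": ["1.0.0-dev.123", "1.0.0-dev", "dev"],
    "sha256:bbb": ["1.0.0-dev.122"],
}
keep, expire = split_by_marker_tag(images)
print("keep:", sorted(keep))      # keep: ['sha256:aaa']
print("expire:", sorted(expire))  # expire: ['sha256:bbb']
```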

HenryYanTR commented 2 years ago

thumb up. does anyone actually put "dev" or "prod" in the front of the tag? the matching should support SemVer style tags.

MrMarkW commented 2 years ago

In order to reduce image bloat, we had to create our tags prefixed with an environment name like dev,stg,prd. It is not ideal at all. ECR also doesn't support deleting by how old or last pulled.

maherrj commented 2 years ago

Big thumbs up from me. We have tens of thousands of images across hundreds of repositories, and not having this required us to develop a custom solution. It's been painful, and still is painful, not having this.

maerzhase commented 1 year ago

word! ability to use prefix only is pretty limiting and offering regex seems reasonable. 🐰

jlbutler commented 1 year ago

Hi all! We're looking at making enhancements to LCP rules and we've been tracking this issue. As we start looking into it, we'd like to know if wildcards would meet most needs, or if a more complete regex experience is required.

Understood that prefixes are limited, but typically some sort of schema is used in tagging. It might be simpler to introduce wildcards, but we'd want to know that it would get customers a meaningful value. What do folks think?

vascoalramos commented 1 year ago

For our use case, wildcards are not enough. We need full regular expressions to distinguish dev and prod tags, since we only want to delete older dev tags and keep all prod tags. So we would need regular expressions like the following: ^\d+.\d+.\d+ for prod tags and dev_\d+_^\d+.\d+.\d+ for dev tags.

maherrj commented 1 year ago

Wildcards will work for some of our tag formats e.g. we have inflight release candidates which have SNAPSHOT in the name, so *SNAPSHOT* would work.

However, images with semantic version tags, are considered released images and therefore fall under separate lifecycle rules.

Elsewhere we have tooling which pushes images which we have been able to change to use prefixes.

So a mixed bag. Bottom line: wildcards won't work for us entirely, and regex would be a better fit.

elatt commented 1 year ago

I'd prefer a regex if possible. Currently we have to force a prefix into our tags to denote dev images but ideally we just use the git versioning scheme. So a prod version is just <major>.<minor>.<patch> and all dev releases (those that occur between our git tags) end up with a name like <major>.<minor>.<patch>-<number>-g<sha>.
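To make the distinction concrete, here is a small sketch of how a regular expression separates the two tag shapes described above. The exact patterns are assumptions based on the `<major>.<minor>.<patch>` and `<major>.<minor>.<patch>-<number>-g<sha>` formats, not anything ECR accepts today.

```python
import re

# Assumed patterns for the two tag shapes described in the comment above.
RELEASE = re.compile(r"\d+\.\d+\.\d+")
DEV_BUILD = re.compile(r"\d+\.\d+\.\d+-\d+-g[0-9a-f]+")


def classify(tag: str) -> str:
    """Classify a tag as 'release', 'dev', or 'other' using full-string matches."""
    if RELEASE.fullmatch(tag):
        return "release"
    if DEV_BUILD.fullmatch(tag):
        return "dev"
    return "other"


for tag in ["1.4.2", "1.4.2-7-g3f9c2ab", "latest"]:
    print(tag, "->", classify(tag))
```

Both shapes share the same prefix, so a prefix filter cannot tell them apart; only the presence or absence of the suffix does.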

MrMarkW commented 1 year ago

Please also support by age and since last pulled.

David3Ar commented 1 year ago

I think it is very important to change the way lifecycle policy rules work. They don't offer the same quality as comparable features in other AWS services.

In general, ECR lifecycle policies should be able to handle complex tagging strategies while remaining understandable and easy to set up. It seems like the whole functionality could use a very generous rework with some good features such as sinceLastPulled and keep image.

Btw, the ability to test rules before applying them is a very good feature that also prevents accidental data loss. This is something I especially want to praise!

maherrj commented 1 year ago

I also second the comment above around pruning based on last pull date.

Although, we have seen some bizarre behaviour of last pull date. Will reach out separately on this issue.

Cheers, Rich

maistrotoad commented 1 year ago

I don't think this has been put forward, but for me a useful regex scheme would have each match be treated individually.

My usecase is to be able to keep 1 tag for an image per pull request. E.g. a tag would have a regex prefix like pr-[0-9]{3,4} so if I have these 4 images

pr-001-somehash1
pr-001-somehash2
pr-002-otherhash1
pr-002-otherhash2

I then want to keep

pr-001-somehash2
pr-002-otherhash2

So the latest tag per regex match of the prefix and not end up with only pr-002-otherhash2
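The per-match grouping described here can be sketched as follows: each distinct string matched by the prefix regex becomes its own group, and the newest image in each group is kept. The push timestamps are hypothetical; only the tag names come from the example above.

```python
import re
from datetime import datetime


def latest_per_group(images, group_pattern):
    """images: iterable of (tag, pushed_at) pairs.
    Return the newest tag for each distinct prefix matched by `group_pattern`."""
    compiled = re.compile(group_pattern)
    newest = {}
    for tag, pushed_at in images:
        match = compiled.match(tag)  # anchored at the start of the tag
        if match is None:
            continue
        key = match.group(0)  # e.g. "pr-001"
        if key not in newest or pushed_at > newest[key][1]:
            newest[key] = (tag, pushed_at)
    return sorted(tag for tag, _ in newest.values())


# Timestamps are assumed for illustration.
images = [
    ("pr-001-somehash1", datetime(2023, 1, 1)),
    ("pr-001-somehash2", datetime(2023, 1, 2)),
    ("pr-002-otherhash1", datetime(2023, 1, 3)),
    ("pr-002-otherhash2", datetime(2023, 1, 4)),
]
print(latest_per_group(images, r"pr-[0-9]{3,4}"))
# ['pr-001-somehash2', 'pr-002-otherhash2']
```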

joaocfernandes commented 1 year ago

Hi all! We're looking at making enhancements to LCP rules and we've been tracking this issue. As we start looking into it, we'd like to know if wildcards would meet most needs, or if a more complete regex experience is required.

Understood that prefixes are limited, but typically some sort of schema is used in tagging. It might be simpler to introduce wildcards, but we'd want to know that it would get customers a meaningful value. What do folks think?

Hi 👋🏼

Wildcards would be a huge win compared to prefixes. They would give me some additional flexibility.

To put it in perspective: assuming that last-pulled and image-age criteria are supported, I would prefer to have wildcard matching in 2 months than regex matching in more than a year.

bwmills commented 1 year ago

This would definitely be great.

Our use case is for suffixes on tagged images. We often use SEMVER_env for applications that require an environment designator. Also some use cases for prefix and suffix logic on a per-image basis for tag matching.

Agree with @joaocfernandes

Wildcards would be a huge win compared to prefixes [only]...

Fwiw, one specific example is a busy frontend ECR repo where images need to be tagged with SEMVER_env. We already have a [cost-driven] single rule to control the max number of images kept over time. It would be great to add a second rule that gets applied first, where we match _prod images to ensure that the three latest production images are always kept, even with the max number of images rule getting applied.
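With the wildcard support announced later in this thread, the two-rule setup described above might look roughly like the sketch below. The `*_prod` pattern, the counts, and the assumption that images selected by the first rule are left alone by the second should all be verified with the lifecycle policy preview before use.

```python
import json


def prod_then_cap_policy(keep_prod: int, max_images: int) -> str:
    """Return policy JSON: rule 1 retains the newest `keep_prod` images tagged
    *_prod, rule 2 caps the total number of images at `max_images`."""
    policy = {
        "rules": [
            {
                "rulePriority": 1,
                "description": f"Keep the {keep_prod} newest *_prod images",
                "selection": {
                    "tagStatus": "tagged",
                    "tagPatternList": ["*_prod"],  # assumed tag convention SEMVER_env
                    "countType": "imageCountMoreThan",
                    "countNumber": keep_prod,
                },
                "action": {"type": "expire"},
            },
            {
                "rulePriority": 2,
                "description": f"Cap the repository at {max_images} images",
                "selection": {
                    "tagStatus": "any",
                    "countType": "imageCountMoreThan",
                    "countNumber": max_images,
                },
                "action": {"type": "expire"},
            },
        ]
    }
    return json.dumps(policy, indent=2)


# 3 comes from the comment above; 100 is an assumed cap.
print(prod_then_cap_policy(keep_prod=3, max_images=100))
```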

grbljplat commented 1 year ago

Hi, the ability to enforce a tag naming convention (vN.N.N-env or whatever) on image upload is likely a fundamental requirement for most CI/CD pipeline-based build systems. The current lack of this feature in AWS ECR is a key differentiator... please implement this!

PhoenixRe32 commented 1 year ago

A lot of nice ideas mentioned here but personally I would say if one were to go for an easy win with minimal effort regular expressions would be the one.

In my use case suffixes would suffice (I have never come across prefix versioning personally, so this drove me crazy :-) ), but I feel that wildcards and pattern matching are not too different effort-wise (this is an assumption on my part), and the latter is the more complete solution.

So pattern matching for the win

barryib commented 1 year ago

Hello, in our case we use semantic versioning to tag our images. Since this is not yet supported by LCP, we now have a lot of old images to clean up. We'll probably build a Lambda to do that on a regular basis. This is not ideal at all, since it needs extra work and compute, and it forces us to manage image lifecycles in different tools (in ECR and in a custom Lambda).

@hsejour do you know if this issue is planned? If yes, is there any ETA (a quarter or semester timeframe is enough)?
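As a rough idea of what the core of such a cleanup Lambda might do, the sketch below picks the SemVer-tagged images to expire while keeping the newest N versions. It deliberately ignores pre-release and build metadata and makes no AWS calls; the tag list is hypothetical.

```python
import re

# Plain MAJOR.MINOR.PATCH with an optional "v" prefix; pre-release suffixes are
# deliberately not handled in this sketch.
SEMVER = re.compile(r"v?(\d+)\.(\d+)\.(\d+)")


def semver_tags_to_expire(tags, keep):
    """Return the SemVer tags outside the `keep` highest versions, highest first.
    Tags that are not plain SemVer are never returned."""
    versions = []
    for tag in tags:
        match = SEMVER.fullmatch(tag)
        if match:
            versions.append((tuple(int(part) for part in match.groups()), tag))
    versions.sort(reverse=True)  # numeric order, so 1.10.0 sorts above 1.9.3
    return [tag for _, tag in versions[keep:]]


tags = ["1.2.0", "1.10.0", "1.9.3", "latest", "0.9.0"]
print(semver_tags_to_expire(tags, keep=2))  # ['1.2.0', '0.9.0']
```

Comparing versions as integer tuples rather than strings is the important design choice: a plain string sort would rank 1.9.3 above 1.10.0.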

HaroonSaid commented 1 year ago

We would love it if AWS could solve this problem for all customers. We have repositories replicated across a lot of AWS regions.

hobti01 commented 1 year ago

Hi all! We're looking at making enhancements to LCP rules and we've been tracking this issue. As we start looking into it, we'd like to know if wildcards would meet most needs, or if a more complete regex experience is required.

Understood that prefixes are limited, but typically some sort of schema is used in tagging. It might be simpler to introduce wildcards, but we'd want to know that it would get customers a meaningful value. What do folks think?

The typical tagging schema is semantic versioning. This means that prefix matching is simply inadequate since the distinction between release and build is in the suffix (or specifically the lack of a suffix). Based on our experience with the Harbor registry, where wildcards are available but regex is not - there are real-world use cases where wildcards are simply not adequate.

Even if use cases could be met with multiple wildcard rules, only 50 rules are allowed in each policy. Using regex would cover more use cases per rule and would help users stay within the rule quota.
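The point about releases being identified by the lack of a suffix can be illustrated with a quick comparison. Python's `fnmatch` stands in for wildcard matching here, which is an approximation (ECR's own wildcard syntax may differ in detail); the tags are hypothetical.

```python
import re
from fnmatch import fnmatchcase

tags = ["1.4.2", "1.4.2-rc1", "1.4.2-7-g3f9c2ab"]

# A wildcard can require the dots, but "*" also swallows any suffix.
wildcard_hits = [tag for tag in tags if fnmatchcase(tag, "*.*.*")]

# A regex can require that nothing follows the patch number.
regex_hits = [tag for tag in tags if re.fullmatch(r"\d+\.\d+\.\d+", tag)]

print("wildcard:", wildcard_hits)  # all three tags
print("regex:   ", regex_hits)     # ['1.4.2']
```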

carlosjgp commented 1 year ago

Please prioritise this ticket. The ECR policies are almost useless for us as they are now.

We have this awful Terraform code to build the policy, and even with it, when a project goes over v9 its images are deleted (or the v1 images are, if we adjust the loop):

locals {
  semver_lifecycle_policy = {
    # X.Y.Z or vX.Y.Z: since AWS does not support regex here, one prefix rule per major version is a good-enough approximation of '1\.(.*)'...'999\.(.*)' (and likewise 'v1\.(.*)'...'v999\.(.*)').
    # One lifecycle rule per major version because an image tag needs to match all prefixes in the list to be removed.
    for major in range(0, 10) :
    50 + major => {
      description = "Keep as many images tag with semver as possible",
      selection = {
        tagStatus = "tagged",
        # v.X.Y.Z or X.Y.Z semver
        tagPrefixList = [
          "${var.semver_prefix}${major}."
        ]
        countType   = "imageCountMoreThan",
        countNumber = var.max_image_count # There is a hard limit of 10000 images per repository
      },
      action = {
        type = "expire"
      }
    }
  }

  default_lifecycle_policy = merge(
    local.semver_lifecycle_policy,
    {
      # images tagged with prefix `sha-` will be considered testing images and will disappear after 1 day.
      60 = {
        description = "Keep not released images for 1 day",
        selection = {
          tagStatus     = "tagged",
          tagPrefixList = ["sha-"],
          countType     = "sinceImagePushed",
          countUnit     = "days",
          countNumber   = 1
        },
        action = {
          type = "expire"
        }
      },
      # Any images that have not been marked by higher priority rules will be expired.
      # This includes untagged images, images that have been tagged with a format not expected by any defined lifecycle rules.
      # See https://docs.aws.amazon.com/AmazonECR/latest/userguide/LifecyclePolicies.html#lifecycle-policy-howitworks
      70 = {
        description = "Images without expected tagging will be deleted",
        selection = {
          tagStatus   = "any",
          countType   = "imageCountMoreThan",
          countNumber = 1
        },
        action = {
          type = "expire"
        }
      }
  })
}

At the moment this is a pain and we would soon be building up our own script to manage the image lifecycle
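Why tags above v9 fall through can be seen by simulating the prefix check. The sketch assumes an empty `semver_prefix` and the `range(0, 10)` loop from the Terraform above; a tag that matches none of the generated prefixes is left to the catch-all rule (priority 70).

```python
def first_matching_prefix(tag, prefixes):
    """Return the first prefix that `tag` starts with, or None if there is none."""
    for prefix in prefixes:
        if tag.startswith(prefix):
            return prefix
    return None


# Prefixes generated for range(0, 10) with an empty semver_prefix (assumed).
prefixes = [f"{major}." for major in range(0, 10)]

for tag in ["9.4.1", "10.0.0"]:
    print(tag, "->", first_matching_prefix(tag, prefixes))
# 9.4.1 -> 9.
# 10.0.0 -> None
```

Because "10.0.0" matches no semver rule, it is treated like any other unexpectedly tagged image.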

bwmills commented 1 year ago

@ingshtrom Hi Alex, any updates on this?

Our need for this continues to grow - it would be incredibly helpful in managing a fairly large number of ECR repos in AWS.

Ah excuse me, I see the label changed to in progress a few days ago - great to see and thank you

OJOMB commented 1 year ago

also need this

jufemaiz commented 1 year ago

👀

arareko commented 1 year ago

@hsejour Can you provide a status/ETA on this? Thanks!

bwmills commented 1 year ago

No one is assigned?

Does anyone know the status?

rafavallina commented 1 year ago

Hi everyone. Just a quick notice from the ECR team as I notice that we went silent on this issue for quite a bit.

We continue to work on support for wildcards in lifecycle policies, and we plan to release it before the end of the year! As you all are probably used to, I'm not committing to this, but I'm quite confident that it will be out there soon.

I have definitely heard that wildcards are not enough for everyone, and we want to continue working on improving LCPs, including RegEx support, SemVer support, and last date pulled. However, I do not have more to share about what will be done or when. Just note that feedback is being heard and we are using it to adjust our roadmap!

HaroonSaid commented 11 months ago

We were expecting a big announcement at re:Invent.
Are there any more updates on feature development? We're looking for approximate timelines to determine whether it's 2024 or beyond.

rafavallina commented 11 months ago

Hi all - wildcards for LCP are live. A "just-after-reinvent" launch!

https://aws.amazon.com/about-aws/whats-new/2023/12/amazon-elastic-container-registry-wildcards-lifecycle-policies/

I'm keeping this item open since the original ask is for regular expressions, which is more than wildcards. But I hope this will help some of you make progress! Thanks for your patience

vchirikov commented 9 months ago

@rafavallina / @HaroonSaid it doesn't work. I took the policy JSON from the AWS console, tried to apply it with Terraform provider version 5.33.0, and got:

creating ECR Lifecycle Policy (***): InvalidParameterException: Invalid parameter at 'LifecyclePolicyText' failed to satisfy constraint: 'Lifecycle policy validation failure: instance value ("tagged-wildcard") not found in enum (possible values: ["tagged","untagged","any"])

When I tried to import the policy through the AWS console, I got this: [image]

So wildcards don't work well (they worked only if you add rules one by one, not via JSON import).

rafavallina commented 9 months ago

@vchirikov Can you try to use 'tagged' in the 'tagStatus' field? That field is only to specify if the images must be tagged. You use the 'tagPatternList' field to indicate that we are doing a wildcard matching.

vchirikov commented 9 months ago

I checked the network requests from the UI and it uses the tagged tagStatus, but if I view the policy as JSON it shows tagged-wildcard. Proof: [image]

I'll try to use tagged tomorrow, thanks.
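Until the console view and the API agree, one possible workaround is to normalize the exported JSON before handing it to Terraform or the API. The sketch below assumes the only difference is the `tagStatus` value reported in this exchange; that assumption was not confirmed in the thread.

```python
import json


def normalize_console_policy(policy_text: str) -> str:
    """Rewrite tagStatus "tagged-wildcard" (as shown by the console's JSON view)
    to "tagged", the value the API's validation accepts."""
    policy = json.loads(policy_text)
    for rule in policy.get("rules", []):
        selection = rule.get("selection", {})
        if selection.get("tagStatus") == "tagged-wildcard":
            selection["tagStatus"] = "tagged"
    return json.dumps(policy, indent=2)


# Hypothetical console export with a single wildcard rule.
exported = json.dumps({
    "rules": [{
        "rulePriority": 1,
        "selection": {
            "tagStatus": "tagged-wildcard",
            "tagPatternList": ["*SNAPSHOT*"],
            "countType": "imageCountMoreThan",
            "countNumber": 5,
        },
        "action": {"type": "expire"},
    }]
})
print(normalize_console_policy(exported))
```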

celorodovalho commented 7 months ago

Please, prioritize this issue.

typeBlkCofe commented 5 months ago

Please prioritize this issue. It should take only a few lines of code to enable regex filtering, and this has been open since 2021!

deanb-everc commented 3 months ago

please prioritize this issue, it's a real pain for us

pierluigilenoci commented 3 weeks ago

@tabern @ingshtrom @rpnguyen @hsejour I was thinking that 1387 days could be enough time to start scheduling the fix for a bug that has affected development of a feature requested by at least 481 users (to date).

pierluigilenoci commented 3 weeks ago

@joebowbeer @mikestef9 @maishsk @plan-do-break-fix FYI

elisiariocouto commented 3 weeks ago

I was thinking that 1387 days could be enough time to start scheduling the fix for a bug that has affected at least 481 users (to date).

What do you mean? You don't even need this! You can easily achieve this by using the following products:

/s

mieliespoor commented 3 weeks ago

What do you mean?... /s

And then $$$$$$$$$$$$$$$$$$$$$$$$$$

[and I did notice the /s]

pierluigilenoci commented 3 weeks ago

@elisiariocouto 🤣

However, given the recent activity on this repository, I fear that posting here is like sending a message in a bottle.

[image]

pierluigilenoci commented 3 weeks ago

But maybe...

[image]

pierluigilenoci commented 3 weeks ago

I forgot to tag @rafavallina to get some insight! 🚀