aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[ECR] [request]: add "sinceImagePulled" countType to ECR Lifecycle policy #921

Open mattmessinger opened 4 years ago

mattmessinger commented 4 years ago


Tell us about your request
Add a new sinceImagePulled countType to ECR Lifecycle policy.

Which service(s) is this request for?
ECR

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
I would like to create an ECR Lifecycle policy that is based on when an image was last pulled. I could use such a policy to infer that if an image has not been pulled in the last N months, it is not being used and I can safely delete it.
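To make the request concrete, here is a minimal sketch of what such a policy document might look like. The overall shape (rules, selection, countUnit, countNumber, action) follows the existing lifecycle policy format; the "sinceImagePulled" value is the hypothetical countType proposed in this issue and is an assumption, not something the thread confirms ECR accepts. The helper name build_pull_based_policy is also made up for illustration.

```python
import json


def build_pull_based_policy(days: int, priority: int = 1) -> dict:
    """Build a lifecycle policy document that uses the *proposed* countType."""
    return {
        "rules": [
            {
                "rulePriority": priority,
                "description": f"Expire images not pulled in {days} days",
                "selection": {
                    "tagStatus": "any",
                    # Hypothetical value requested in this issue. The thread
                    # only mentions "sinceImagePushed" as an available option.
                    "countType": "sinceImagePulled",
                    "countUnit": "days",
                    "countNumber": days,
                },
                "action": {"type": "expire"},
            }
        ]
    }


if __name__ == "__main__":
    # Print the policy as JSON, the format lifecycle policies are written in.
    print(json.dumps(build_pull_based_policy(90), indent=2))
```

Swapping "sinceImagePulled" for "sinceImagePushed" gives a policy of the kind that is available today, which is why the request amounts to one additional countType value.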

Are you currently working around this issue?
Right now we have to carefully track which images are still in use by our various build and deployment systems. This is error-prone and leads to accidental deletion of images that are still being used.

git-pchauhan commented 4 years ago

👍 Would love to see this getting prioritized (and am kinda surprised it's not already there!). 👍

tata9001 commented 3 years ago

Waiting for this!!!

daniel-baptista-travcorp commented 3 years ago

:+1: Yes please

git-pchauhan commented 3 years ago

Totally ➕ 1️⃣

azamsiddiqi3791 commented 3 years ago

Yes please

shashankvs01 commented 3 years ago

Please prioritize !!

sbkg0002 commented 3 years ago

Any news on this?

tata9001 commented 3 years ago

Please, our billing is crying!!!

qihonggang commented 3 years ago

waiting for this! please prioritize!

sbkg0002 commented 2 years ago

Any update on this? It has been in "We're Working On It" for half a year :)

snay2 commented 2 years ago

I opened a new issue to publish metrics for a similar use case, in case it's useful to anyone here: https://github.com/aws/containers-roadmap/issues/1587

wayne-folkes commented 2 years ago

Adding my voice to the chorus on this one. My team is pushing multi-arch images. Because of this, images are shown as untagged. Having a policy that simply deletes untagged images would be dangerous, as I have no way to know whether an image is safe to delete. Knowing that an image has not been pulled in the last N days would give us some confidence that we are deleting unused resources.

sbkg0002 commented 2 years ago

@arunsollet what happened? The metrics seem to be there now!

snay2 commented 2 years ago

@sbkg0002 I see a metric for RepositoryPullCount in the docs for ECR private (released in January 2022), but not one that describes how recently an image was pulled. Can you give more detail of what you're seeing?

ivan-moto commented 1 year ago

Hey, any updates on this?

volk1234 commented 1 year ago

Anybody looking into this ???

jdkealy commented 1 year ago

Any updates on this ?

jlbutler commented 1 year ago

Hi all. We are tracking a lastRecordedPullTime but have not yet done work to integrate it into LCP. One concern we have is that while it's one piece of data, it doesn't necessarily indicate an image is safe to expire if it hasn't been pulled in some amount of time.

We were doing some work on a method to track whether or not a particular image is specified in a current deployment specification. As you may suspect that is a large bit of work and will take time to fully understand.

Given the upvotes on this issue and interest, we will pull it into consideration for our current round of planning. Thanks for the continued interest and input!

ivanychev commented 1 year ago

Any progress on this?

maherrj commented 1 year ago

100% second this. We have thousands of images across hundreds of repositories. We provide the service to our production consumers. We need to remove images, but must be careful not to cause an outage where an image is in use.

We had to develop a custom solution to tag the images based on CloudTrail events. Pretty horrible workaround.

24601 commented 1 year ago

100% second this. We have thousands of images across hundreds of repositories. We provide the service to our production consumers. We need to remove images, but must be careful not to cause an outage where an image is in use.

We had to develop a custom solution to tag the images based on CloudTrail events. Pretty horrible workaround.

Us, too. But expecting AWS to do anything that saves their customers money is, well, not something I am holding my breath for.

aviau commented 1 year ago

@jlbutler

One concern we have is that while it's one piece of data, it doesn't necessarily indicate an image is safe to expire if it hasn't been pulled in some amount of time.

For us that wouldn't be an issue as we pull often. It covers many use cases so why not just release it while you build whatever more advanced feature you want to build?

We were doing some work on a method to track whether or not a particular image is specified in a current deployment specification.

That won't work for many use cases because not everyone that uses ECR has deployments inside AWS.

jlbutler commented 1 year ago

Hi @aviau

For us that wouldn't be an issue as we pull often. It covers many use cases so why not just release it while you build whatever more advanced feature you want to build?

For sure. I was just calling out a concern that I continue to have about leveraging this value on its own to indicate that an image is not in use. But as I indicated, we definitely are planning some work to improve the usefulness of the attribute (it currently tracks manifest pulls, which can make things confusing if you really want to know if the image was pulled including its layers), and we're looking at integrating this into LCP. We are still working on our roadmap, but we'll share more when there's something concrete.

That won't work for many use cases because not everyone that uses ECR has deployments inside AWS.

Yep it gets a bit tricky. Like you said, not everyone uses ECR on AWS, and not everyone using ECR deploys on AWS. The work I referred to may not be ECR-specific, potentially working for any image digest. Part of that could be opt-in solutions (e.g. a Kubernetes controller you can install to report image use). I don't think we can serve all use cases, but we're doing some research around that now, and don't have any formal features planned with it yet.

We'll post back here when we have a more concrete timeline for this LCP request, thanks again!

mgarber-ops commented 1 year ago

I can see this being a nice-to-have, but I'd be careful in situations where underlying EKS nodes are caching container images for their respective workloads.

jobimrobinsantos-drizly commented 1 year ago

This feature would be extremely handy for my organization. In particular, we would like to implement this type of lifecycle on our pull-through-cache repositories.

prashil-g commented 1 year ago

This is a very important feature to have. Any update on whether anyone is looking into it?

volk1234 commented 1 year ago

Well, I have been waiting for this for years, but I believe that features that really help save costs are not a priority at all :)

volk1234 commented 1 year ago

@jlbutler Any updates on the research you mentioned?

seabyrn commented 10 months ago

Hi all. We are tracking a lastRecordedPullTime but have not yet done work to integrate it into LCP. One concern we have is that while it's one piece of data, it doesn't necessarily indicate an image is safe to expire if it hasn't been pulled in some amount of time.

Doesn't "sinceImagePushed" (which is available in LCPs) suffer from the same shortcoming?

abhishekkundalia commented 10 months ago

Any plans to push this or any alternative hacks to achieve this?

blowfishpro commented 10 months ago

any alternative hacks to achieve this?

It's possible to get the last pull time via API calls and then explicitly delete those images. This could be done e.g. by a lambda that runs periodically.
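A minimal sketch of that workaround, assuming boto3 and a caller with permission to describe and delete images. It reads lastRecordedPullTime from describe_images and removes stale images with batch_delete_image. The function names, the dry_run flag, and the fallback to imagePushedAt for images that were never pulled are assumptions made for illustration, not something described in this thread.

```python
from datetime import datetime, timedelta, timezone


def select_stale_images(image_details, max_age_days, now=None):
    """Return imageIds whose last pull (or push, if never pulled) is too old."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for detail in image_details:
        # Assumption: an image that was never pulled is judged by its push time.
        last_used = detail.get("lastRecordedPullTime") or detail.get("imagePushedAt")
        if last_used is not None and last_used < cutoff:
            stale.append({"imageDigest": detail["imageDigest"]})
    return stale


def expire_stale_images(repository_name, max_age_days, dry_run=True):
    """List stale images in one repository and optionally delete them."""
    import boto3  # imported lazily so the selection logic runs without AWS access

    ecr = boto3.client("ecr")
    details = []
    for page in ecr.get_paginator("describe_images").paginate(
        repositoryName=repository_name
    ):
        details.extend(page["imageDetails"])

    stale = select_stale_images(details, max_age_days)
    if not dry_run:
        # batch_delete_image accepts at most 100 image IDs per call.
        for start in range(0, len(stale), 100):
            ecr.batch_delete_image(
                repositoryName=repository_name,
                imageIds=stale[start:start + 100],
            )
    return stale
```

A scheduled Lambda could call expire_stale_images("my-repo", 90, dry_run=False), where "my-repo" is a placeholder name. Keep in mind the caveat raised earlier in the thread: the attribute currently tracks manifest pulls, and nodes that cache images may not pull at all, so a dry run first is the safer choice.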

nicc777 commented 4 months ago

Almost 4 years, for what I would have thought must be an obvious option to have. Can we perhaps get some kind of an update on this? @jlbutler ?

Also, the comment "One concern we have is that while it's one piece of data, it doesn't necessarily indicate an image is safe to expire if it hasn't been pulled in some amount of time." does not make sense, since I would argue sinceImagePushed leaves you with the same issue, if not even worse.

I am now busy implementing this on our side, with a whole lot of logic just to refresh an image before it expires under the standard sinceImagePushed option. Having this option would hugely simplify our container management.

sammcj commented 3 months ago

I'm seeing clients getting impacted by not being able to check if an image is in use by ECS. I came looking to see how to do this and found this issue. It is badly needed.

sbkg0002 commented 3 months ago

Until someone writes a blogpost that goes viral, which describes the costs involved, nothing will happen apparently.

Hronom commented 3 weeks ago

AWS, how about this proposal:

Step 1

Introduce last-pull-datetime per image like in DockerHub: https://docs.docker.com/docker-hub/api/latest/#tag/repositories/paths/~1v2~1namespaces~1%7Bnamespace%7D~1repositories~1%7Brepository%7D~1tags~1%7Btag%7D/get

Step 2

Modify Match criteria and add the following option:

  1. Since image last pulled, specified in days, the same as for Since image pushed

Step 3

Add to Lifecycle Policies the option to select Sorting criteria after Match criteria.

Inside Sorting criteria, offer several options:

  1. Unspecified (Default)
  2. By image push date
  3. By image last pull date

Frontend logic:

  1. If the user selects Since image pushed in Match criteria, gray out or hide Sorting criteria
  2. If the user selects Since image last pulled in Match criteria, gray out or hide Sorting criteria
  3. If the user selects Image count more than in Match criteria, show Sorting criteria

Backend logic:

  1. If the user selects Since image pushed in Match criteria, do what you do right now
  2. If the user selects Since image last pulled in Match criteria, expire all images whose last pull date is older than the specified number of days
  3. If the user selects Image count more than in Match criteria, apply the selected Sorting criteria first and then expire the docker images that fall beyond the specified Image count more than limit
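The three backend cases above can be sketched as a single selection function. This is only an illustration of the proposal, not how ECR evaluates policies: the field names (digest, pushed_at, last_pulled_at), the match and sort_by strings, treating Unspecified as push date, and falling back to push date for images that were never pulled are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def images_to_expire(images, match, value, sort_by=None, now=None):
    """Return the images the proposed backend logic would expire.

    images: list of dicts with 'digest', 'pushed_at', and 'last_pulled_at'
    (None if the image was never pulled). value is a number of days for the
    two "since" criteria and an image count for "image_count_more_than".
    """
    now = now or datetime.now(timezone.utc)

    def last_pull(image):
        # Assumption: a never-pulled image counts as last used when pushed.
        return image["last_pulled_at"] or image["pushed_at"]

    if match == "since_image_pushed":
        cutoff = now - timedelta(days=value)
        return [i for i in images if i["pushed_at"] < cutoff]

    if match == "since_image_last_pulled":
        cutoff = now - timedelta(days=value)
        return [i for i in images if last_pull(i) < cutoff]

    if match == "image_count_more_than":
        # Sort newest first, keep the first `value` images, expire the rest.
        key = last_pull if sort_by == "last_pull_date" else (lambda i: i["pushed_at"])
        ranked = sorted(images, key=key, reverse=True)
        return ranked[value:]

    raise ValueError(f"unknown match criteria: {match}")
```

With sort_by="last_pull_date" and value=10, the ten most recently pulled images are retained regardless of age, which is the protection for rarely pulled images that the proposal aims for.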

Result

Users would be able to apply simple expiration by image pull date, based on Since image last pulled from Match criteria.

If users want protection, they can combine Match criteria with Sorting criteria and set something like this: retain the 10 most recent docker images. This helps avoid problems if you have docker images that are not frequently pulled.