aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[ecr] [request]: cache layers between repositories #531

Open jespersoderlund opened 4 years ago

jespersoderlund commented 4 years ago

Tell us about your request

Currently ECR doesn't cache image layers between repositories, and with ECR's model of creating a repo per image this leads to quite poor performance, especially when many images are built on a common base image.

Each image is uploaded separately at its full size, so it takes significantly longer to push/replicate. This becomes an issue with a globally distributed architecture where hundreds of services built on a common base image need to be replicated to remote regions.

In most other Docker repositories the model is different, with a single repository serving multiple images.

Which service(s) is this request for?

ECR

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

We want quick push/replication across regions of images used for microservices built on a common base image (which becomes layers in the service-specific images).

Are you currently working around this issue?

Trying to parallelise requests as much as possible, but this causes unnecessary cost and still won't be as quick as if the shared layers could be reused across repos.

jswetzen commented 4 years ago

This is a real pain point for us. Our current workaround is to separate images using tags instead of repositories, but it's ugly! Docker Hub and Artifactory solve this without any problem and skip the unnecessary upload if the layer already exists.

richstokes commented 3 years ago

This would save a lot of bandwidth too, by not having to push/pull the same base layers every time.

slimshreydy commented 3 years ago

Any news on this? Would love to be able to use this.

tjb801 commented 3 years ago

In my opinion, ECR is not usable without this feature.

lkishalmi commented 2 years ago

Come on guys! GCP has this!

hipstern commented 1 year ago

Any updates on this?

sergiy-kozak commented 1 year ago

Seriously, how can ECR still be missing such an essential feature in 2022?

jessefarinacci commented 1 year ago

Seriously, how can ECR still be missing such an essential feature in 2022?

I'm trying not to be cynical about it, as AWS does benefit financially from this inefficiency, but when will this get accepted to the upcoming delivery schedule? We just passed this issue's 3rd anniversary and we're rapidly approaching ECR's 7th anniversary, and yet this basic level of layer caching is still missing.

Is some policy-based security requirement, with complex inter- and intra-account cross-repository access, delaying this? Cross-region replication woes? Any of those types of problems would be understandable. Official information or direction about this would be very helpful at this point. Thanks!

four43 commented 1 year ago

Hello from 2023 👋

Layers and caches are essential to containers to save storage, bandwidth, cost, and time. This has been a critical component of container architecture since the beginning.

matthenry87 commented 1 year ago

So far I haven't been able to determine that they're using actual disk space usage versus just multiplying the # of images by their individual sizes (without taking layer re-use into account within individual repositories).

This is so long overdue. Don't charge us multiple times to store the same layers within one registry, especially for the base images.

hfawaz commented 10 months ago

We are ignoring this @aws ?

BroMattMiller commented 8 months ago

Just opened a Support ticket on this, and was directed here to +1. So, +1.

blowfishpro commented 8 months ago

So far I haven't been able to determine that they're using actual disk space usage versus just multiplying the # of images by their individual sizes

You could inspect the response when you attempt to download the blobs (or invoke the GetDownloadUrlForLayer API directly). It returns a redirect to an S3 presigned URL, so you could check whether the URLs for the same layer in two repositories resolve to the same S3 object.
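A minimal Python sketch of that check, under stated assumptions: it uses boto3's `get_download_url_for_layer` call to fetch the URL, and it treats two presigned URLs as pointing at the same object when their host and path match, ignoring the query string (which carries the signature and expiry). Whether ECR's URLs are actually laid out so that this comparison is meaningful is an assumption, and the helper names and the repository names in the usage comment are hypothetical.

```python
from urllib.parse import urlsplit


def s3_object_identity(presigned_url: str) -> tuple[str, str]:
    """Return (host, path) of a URL, dropping the query string.

    Presigned URLs carry their signature and expiry in the query string,
    so two URLs for the same object can differ there.
    """
    parts = urlsplit(presigned_url)
    return (parts.netloc.lower(), parts.path)


def same_s3_object(url_a: str, url_b: str) -> bool:
    """Assumption: matching host and path means the same S3 object."""
    return s3_object_identity(url_a) == s3_object_identity(url_b)


def layer_download_url(repository: str, layer_digest: str) -> str:
    """Ask ECR for one layer's download URL (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without it

    ecr = boto3.client("ecr")
    response = ecr.get_download_url_for_layer(
        repositoryName=repository, layerDigest=layer_digest
    )
    return response["downloadUrl"]


# Usage (hypothetical repository names; digest is a layer both images share):
# url_a = layer_download_url("service-a", digest)
# url_b = layer_download_url("service-b", digest)
# print(same_s3_object(url_a, url_b))
```

If the two URLs differ in host or path, that would suggest each repository keeps its own copy of the layer, although it would not prove it.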

systematicguy commented 8 months ago

Because each repo can be encrypted with its own key, I am 99% sure they really don't share images. I believe this repo-level granularity is the major blocker to cross-repo layer sharing, btw.

four43 commented 8 months ago

KMS keys are configured per-repository.

I believe many of us are hoping for sharing of layers between images within the same repository - a core feature of docker and seemingly a cheap money grab if not actually implemented that way.

systematicguy commented 8 months ago

Beware of the terminology. I was coming from the Artifactory world and was confused for a long time until I realized that repo and registry are not the same thing in ECR.

KMS keys can be set up per ECR repo. All your private ECR repos together make up your single, per-account private ECR registry.

This is not the same as in Artifactory, for example, where you can have multiple registries and layer sharing is inherent within each.

In ECR you have to proactively configure each repo (tag immutability, KMS key, etc.), whereas in Artifactory you just push whatever image name you like to a registry.

systematicguy commented 8 months ago

My point is: say you have 50 ECR repos where you push Ubuntu-based images. You will need to store the base image 50 times! Yes, within one repo the layer will be reused, but not across the others, because each repo can have its own configuration, IAM access, etc.
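To put rough numbers on the 50-repo example, here is a small Python sketch. The layer sizes (an 80 MB base image and 20 MB of service-specific layers) and the one-image-per-repo setup are made-up assumptions for illustration, not measurements of ECR.

```python
def stored_bytes(repos: int, base_bytes: int, app_bytes: int,
                 share_across_repos: bool) -> int:
    """Total storage for one image per repo, all built on the same base."""
    # Without cross-repo sharing, every repo keeps its own copy of the base.
    base_copies = 1 if share_across_repos else repos
    return base_copies * base_bytes + repos * app_bytes


MB = 1024 * 1024

# Hypothetical sizes: 80 MB base image, 20 MB of service-specific layers.
per_repo = stored_bytes(50, 80 * MB, 20 * MB, share_across_repos=False)
shared = stored_bytes(50, 80 * MB, 20 * MB, share_across_repos=True)

print(per_repo // MB, shared // MB)  # prints "5000 1080"
```

Under these assumed sizes, storing the base once instead of 50 times cuts the total from about 5000 MB to about 1080 MB, and the same ratio applies to what has to be pushed or replicated.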

I'm not saying this is necessary or good, just sharing my understanding. I would also be happy to drop this border between repos in favor of multiple registries and layer reuse within each one.

four43 commented 8 months ago

I understand what you're saying. I agree that your use case is even broader and may not be possible due to the limitations you explained. The core of this issue is layer re-use within a repository, which is a step before cross-repository layer re-use, IMO.

Terms used are AWS terms, since this is an AWS issue.

dene14 commented 3 months ago

Hello from 2023 👋

Layers and caches are essential to containers to save storage, bandwidth, cost, and time. This has been a critical component of container architecture since the beginning.

Hello from 2024 👋 Time is passing, and AWS continues to make money from its own inefficiency.

golosegor commented 1 month ago

How can I vote for this?

emalihin commented 2 weeks ago

Is the answer here really what's going on? If so, I'm out of luck trying to copy ECR images across accounts/repos efficiently.