Open mmerickel opened 3 years ago
FWIW, the recommendation from AWS support at this time is to define a launch template on the managed node group and write /etc/docker/daemon.json yourself with the config you need. EKS also writes its own content to that file, so you'll have to take that into account as well.
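Since EKS writes its own content to daemon.json, overwriting the file blindly could drop settings the node needs. Below is a minimal sketch of merging a mirror into the existing content rather than replacing it. It assumes the standard Docker `registry-mirrors` key; the function name and the mirror URL are hypothetical, and how the script gets invoked from launch template user data is left out.

```python
import json


def merge_registry_mirrors(existing_text: str, mirrors: list[str]) -> str:
    """Return daemon.json text with `mirrors` added to "registry-mirrors".

    Keys already present in the file (for example, settings written by
    the node bootstrap) are preserved; duplicate mirror URLs are skipped.
    """
    # Treat an empty or whitespace-only file as an empty config.
    config = json.loads(existing_text) if existing_text.strip() else {}
    merged = list(config.get("registry-mirrors", []))
    for url in mirrors:
        if url not in merged:
            merged.append(url)
    config["registry-mirrors"] = merged
    return json.dumps(config, indent=2, sort_keys=True) + "\n"
```

A caller would read /etc/docker/daemon.json, pass its text together with something like `["https://mirror.example.com"]` (a placeholder URL), write the result back, and restart Docker so the change takes effect.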
Note that newer EKS clusters use containerd, so you'd need to make the changes elsewhere:
a) Add appropriate content to /etc/containerd/certs.d/docker.io/hosts.toml, pointing registry-1.docker.io at the registry mirror
b) If there is auth, add appropriate imagePullSecrets references to every namespace (or to every pod individually)
See https://github.com/containerd/containerd/blob/main/docs/hosts.md
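For step (a), here is a rough sketch of generating the hosts.toml content, following the format described in the containerd hosts.md document linked above. The function name and the mirror URL are placeholders. As far as I understand, containerd only consults the certs.d directory when `config_path` is set in its registry configuration, so check that on the node as well.

```python
def render_hosts_toml(
    mirror_url: str,
    upstream: str = "https://registry-1.docker.io",
) -> str:
    """Render hosts.toml content that tries `mirror_url` before `upstream`."""
    # Guard against values that would break the quoted TOML strings.
    if '"' in mirror_url or '"' in upstream:
        raise ValueError("URLs must not contain double quotes")
    return (
        f'server = "{upstream}"\n'
        "\n"
        f'[host."{mirror_url}"]\n'
        '  capabilities = ["pull", "resolve"]\n'
    )
```

The output of `render_hosts_toml("https://mirror.example.com")` would be written to /etc/containerd/certs.d/docker.io/hosts.toml. With only the "pull" and "resolve" capabilities listed, the mirror is used for reads, and `server` names the upstream that is used as the fallback.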
This also needs to be configurable for EKS Fargate.
There are several common reasons for wanting a Kubernetes cluster to use a cache or mirror for container images: performance (launching pods faster on new hosts while scaling), security (embargoing upstream changes for vetting), and accommodating upstream usage limits. It is advantageous if this can be configured transparently (e.g., via containerd settings) instead of needing to individually customise every pod spec in the entire infra code base to explicitly refer to the local registry mirror. That way the image cache solution could be re-engineered independently from application deployments, and applications could be deployed with less customisation into multiple environments, such as clusters in different accounts/regions with different private ECR pull-through cache addresses. It's currently more difficult to achieve such transparency in clusters mixing both EC2 nodes and Fargate. (A mutating admission controller is another option but is far more complex to set up. Kustomize may be an alternative in some cases but is less convenient when deploying Helm charts.)
Tell us about your request What do you want us to build?
Add built-in support or documentation for defining registry mirrors for managed node groups.
Which service(s) is this request for? This could be Fargate, ECS, EKS, ECR
EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem.
The default for most public Helm charts is to source images from Docker Hub (docker.io), and for experimentation we'd like to avoid putting too much process around that. However, pulling directly from Docker Hub without authentication triggers rate limits very quickly.
Are you currently working around this issue? How are you currently solving this problem?
The rate-limiting issues with Docker Hub are a huge problem for organizations and the current standard approaches are to:
Additional context Anything else we should know?
We'd really like to have integrated support between the upcoming ECR pull-through caching support (https://github.com/aws/containers-roadmap/issues/939) and managed node groups, so that nodes automatically authenticate with a cache that the cluster can use.