bottlerocket-os / bottlerocket

An operating system designed for hosting containers
https://bottlerocket.dev

Bottlerocket ignoring settings.container-registry.mirrors #3631

Open maxtacu opened 11 months ago

maxtacu commented 11 months ago

It seems that Bottlerocket is ignoring settings.container-registry.mirrors. We have configured a pull-through cache in AWS ECR and set settings.container-registry.mirrors for quay.io, docker.io, ghcr.io and registry.k8s.io to use it, but the Bottlerocket instance still pulls directly from upstream. I can clearly see that the mirrors are set when I run apiclient -u /settings on the instance, yet it does not even try to pull from the ECR pull-through cache. Our current settings look something like this:

  [[settings.container-registry.mirrors]]
  "registry" = "registry.k8s.io"
  "endpoint" = ["<account_id>.dkr.ecr.us-east-1.amazonaws.com/k8s"]
  [[settings.container-registry.mirrors]]
  "registry" = "quay.io"
  "endpoint" = ["<account_id>.dkr.ecr.us-east-1.amazonaws.com/quay"]
  [[settings.container-registry.mirrors]]
  "registry" = "docker.io"
  "endpoint" = ["<account_id>.dkr.ecr.us-east-1.amazonaws.com/dockerhub"]
  [[settings.container-registry.mirrors]]
  "registry" = "ghcr.io"
  "endpoint" = ["<account_id>.dkr.ecr.us-east-1.amazonaws.com/github"] 

Image I'm using: bottlerocket-aws-k8s-1.27-x86_64-v1.16.1

What I expected to happen: Pull through the ECR cache for images in the quay.io, docker.io, ghcr.io and registry.k8s.io registries.

What actually happened: Pulling directly from upstream. Not even trying to pull from the ECR cache

How to reproduce the problem:

ecpullen commented 11 months ago

Thanks for reaching out. We are looking into this issue.

etungsten commented 11 months ago

Hi, the configuration specified in settings.container-registry gets passed as is to the docker/containerd/kubelet configuration for registry mirrors and credentials. It's likely that you need to specify credentials to be able to talk to the private ECR repositories. To get the credentials, you can follow the instructions here: https://docs.aws.amazon.com/AmazonECR/latest/userguide/registry_auth.html and specify them via settings.container-registry.credentials.

Do note that there is a potential security concern for Docker-based variants like aws-ecs-* when specifying auth for your mirror, outlined here: https://github.com/moby/moby/issues/30880#issuecomment-798807332. Your credentials might end up being sent to the destination registry if your mirror is not responding, because the image resolver tries all possible endpoints.
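
As a rough illustration of the settings.container-registry.credentials suggestion above, the Python sketch below renders a user-data fragment for one ECR registry. It assumes the password is the output of `aws ecr get-login-password`, that the username is `AWS`, and that `registry`, `username` and `password` are the keys the credentials setting expects. The helper name `render_ecr_credentials` is hypothetical. ECR authorization tokens expire after 12 hours, so a static value like this would need to be refreshed.

```python
import json


def render_ecr_credentials(registry_host: str, password: str) -> str:
    """Render a TOML fragment for one registry credential entry.

    Assumption: the key names below match what
    settings.container-registry.credentials accepts.
    """
    return "\n".join(
        [
            "[[settings.container-registry.credentials]]",
            # json.dumps yields a double-quoted string that is also
            # a valid TOML basic string for plain ASCII input.
            f"registry = {json.dumps(registry_host)}",
            'username = "AWS"',  # fixed username used for ECR token auth
            f"password = {json.dumps(password)}",
        ]
    )


if __name__ == "__main__":
    # Placeholder values; the real token comes from `aws ecr get-login-password`.
    print(render_ecr_credentials(
        "123456789012.dkr.ecr.us-east-1.amazonaws.com", "EXAMPLE_TOKEN"))
```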

maxtacu commented 11 months ago

@etungsten isn't it enough to have an IAM role for the instance to access private ECR? If I tell the instance to pull an image from ECR directly, it pulls it, but 'mirroring' requires credentials? Why so?

etungsten commented 11 months ago

Hi @maxtacu, in the case of using a private ECR image directly with K8s pods, kubelet can get the ECR credentials from the AWS cloud provider (specifically the ECR credentials helper). However, that code path does not get triggered if the destination image URL is not ECR, and you set ECR as a private mirror instead. kubelet would only see that you're trying to pull from quay.io/docker.io/ghcr.io. In this case, after kubelet tells cri-containerd to pull the image, cri-containerd will need auth information for talking to the registry endpoints that you set as mirror.
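
To make the point above concrete, here is a simplified Python model of how a credential provider could be selected by image host. It is not kubelet's actual code. The pattern `*.dkr.ecr.*.amazonaws.com` and the function names are assumptions for illustration, and matching is done per dot-separated host segment, ignoring port and path.

```python
from fnmatch import fnmatchcase


def host_of(image: str) -> str:
    """Return the registry host of an image reference (docker.io if implied)."""
    first, sep, _ = image.partition("/")
    # Docker convention: the first component is a host only if it
    # contains "." or ":" or is exactly "localhost".
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first.split(":")[0]  # drop any port
    return "docker.io"


def provider_matches(image: str, pattern: str) -> bool:
    """Check whether the image host matches a glob pattern, segment by segment."""
    host_parts = host_of(image).split(".")
    pattern_parts = pattern.split(".")
    # Each glob may only match a single host segment.
    return len(host_parts) == len(pattern_parts) and all(
        fnmatchcase(h, p) for h, p in zip(host_parts, pattern_parts)
    )


ECR_PATTERN = "*.dkr.ecr.*.amazonaws.com"  # assumed provider pattern

if __name__ == "__main__":
    for image in (
        "123456789012.dkr.ecr.us-east-1.amazonaws.com/quay/org/app:1",
        "quay.io/org/app:1",
    ):
        print(image, provider_matches(image, ECR_PATTERN))
```

Under this model the ECR provider is consulted only when the pod's image URL itself points at ECR. An image referenced as quay.io/... never matches, even if a mirror later redirects the pull to ECR.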

For more details, you can check out this containerd issue: https://github.com/containerd/containerd/issues/6637

I think for your use case it might be easier to set up private ECR as a pull-through cache, as detailed here: https://docs.aws.amazon.com/AmazonECR/latest/userguide/pull-through-cache.html. You then wouldn't need to set private ECR as a registry mirror; instead you would set private ECR as the actual image URLs for your pods. kubelet would then be able to get the ECR credentials through the AWS cloud provider. Though I understand it might be troublesome to modify your K8s deployments to replace all of the image URIs you're trying to mirror/cache.
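
A minimal sketch of that image URL replacement in Python, assuming the repository prefixes from the mirror settings at the top of this issue (k8s, quay, dockerhub, github). The function name `rewrite_image` and the account ID in the usage line are hypothetical, and aliases such as index.docker.io are not handled.

```python
from typing import Dict

# Upstream registry -> ECR pull-through cache prefix (taken from the
# mirror settings in the original report).
PREFIXES: Dict[str, str] = {
    "registry.k8s.io": "k8s",
    "quay.io": "quay",
    "docker.io": "dockerhub",
    "ghcr.io": "github",
}


def rewrite_image(image: str, ecr_host: str,
                  prefixes: Dict[str, str] = PREFIXES) -> str:
    """Rewrite an upstream image reference to its pull-through cache URL."""
    first, sep, rest = image.partition("/")
    # The first component is a registry host only if it looks like one.
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        registry, path = "docker.io", image
    # Docker Hub official images live under the implicit "library/" namespace.
    if registry == "docker.io" and "/" not in path:
        path = "library/" + path
    prefix = prefixes.get(registry)
    if prefix is None:
        return image  # not a cached registry, leave untouched
    return f"{ecr_host}/{prefix}/{path}"


if __name__ == "__main__":
    host = "123456789012.dkr.ecr.us-east-1.amazonaws.com"  # placeholder account
    print(rewrite_image("quay.io/prometheus/node-exporter:v1.7.0", host))
```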

maxtacu commented 11 months ago

Weird, but it looks like it randomly started trying to pull through the ECR mirror today, and now we have the same issue as described here: https://github.com/bottlerocket-os/bottlerocket/issues/2427. I will test with credentials later. But it seems that the issue is with Bottlerocket (or containerd itself) not being able to use IAM roles for mirrors, only explicit credentials.

tuananh commented 7 months ago

has anyone found a workaround for this yet?

svyatoslavmo commented 7 months ago

Kyverno or another mutation webhook

maxtacu commented 7 months ago

@tuananh within Bottlerocket I didn't find any workaround, so eventually I created a mutating webhook. The repository also includes Kyverno policies in case you want to do it only via Kyverno.
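
For readers unfamiliar with the approach, the core of such a mutating webhook can be sketched in Python as below. This is not the code from the repository mentioned above. `image_patches` and the demo `rewrite` function are hypothetical, and a real webhook would return the resulting JSON Patch base64-encoded in its AdmissionReview response.

```python
from typing import Callable, Dict, List


def image_patches(pod: Dict, rewrite: Callable[[str], str]) -> List[Dict]:
    """Build JSON Patch operations for every container image that `rewrite` changes."""
    ops: List[Dict] = []
    spec = pod.get("spec", {})
    for field in ("initContainers", "containers"):
        for index, container in enumerate(spec.get(field, [])):
            old = container["image"]
            new = rewrite(old)
            if new != old:
                ops.append({
                    "op": "replace",
                    "path": f"/spec/{field}/{index}/image",
                    "value": new,
                })
    return ops


def demo_rewrite(image: str) -> str:
    """Toy rewrite: send quay.io images to a placeholder mirror host."""
    if image.startswith("quay.io/"):
        return "mirror.example/quay/" + image[len("quay.io/"):]
    return image


if __name__ == "__main__":
    pod = {"spec": {"containers": [
        {"name": "app", "image": "quay.io/org/app:1"},
        {"name": "sidecar", "image": "example.com/x:2"},
    ]}}
    print(image_patches(pod, demo_rewrite))
```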

tuananh commented 7 months ago

> @tuananh within Bottlerocket I didn't find any workaround, so eventually I created a mutating webhook. The repository also includes Kyverno policies in case you want to do it only via Kyverno.

edit: looks like this is how it works

api => kubelet => check the registry, see if we need to generate creds via CredentialProviderConfig

tuananh commented 2 weeks ago

friendly ping. is there any update to this?