gardener / gardener-extension-registry-cache

Gardener extension controller which deploys pull-through caches for container registries.
Apache License 2.0

Configure registry cache in the shoot cluster (RegistryConfig and secret) #244

Open pbochynski opened 2 months ago

pbochynski commented 2 months ago

How to categorize this issue?

/area usability /kind enhancement

What would you like to be added: The possibility to use a RegistryConfig and Secret from the shoot cluster. It could work similarly to DNSProviders, for example:

kind: Shoot
...
spec:
  extensions:
    - type: registry-cache
      providerConfig:
        apiVersion: registry.extensions.gardener.cloud/v1alpha3
        kind: RegistryConfig
        registryConfigReplication:
          enabled: true
...

With replication enabled, the RegistryConfig custom resource definition would be created in the shoot cluster, and users could create such configs directly in the shoot.

Why is this needed: Kyma customers do not have access to the control plane (garden project), they create resources in shoot cluster. We could introduce some mechanism to replicate the registry cache configuration from the shoot cluster to the garden project but it doesn't make sense, as the registry cache is running in the shoot cluster. We could keep credentials and configuration only in the shoot cluster and simplify our solution.

ialidzhikov commented 1 month ago

Hi @pbochynski,

I think there are two feature requests in this issue:

  1. Enable deployment of a registry cache via custom resource in the Shoot cluster.

For context, I think you have so far been using globally enabled extensions such as shoot-dns-service and shoot-cert-service. There are two ways an extension can be enabled for a Shoot. The first is for Gardener administrators to enable the extension globally (in the ControllerRegistration). A globally enabled extension is enabled for all Shoots on the corresponding landscape. The second is to enable the extension via the Shoot spec. While extensions like shoot-dns-service and shoot-cert-service are suitable for global enablement (you don't have to do anything in the Shoot spec to enable them for the Shoot), there are extensions which are not suitable for it and are not globally enabled. If you build machinery on top of Gardener, it is not correct to expect that every extension is globally enabled and that nothing needs to be configured in the Shoot spec. Extensions like registry-cache and shoot-rsyslog-relp are not globally enabled.

The second point I wanted to raise is that the registry-cache extension configuration is coupled to the Shoot cluster and makes sense there. For the shoot-dns-service and shoot-cert-service extensions, you don't specify in the Shoot spec what kind of DNS records or certificates you want. That would not make much sense, as a DNS record or certificate is usually coupled to a Service/Ingress in the cluster, and not to the cluster itself. That's why the authors of these extensions chose to have CRDs like DNSEntry and Certificate available in the Shoot cluster. You usually don't know on Shoot creation what kind of DNSEntries and Certificates you need. The registry-cache configuration is in the Shoot spec because it is configuration related to the Shoot cluster (whether you want to cache images, which upstreams you want to cache, etc.). It is also coupled with the Shoot lifecycle. Enabling a registry-cache does not consist only of deploying a StatefulSet and other K8s resources. The extension also configures containerd on the cluster Nodes so that containerd uses the deployed in-cluster registry cache.
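To illustrate what "configuration in the Shoot spec" looks like, here is a minimal sketch of a Shoot that enables the extension and lists the upstreams to cache. The Shoot name, namespace, upstreams, and volume size are illustrative assumptions, not taken from this issue; check the field names against the extension's documentation for the API version you use.

```yaml
apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
  name: my-shoot          # hypothetical name
  namespace: garden-dev   # hypothetical project namespace
spec:
  extensions:
    - type: registry-cache
      providerConfig:
        apiVersion: registry.extensions.gardener.cloud/v1alpha3
        kind: RegistryConfig
        caches:
          # One entry per upstream registry that should be cached.
          - upstream: docker.io
            volume:
              size: 100Gi   # illustrative size
          - upstream: ghcr.io
```

Because the list of upstreams lives in the Shoot spec, adding or removing a cache is a change to the Shoot itself, which is what ties the configuration to the Shoot lifecycle.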

tl;dr: I think this comes from a wrong expectation about how extensions are usually deployed and consumed. Additionally, a consumption model based on custom resources does not fit the conceptual model of the registry-cache extension, which is coupled to the Shoot cluster lifecycle (and not to a concept like Service/Ingress).

  2. Allow Secrets for the private registries to be specified in the Shoot cluster, not in the Garden cluster.

You might also be wondering why you need to specify such credentials for the registry-cache when you already provide these credentials in the form of image pull Secrets.

The short answer: This is how it is implemented in the upstream. See https://github.com/distribution/distribution/issues/4281.

The long answer: The upstream implementation of the pull-through cache that we use (the Distribution project) supports configuring only a single set of credentials per upstream. The pull-through cache does not respect the authentication information provided by containerd. Instead, it requires a single set of credentials to be configured for it, and it uses these credentials when it makes requests to the upstream. If the pull-through cache forwarded to the upstream the authentication info it receives in the request, it wouldn't be necessary to configure these additional credentials in the Garden cluster.
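For reference, this is roughly what the single-credential limitation looks like in the Distribution registry's own configuration file when it runs as a pull-through cache. This is a sketch with placeholder values; how the extension actually passes these settings to the registry is not shown in this issue.

```yaml
version: 0.1
proxy:
  # The one upstream this cache instance mirrors.
  remoteurl: https://registry-1.docker.io
  # Exactly one username/password pair can be set per upstream.
  # Credentials sent by the client (containerd) are not forwarded.
  username: my-upstream-user       # placeholder
  password: my-upstream-password   # placeholder
```

Every pull served by the cache reaches the upstream with this one identity, regardless of which image pull Secret the Pod used.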

We have had this request raised in the past: "Why do I have to specify the image pull Secret in the Garden cluster again when I already have it specified in the Shoot cluster?". Technically, it should be possible to specify this Secret in the Shoot cluster as well. However, it is still not possible to avoid having 2 Secrets - 1 image pull Secret for the Pod (to make sure that containerd can pull the image from the upstream in case the cache is not available) and 1 Secret for the registry-cache (to make sure the cache can pull the images from the upstream). I don't know if it improves the user experience much if we require the Secret to be specified in the Shoot cluster instead of in the Garden cluster.
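A rough sketch of the two Secrets described above, assuming a private upstream at the hypothetical host my-registry.example.com. All names, namespaces, and values are placeholders, and the exact shape the extension expects for its Secret (for example the username and password keys, or immutability) is an assumption that should be verified against the extension's documentation.

```yaml
# 1. Image pull Secret in the Shoot cluster, referenced by Pods via imagePullSecrets.
#    containerd uses it when it has to pull directly from the upstream.
apiVersion: v1
kind: Secret
metadata:
  name: my-registry-pull-secret   # hypothetical
  namespace: default
type: kubernetes.io/dockerconfigjson
stringData:
  .dockerconfigjson: '{"auths":{"my-registry.example.com":{"username":"user","password":"pass"}}}'
---
# 2. Secret for the registry cache, today created in the Garden cluster
#    (project namespace) and referenced from the Shoot spec.
apiVersion: v1
kind: Secret
metadata:
  name: my-registry-cache-credentials   # hypothetical
  namespace: garden-dev                 # hypothetical project namespace
type: Opaque
stringData:
  username: user   # assumed key name
  password: pass   # assumed key name
```

Moving the second Secret into the Shoot cluster would change where it is stored, but both Secrets would still have to exist.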