argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

Allow multiple clusters pointing to the same URL #15027

Open alexymantha opened 11 months ago

alexymantha commented 11 months ago

See original discussion highlighting the issue below

Summary

It is currently impossible to define multiple clusters pointing to the same URL, even if they have different names.

Motivation

Similar to what is described in the discussion below, I was trying to set up multiple clusters pointing to the same URL but managing different namespaces. This did not work because internally ArgoCD refers to a cluster by its URL, so ArgoCD picked one of the clusters, put it into its cache, and used it for subsequent operations even when the resources targeted clusters with different names.

Example: 2 clusters are configured:

* dev, configured to manage the dev namespace, with URL https://kubernetes.default.svc
* staging, configured to manage the staging namespace, with URL https://kubernetes.default.svc

And two applications, one targeting dev:

destination:
  namespace: dev
  name: dev

And one targeting staging:

destination:
  namespace: staging
  name: staging

When the first reconciliation of dev happens, ArgoCD chooses one cluster (seemingly at random) and puts it into the cluster cache. If it chooses staging, the operation fails with `Failed to load live state: Namespace "dev" for AppProject "x" is not managed`, since the staging cluster does not manage the dev namespace.

Proposal

Instead of using only the server URL to refer to a cluster, ArgoCD should use the pair of name and server URL. This would make it possible to differentiate between clusters that use the same server URL.
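For illustration, a sketch of what the proposal would allow (the Secret names here are hypothetical): two declarative cluster Secrets sharing the same server URL, distinguished by their name field and managing different namespaces. Today ArgoCD keys its cluster cache on server alone, so one of these effectively shadows the other.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cluster-dev
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: dev
  server: https://kubernetes.default.svc
  namespaces: dev        # manage only the dev namespace
---
apiVersion: v1
kind: Secret
metadata:
  name: cluster-staging
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: staging
  server: https://kubernetes.default.svc   # same URL as cluster-dev
  namespaces: staging    # manage only the staging namespace
```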

Discussed in https://github.com/argoproj/argo-cd/discussions/9388

Originally posted by **mFranz82** May 12, 2022

I am working in a [Rancher](https://rancher.com) environment where a DEV team belongs to a [Rancher project](https://rancher.com/docs/rancher/v2.5/en/cluster-admin/projects-and-namespaces/) with corresponding rights within the cluster. As ArgoCD does not provide the option to specify service accounts at the project or application level, we thought we could wrap a cluster around each project, providing a cluster-scoped service account. Something like a virtual cluster per team pointing to the same k8s API:

**dev-cluster-team-a** (api url) > project > application > Sync actions using SA **argcd-manager-team-a**
**dev-cluster-team-b** (api url) > project > application > Sync actions using SA **argcd-manager-team-b**

When starting with the implementation we quickly realised that the **API url** is used by ArgoCD to identify the cluster, which of course won't work in our setup. We simply cannot create an ArgoCD cluster pointing to the same API twice. Do you think this is intentional? Are there any conceptual considerations which I missed?

Update: We found a simple solution: simply create a Service (ExternalName) per cluster pointing to the same API.
blakepettersson commented 11 months ago

#14255 should cover your use case (i.e. impersonation will be the path which Argo will use in the near future).

blakepettersson commented 11 months ago

Think this is a duplicate of #10897

alexymantha commented 11 months ago

Correct me if I'm wrong, but it looks like #14255 would cover the service account use case; it still does not differentiate between clusters with the same URL, and we have use cases for that as well. We would like to be able to have multiple "virtual" clusters with different properties pointing to the same real cluster, as mentioned by @agaudreault-jive in the linked discussion.

I think the main issue is that the current setup is somewhat unintuitive: it is a fair assumption that if clusters have a name set, they should be differentiated by their name and not by their URL. ArgoCD also allows this configuration without any warning, and it breaks in unexpected ways. As described in the issue, operations end up being performed against a seemingly random cluster, and the cluster settings are completely broken since they, too, are keyed by URL only.

While the ideal solution IMO would be to support using the name, if that is not possible ArgoCD should at least prevent registering the same URL twice to avoid unexpected behaviours.

alexymantha commented 11 months ago

It is the same subject as https://github.com/argoproj/argo-cd/pull/10897, but I couldn't find an issue highlighting this problem, so I figured the discussion would be easier to track as an issue rather than under a PR with a specific implementation.

agaudreault commented 11 months ago

I was reading https://github.com/argoproj/argo-cd/pull/14255 and while impersonation would be a great feature, I also think it is slightly unrelated. I see this issue as a change to how the control plane credential is used to maintain a cache, while impersonation is about how an app is synced.

I don't think the server URL can be used as a primary key anymore, now that we can provide namespaces and clusterResources on a cluster. I think name+server should be used instead, but this change probably extends to the gitops engine.

If impersonation becomes available, this issue becomes somewhat unnecessary for permissions. However, there must be validation to reject multiple clusters with the same URL, and documentation can be written for the workaround of creating a different CNAME, or a k8s Service for the local cluster.

A few options I am thinking of:

Option 1: Add validation without impersonation

Option 2: Add name+server support

Option 3: Add validation with impersonation

csantanapr commented 9 months ago

I would like to have this implemented. I think it would be useful as a way to help shard a large number of apps and to be able to use multiple ArgoCD controllers for the same cluster.

reegnz commented 7 months ago

I think I'd prefer to still be able to impersonate at the cluster config level. A config resembling kubeconfig would go a long way in matching user expectations: referring to the same cluster using multiple different contexts, defining a namespace per context, and defining a user per context (e.g. exec, impersonate, whatever kubeconfig supports).

IMHO reinventing a config format that the kubernetes client already provides with kubeconfig is doing existing k8s users a disservice, causing unnecessary friction between the various tools.

Please start standardizing on the kubeconfig format for cluster connectivity.
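For reference, this is the kind of kubeconfig layout being described above (a sketch with hypothetical names): a single cluster entry referenced by several contexts, each with its own namespace and user.

```yaml
apiVersion: v1
kind: Config
clusters:
- name: dev-cluster
  cluster:
    server: https://kubernetes.default.svc
contexts:
- name: team-a                 # same cluster, team-a's namespace and credentials
  context:
    cluster: dev-cluster
    namespace: team-a
    user: team-a-sa
- name: team-b                 # same cluster, team-b's namespace and credentials
  context:
    cluster: dev-cluster
    namespace: team-b
    user: team-b-sa
users:
- name: team-a-sa
  user:
    token: <team-a service account token>
- name: team-b-sa
  user:
    token: <team-b service account token>
current-context: team-a
```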

awx-fuyuanchu commented 4 months ago

How about creating several CNAME records for the API Server and using different hostnames for the Applications in different namespaces? We plan to use this as a workaround to shard the applications on one cluster.

alexymantha commented 4 months ago

> How about creating several CNAME records for the API Server and using different hostnames for the Applications in different namespaces? We plan to use this as a workaround to shard the applications on one cluster.

That's how we currently solve it, but having to create a CNAME for every managed namespace in a cluster is a hassle; it would be a better experience if we could do this without that kind of workaround.

michel-numan commented 1 month ago

> Update: We found a simple solution: simply create a Service (ExternalName) per cluster pointing to the same API.

@alexymantha Can you share the ExternalName Service you created? We've done this and still couldn't get it working.

reegnz commented 1 month ago

I tried messing around with ExternalName, used this:

---
apiVersion: v1
kind: Service
metadata:
  name: external-dns-in-cluster
  namespace: argocd
spec:
  type: ExternalName
  externalName: kubernetes.default.svc.cluster.local
  ports:
  - port: 443
---
apiVersion: v1
kind: Secret
metadata:
  name: cluster-external-dns
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: external-dns-in-cluster
  server: https://external-dns-in-cluster.argocd.svc

But the app fails certificate validation because the certificate's Subject Alternative Names (SANs) don't contain my DNS CNAME alias. So a DNS CNAME is out of the question unless you can control the cert SANs.

My error message (I'm running on AWS EKS):

Get "https://external-dns-in-cluster.argocd.svc/version?timeout=32s": tls: failed to verify certificate: x509: certificate is valid for <redacted>.us-west-2.eks.amazonaws.com, ip-<redacted>.us-west-2.compute.internal, kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster.local, not external-dns-in-cluster.argocd.svc

I doubt the solution is a simple ExternalName (or any other DNS CNAME), as the above SAN issue would still be present.
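On clusters where you control the API server certificate (self-managed, not a managed offering like EKS), the alias could in principle be added to the certificate's SANs. A minimal sketch, assuming kubeadm and the hypothetical alias used above:

```yaml
# kubeadm ClusterConfiguration: add the ExternalName alias to the
# API server certificate SANs (the apiserver certificate must be regenerated).
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  certSANs:
  - external-dns-in-cluster.argocd.svc
```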

BTW a different workaround mentioned in #2288 involving a URL query parameter worked for me:

https://kubernetes.default.svc?__scope=external-dns-in-cluster
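
For clarity, a sketch of how that looks in a cluster Secret (reusing the hypothetical names from the ExternalName attempt above). The assumption here is that the query parameter only makes the server string unique for ArgoCD's cache, while the actual host stays kubernetes.default.svc, so TLS verification is unaffected.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cluster-external-dns
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: external-dns-in-cluster
  # Host is still the in-cluster API server; the __scope query parameter
  # only disambiguates this entry from other clusters using the same host.
  server: https://kubernetes.default.svc?__scope=external-dns-in-cluster
```
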
andrewjeffree commented 1 month ago

Hi @reegnz, did you also encounter the `Failed to load live state: Namespace "y" for AppProject "x" is not managed` issue? I'm doing the same thing as you on EKS, using a URL query parameter, but I'm getting that error and am going crazy trying to figure out what I'm missing, as everything else looks fine 😰

reegnz commented 1 month ago

Nope, I gave up once I ran into the SAN issue, as there's no way to work around that one.