argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

Unable to add 1.24.0 Kubernetes cluster #9422

Closed waltforme closed 2 years ago

waltforme commented 2 years ago

Describe the bug

When I tried to add a freshly created v1.24.0 Kubernetes cluster to Argo CD, I got a timeout (see the Logs for details). The cluster can't be added.

Then I created a fresh v1.23.6 cluster, and I could add it successfully.

I'm using kubeadm to create my Kubernetes clusters. The only difference between the two creations is a single parameter passed to kubeadm init: --kubernetes-version.

Version

argocd: v2.3.3+07ac038
  BuildDate: 2022-03-30T01:46:59Z
  GitCommit: 07ac038a8f97a93b401e824550f0505400a8c84e
  GitTreeState: clean
  GoVersion: go1.17.6
  Compiler: gc
  Platform: linux/amd64
argocd-server: v2.3.3+07ac03

Logs

INFO[0001] ServiceAccount "argocd-manager" already exists in namespace "kube-system" 
INFO[0001] ClusterRole "argocd-manager-role" updated    
INFO[0002] ClusterRoleBinding "argocd-manager-role-binding" updated 
FATA[0032] Failed to wait for service account secret: timed out waiting for the condition
danielhelfand commented 2 years ago

This may be related to the changes to ServiceAccount tokens in 1.24:

The error is coming from here in the code: https://github.com/argoproj/argo-cd/blob/8cd7d470e89212b085c03462c042925a1f52d3f2/util/clusterauth/clusterauth.go#L244
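
For context: in Kubernetes 1.24 the LegacyServiceAccountTokenNoAutoGeneration feature gate is enabled by default, so creating a ServiceAccount no longer auto-generates a token Secret. For illustration, this is roughly the shape of the Secret that 1.23 and earlier created automatically and that the add-cluster flow waits for (the name suffix is random; `argocd-manager-token-zsp69` is taken from the 1.23.6 output in this thread):

```yaml
# Shape of the token Secret that Kubernetes <= 1.23 auto-created for a
# ServiceAccount; 1.24+ no longer creates it, so the wait loop times out.
apiVersion: v1
kind: Secret
metadata:
  name: argocd-manager-token-zsp69   # auto-generated name; suffix is random
  namespace: kube-system
  annotations:
    kubernetes.io/service-account.name: argocd-manager
type: kubernetes.io/service-account-token
```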

rishabh625 commented 2 years ago

So to support 1.24, @danielhelfand: do you think Argo CD should create a secret and then annotate it with the argocd-manager service account, like here?

I guess not.

waltforme commented 2 years ago

@danielhelfand What I tried on my two clusters supports your comment:

On the 1.24.0 cluster:

root@ip-192-168-1-38:~# kc version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.24.0
Kustomize Version: v4.5.4
Server Version: v1.24.0
root@ip-192-168-1-38:~# kc get sa -n kube-system | grep argo
argocd-manager                       0         5d1h
root@ip-192-168-1-38:~# kc get secret -n kube-system | grep argo
No resources found in kube-system namespace.

On the 1.23.6 cluster:

root@ip-172-31-31-208:~# kc version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.24.0
Kustomize Version: v4.5.4
Server Version: v1.23.6
root@ip-172-31-31-208:~# kc get sa -n kube-system | grep argo
argocd-manager                       1         47h
root@ip-172-31-31-208:~# kc get secret -n kube-system | grep argo
argocd-manager-token-zsp69                       kubernetes.io/service-account-token   3      47h
root@ip-172-31-31-208:~#
danielhelfand commented 2 years ago

So to support 1.24, @danielhelfand: do you think Argo CD should create a secret and then annotate it with the argocd-manager service account, like here?

I guess not.

I think the goal should be to support the TokenRequest API, but that will be a bigger change.

The short-term solution may be to have the token controller populate a secret created when adding clusters. I haven't fully weighed the pros and cons of this yet.

waltforme commented 2 years ago

Just want to share my (hacky) work around on this.

  1. Create a service account token Secret in the kube-system namespace, making sure that the annotation refers to the argocd-manager service account:

    apiVersion: v1
    kind: Secret
    metadata:
      name: argocd-manager-token
      namespace: kube-system
      annotations:
        kubernetes.io/service-account.name: argocd-manager
    type: kubernetes.io/service-account-token
  2. Yes, Kubernetes 1.24 populates data into the newly created secret;

  3. But the secret is not associated with the service account; the service account still shows 0 secrets:

    root@ip-192-168-1-38:~# kubectl get sa -n kube-system
    NAME                                 SECRETS   AGE
    argocd-manager                       0         5d4h
  4. I did kubectl edit sa -n kube-system argocd-manager to manually add the secret to the service account:

    secrets:
    - name: argocd-manager-token
  5. Now the service account has 1 secret;

  6. And I can add the 1.24.0 cluster now.

    root@ip-172-31-55-65:~# argocd cluster add --kubeconfig ./config_kyst_us-west-1 kyst-backend-us-west-1
    WARNING: This will create a service account `argocd-manager` on the cluster referenced by context `kyst-backend-us-west-1` with full cluster level admin privileges. Do you want to continue [y/N]? y
    INFO[0002] ServiceAccount "argocd-manager" already exists in namespace "kube-system" 
    INFO[0002] ClusterRole "argocd-manager-role" updated    
    INFO[0002] ClusterRoleBinding "argocd-manager-role-binding" updated 
    FATA[0032] Failed to wait for service account secret: timed out waiting for the condition 
    root@ip-172-31-55-65:~# argocd cluster add --kubeconfig ./config_kyst_us-west-1 kyst-backend-us-west-1
    WARNING: This will create a service account `argocd-manager` on the cluster referenced by context `kyst-backend-us-west-1` with full cluster level admin privileges. Do you want to continue [y/N]? y
    INFO[0001] ServiceAccount "argocd-manager" already exists in namespace "kube-system" 
    INFO[0001] ClusterRole "argocd-manager-role" updated    
    INFO[0001] ClusterRoleBinding "argocd-manager-role-binding" updated 
    Cluster 'https://<hide-my-ip-here>:6443' added

With that, to implement the short-term solution, we may need not only to create a service account token Secret, but also to add that secret to the argocd-manager service account.
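
For convenience, the two manual steps (create the token Secret, then link it to the service account) can be sketched as a short script; this is just my workaround condensed, and it assumes kubectl is pointed at the 1.24 target cluster and uses the same names as above:

```shell
# Sketch of the workaround; assumes kubectl targets the 1.24 cluster.
# 1. Create a service-account-token Secret annotated for argocd-manager.
kubectl -n kube-system apply -f - <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: argocd-manager-token
  namespace: kube-system
  annotations:
    kubernetes.io/service-account.name: argocd-manager
type: kubernetes.io/service-account-token
EOF

# 2. Add the Secret to the ServiceAccount's secrets list so the
#    add-cluster flow can find it.
kubectl -n kube-system patch sa argocd-manager \
  -p '{"secrets":[{"name":"argocd-manager-token"}]}'
```

After this, re-running `argocd cluster add` should succeed.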

danielhelfand commented 2 years ago

In reviewing this a bit further, I am wondering if the clusterauth package's GetBearerToken function can be altered to always create the secret, bind it to the serviceaccount, and wait for the token controller to appropriately populate the secret. The downside of this is creating an additional token in previous k8s versions.

danielhelfand commented 2 years ago

It turns out the TokenRequest API is pretty straightforward to use. Here's a hacky WIP commit to show what it looks like. I have tried both approaches (creating a secret and using the TokenRequest API), and the TokenRequest API seems to resolve the issue. I still need to work out the best approach for maintaining backwards compatibility with the Secret approach for older versions of Kubernetes.
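
For anyone who wants to poke at the TokenRequest API by hand: since kubectl 1.24 it is exposed directly via `kubectl create token`, e.g. (assuming access to the target cluster):

```shell
# Request a short-lived, bound token for the argocd-manager service
# account via the TokenRequest API (requires kubectl >= 1.24).
kubectl create token argocd-manager -n kube-system --duration=24h
```

Note that tokens issued this way expire and are not stored in any Secret, which is why backwards compatibility with the Secret-based flow still needs thought.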

danielhelfand commented 2 years ago

Adding in a research doc I have been throwing together.

rishabh625 commented 2 years ago

Also, I was wondering: other service accounts, e.g. argocd-application-controller, will not have a secret mounted either, so how is the Argo CD controller able to make API calls to the kube-apiserver without a token and ca.crt?

abdennour commented 1 year ago

I can confirm. The issue appeared in OCP 4.11, which is based on Kubernetes 1.24. I would say this is a bug in Kubernetes, because I can see the same behavior is broken with Prometheus in OpenShift, i.e. oc sa get-token prometheus-k8s -n openshift-monitoring did not work either.

So this means that retrieving the token of an SA has changed since Kubernetes 1.24.

resolution

ns=kube-system
sa_token=$(kubectl -n $ns get secret | grep argocd-manager-token | awk '{print $1}')
kubectl -n $ns patch sa argocd-manager -p '{"secrets": [{"name": "'"${sa_token}"'"}]}'
# then run "argocd cluster add" command again
vedarths commented 1 year ago

Hello! Could someone let me know which version of Argo CD includes the fix for this issue?

I am presuming the only workaround for now is the one @waltforme suggested above?

Thanks!

raj13aug commented 1 year ago

We are currently using Argo CD 2.3.1 and facing the same issue.

Which version of Argo CD supports adding and registering a 1.24 Kubernetes cluster?

Thanks

doyelese commented 1 year ago

+1 on this, has anyone fixed this?

crenshaw-dev commented 1 year ago

The fix was released in 2.3.7 and 2.4.0 onward.

piotrmarczydlo commented 1 year ago

Can confirm: the latest (2.6.4) also works with 1.25.

jyothish516 commented 1 year ago

The fix was released in 2.3.7 and 2.4.0 onward.

I am using Argo CD 2.6.1, but the issue is still there.

ravindraprasad85 commented 1 year ago

I am using 2.7.1 but see the same issue. Is there any workaround for this?

argocd: v2.7.1+5e54351.dirty
  BuildDate: 2023-05-02T19:02:07Z
  GitCommit: 5e543518dbdb5384fa61c938ce3e045b4c5be325
  GitTreeState: dirty
  GoVersion: go1.20.3
  Compiler: gc
  Platform: darwin/amd64
argocd-server: v2.7.1+5e54351.dirty
  BuildDate: 2023-05-02T16:35:40Z
  GitCommit: 5e543518dbdb5384fa61c938ce3e045b4c5be325
  GitTreeState: dirty
  GoVersion: go1.19.6
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v5.0.1 2023-03-14T01:32:48Z
  Helm Version: v3.11.2+g912ebc1
  Kubectl Version: v0.24.2
  Jsonnet Version: v0.19.1

abdennour commented 1 year ago

I am using 2.7.1 but same issue there, Is there any workaround for this ?

@ravindraprasad85 workaround already shared above

doyelese commented 1 year ago

Make sure your client is updated as well, not just your Argo CD server.


Rishabh04-02 commented 1 year ago

The fix was released in 2.3.7 and 2.4.0 onward.

Hey @crenshaw-dev we're facing this issue even in v2.7.5+a2430af.dirty.

Is the fix not stable?

josephvano commented 11 months ago

I experienced this issue on Argo CD v2.7.2

The workaround was as described above in two separate posts.

For completeness here is my solution.

My context was local testing with multiple clusters.

Steps to solve

Create a kind cluster with an apiServerAddress that is accessible to your Argo CD instance (not localhost), most likely your local IP, e.g. "192.x.x.x:8443".

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: dev-cluster
networking:
  # WARNING: It is _strongly_ recommended that you keep this the default
  # (127.0.0.1) for security reasons. However it is possible to change this.
  apiServerAddress: "<your-local-ip>"
  # By default the API server listens on a random open port.
  # You may choose a specific port but probably don't need to in most cases.
  # Using a random port makes it easier to spin up multiple clusters.
  apiServerPort: 8443

kind docs ref

kind create cluster --config config.yaml

Run the argocd command to add a cluster

argocd cluster add kind-dev-cluster

It will fail with a timeout. That's when you have to switch to the kind dev-cluster context, create the additional secret for the service account, and associate the argocd-manager service account with the new secret.

In your dev-cluster context

kubectl config use-context kind-dev-cluster

Create service account secret

apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  annotations:
    kubernetes.io/service-account.name: argocd-manager
  name: argocd-manager-token
  namespace: kube-system

Add secret to service account

apiVersion: v1
kind: ServiceAccount
metadata:
  creationTimestamp: "2023-10-10T15:02:41Z"
  name: argocd-manager
  namespace: kube-system
  resourceVersion: "1526"
  uid: 89721095-63b2-42d0-8dd9-29c2f9fe0379
secrets:
- name: argocd-manager-token
qixiaobo commented 9 months ago

Same here! Try workaround

qixiaobo commented 9 months ago

https://itnext.io/big-change-in-k8s-1-24-about-serviceaccounts-and-their-secrets-4b909a4af4e0