argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

repo-server github connectivity timeout error - unable to setup repositories #20062

Open bredamatt opened 4 weeks ago

bredamatt commented 4 weeks ago

Describe the bug

I have deployed ArgoCD in an EKS cluster created using private subnets and Cilium, and created a Secret for my private GitHub repository as per the documentation.

I check the logs on my repo-server, and I see the following:

repo-server time="2024-09-23T08:58:04Z" level=error msg="finished unary call with code Unknown" error="error testing repository connectivity: Get \"https://www.github.com/<MY_ORG>/<MY_REPO>/info/refs?service=git-upload-pack\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" grpc.code=Unknown grpc.method=TestRepository grpc.service=repository.RepoServerService grpc.start_time="2024-09-23T08:57:49Z" grpc.time_ms=15000.817 span.kind=server system=grpc
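
The failing call in the log above is a plain HTTPS GET against the repository's `info/refs` endpoint that gives up after roughly 15 seconds. A minimal sketch for repeating that request by hand, assuming Python is available in a pod scheduled the same way as the repo-server (same namespace, node, and network policies); the helper names `info_refs_url` and `probe` are hypothetical, and the 15-second default is only inferred from `grpc.time_ms` in the log:

```python
import urllib.error
import urllib.request


def info_refs_url(base: str, org: str, repo: str) -> str:
    """Build the smart-HTTP discovery URL that appears in the error message."""
    return f"{base.rstrip('/')}/{org}/{repo}/info/refs?service=git-upload-pack"


def probe(url: str, timeout_s: float = 15.0) -> str:
    """Fetch url and return a short description of the outcome."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # Any HTTP status (even 401 or 404 for a private repo) shows that
        # the network path works and only credentials are missing.
        return f"HTTP {exc.code}"
    except OSError as exc:
        # URLError and socket timeouts are both OSError subclasses.
        return f"network failure: {exc}"
```

Calling `probe(info_refs_url("https://www.github.com", "<MY_ORG>", "<MY_REPO>"))` and then the same with `https://github.com` would show whether the timeout is specific to the `www.github.com` host used in the logged URL or affects both.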

I checked the ArgoCD UI, and the connection to the repo is shown as failed.

I checked outbound access from a test Pod in my EKS cluster, and I am able to ping and curl sites like github.com, google.com, etc., so outbound connectivity from the cluster appears to work. Why can't the repo-server connect, then?

To Reproduce

I deploy ArgoCD using Terraform's Helm provider with the argo-cd Helm chart (https://github.com/argoproj/argo-helm/blob/main/charts/argo-cd/values.yaml) at version 7.5.2, and I set the following values:

  set {
    name = "redis-ha.enabled"
    value = true
  }
  set {
    name = "controller.replicas"
    value = 2
  }
  set {
    name = "server.autoscaling.enabled"
    value = true
  }
  set {
    name = "server.autoscaling.minReplicas"
    value = 2
  }
  set {
    name = "repoServer.autoscaling.enabled"
    value = true
  }
  set {
    name = "repoServer.autoscaling.minReplicas"
    value = 2
  }

I can see in the repo-server logs that the health checks are fine. Also, to be certain, even though I had already confirmed outbound access before deploying ArgoCD, I explicitly allowed all of GitHub's public IP addresses in the worker node security group. This made no difference to connectivity from my test pod, and it did not fix the repo-server's connectivity either.
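
The repo-server log lines in this issue are `key=value` pairs with double-quoted values, so it is possible to check mechanically that the `Check` health calls finish with `OK` while only other calls fail. A rough sketch, assuming that line shape; `parse_logfmt` and `failed_calls` are hypothetical helper names, not part of ArgoCD:

```python
import shlex


def parse_logfmt(line: str) -> dict[str, str]:
    """Split a key=value log line into a dict, honouring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip bare words such as the leading container name
            fields[key] = value
    return fields


def failed_calls(lines):
    """Yield (grpc.method, grpc.code, error) for each gRPC call that was not OK."""
    for line in lines:
        fields = parse_logfmt(line)
        if "grpc.code" in fields and fields["grpc.code"] != "OK":
            yield fields.get("grpc.method"), fields["grpc.code"], fields.get("error", "")
```

Feeding the pod's log lines through `failed_calls` would list only the non-OK calls, which in the excerpt from this issue is the `TestRepository` call.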

Expected behavior

Connectivity to GitHub should not time out.

Screenshots

Version

7.5.2 Helm chart.

Logs

Repo-server start-up logs:

Stream closed EOF for argo-cd/argo-cd-argocd-repo-server-bfd6d8557-8wkhw (copyutil)
cmp-helmfile time="2024-09-23T08:53:42Z" level=info msg="ArgoCD ConfigManagementPlugin Server is starting" built="2024-08-27T11:57:48Z" commit=6b9cd828c6e9807398869ad5ac44efd2c28422d6 version=v2.12.3+6b9cd82
cmp-helmfile time="2024-09-23T08:53:42Z" level=info msg="argocd-cmp-server v2.12.3+6b9cd82 serving on /home/argocd/cmp-server/plugins/cmp-helmfile-v1.0.sock"
cmp-helmfile time="2024-09-23T09:03:42Z" level=info msg="Alloc=8400 TotalAlloc=20589 Sys=26453 NumGC=7 Goroutines=10"
cmp-helmfile time="2024-09-23T09:13:42Z" level=info msg="Alloc=8400 TotalAlloc=20591 Sys=26453 NumGC=12 Goroutines=10"
copyutil /bin/cp: warning: behavior of -n is non-portable and may change in future; use --update=none instead
Stream closed EOF for argo-cd/argo-cd-argocd-repo-server-bfd6d8557-8wkhw (cmp-helmfile)
repo-server time="2024-09-23T08:53:42Z" level=info msg="ArgoCD Repository Server is starting" built="2024-08-27T11:57:48Z" commit=6b9cd828c6e9807398869ad5ac44efd2c28422d6 port=8081 version=v2.12.3+6b9cd82
repo-server time="2024-09-23T08:53:42Z" level=info msg="Generating self-signed TLS certificate for this session"
repo-server time="2024-09-23T08:53:42Z" level=info msg="Initializing GnuPG keyring at /app/config/gpg/keys"
repo-server time="2024-09-23T08:53:42Z" level=info msg="gpg --no-permission-warning --logger-fd 1 --batch --gen-key /tmp/gpg-key-recipe579463391" dir= execID=cc78c
repo-server time="2024-09-23T08:53:42Z" level=info msg=Trace args="[gpg --no-permission-warning --logger-fd 1 --batch --gen-key /tmp/gpg-key-recipe579463391]" dir= operation_name="exec gpg" time_ms=279.96838299999996
repo-server time="2024-09-23T08:53:42Z" level=info msg="Populating GnuPG keyring with keys from /app/config/gpg/source"
repo-server time="2024-09-23T08:53:42Z" level=info msg="gpg --no-permission-warning --list-public-keys" dir= execID=f9dfc
repo-server time="2024-09-23T08:53:42Z" level=info msg=Trace args="[gpg --no-permission-warning --list-public-keys]" dir= operation_name="exec gpg" time_ms=2.816586
repo-server time="2024-09-23T08:53:42Z" level=info msg="gpg --no-permission-warning -a --export 834171B6FD23DA43" dir= execID=f7783
repo-server time="2024-09-23T08:53:42Z" level=info msg=Trace args="[gpg --no-permission-warning -a --export 834171B6FD23DA43]" dir= operation_name="exec gpg" time_ms=2.151414
repo-server time="2024-09-23T08:53:42Z" level=info msg="gpg-wrapper.sh --no-permission-warning --list-secret-keys 834171B6FD23DA43" dir= execID=3be11
repo-server time="2024-09-23T08:53:42Z" level=info msg=Trace args="[gpg-wrapper.sh --no-permission-warning --list-secret-keys 834171B6FD23DA43]" dir= operation_name="exec gpg-wrapper.sh" time_ms=4.664008
repo-server time="2024-09-23T08:53:42Z" level=info msg="Loaded 0 (and removed 0) keys from keyring"
repo-server time="2024-09-23T08:53:42Z" level=info msg="argocd-repo-server is listening on :8081"
repo-server time="2024-09-23T08:53:42Z" level=info msg="Starting GPG sync watcher on directory '/app/config/gpg/source'"
repo-server time="2024-09-23T08:54:00Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Check grpc.service=grpc.health.v1.Health grpc.start_time="2024-09-23T08:54:00Z" grpc.time_ms=0.048 span.kind=server system=grpc
repo-server time="2024-09-23T08:54:10Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Check grpc.service=grpc.health.v1.Health grpc.start_time="2024-09-23T08:54:10Z" grpc.time_ms=0.024 span.kind=server system=grpc
repo-server time="2024-09-23T08:54:20Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Check grpc.service=grpc.health.v1.Health grpc.start_time="2024-09-23T08:54:20Z" grpc.time_ms=0.018 span.kind=server system=grpc
repo-server time="2024-09-23T08:54:30Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Check grpc.service=grpc.health.v1.Health grpc.start_time="2024-09-23T08:54:30Z" grpc.time_ms=0.018 span.kind=server system=grpc
repo-server time="2024-09-23T08:54:40Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Check grpc.service=grpc.health.v1.Health grpc.start_time="2024-09-23T08:54:40Z" grpc.time_ms=0.019 span.kind=server system=grpc
repo-server time="2024-09-23T08:54:50Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Check grpc.service=grpc.health.v1.Health grpc.start_time="2024-09-23T08:54:50Z" grpc.time_ms=0.018 span.kind=server system=grpc
repo-server time="2024-09-23T08:55:00Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Check grpc.service=grpc.health.v1.Health grpc.start_time="2024-09-23T08:55:00Z" grpc.time_ms=0.02 span.kind=server system=grpc
repo-server time="2024-09-23T08:55:10Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Check grpc.service=grpc.health.v1.Health grpc.start_time="2024-09-23T08:55:10Z" grpc.time_ms=0.026 span.kind=server system=grpc
repo-server time="2024-09-23T08:55:20Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Check grpc.service=grpc.health.v1.Health grpc.start_time="2024-09-23T08:55:20Z" grpc.time_ms=0.018 span.kind=server system=grpc

agaudreault commented 5 days ago

Were your test pods in the same namespace as Argo CD and on the same node? In my experience, this has always been a problem with outbound rules external to Argo CD. Check whether you have a nodeSelector or something similar that places the Argo CD pods on a node without the required permissions. Also, the public GitHub IPs are listed at https://api.github.com/meta, just to be sure you used the correct ones.
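
To double-check the allowed ranges against the list published at https://api.github.com/meta, something like the following could be used. This is a sketch under assumptions: the `"git"` section is taken to be the relevant one for repository access, `allowed` stands for whatever CIDRs were entered in the security group, and `fetch_github_cidrs` and `uncovered` are hypothetical helper names.

```python
import ipaddress
import json
import urllib.request

META_URL = "https://api.github.com/meta"


def fetch_github_cidrs(section: str = "git", timeout_s: float = 10.0) -> list[str]:
    """Download one section (e.g. "git" or "web") of GitHub's published ranges."""
    with urllib.request.urlopen(META_URL, timeout=timeout_s) as resp:
        return json.load(resp)[section]


def uncovered(published: list[str], allowed: list[str]) -> list[str]:
    """Return the published CIDRs that are not contained in any allowed CIDR."""
    allowed_nets = [ipaddress.ip_network(c, strict=False) for c in allowed]
    missing = []
    for cidr in published:
        net = ipaddress.ip_network(cidr, strict=False)
        # subnet_of() raises TypeError across IP versions, so compare versions first.
        if not any(net.version == a.version and net.subnet_of(a) for a in allowed_nets):
            missing.append(cidr)
    return missing
```

An empty result from `uncovered(fetch_github_cidrs("git"), allowed)` would suggest the security group rules already cover the published Git ranges, which would point back at node placement or network policy rather than the IP list.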