Open rzvn2600 opened 2 years ago
I am also getting this error:
Failed to pull image "quay.io/argoproj/argocd:v2.4.6": rpc error: code = Unknown desc = failed to pull and unpack image "quay.io/argoproj/argocd:v2.4.6": failed to resolve reference "quay.io/argoproj/argocd:v2.4.6": failed to do request: Head https://quay.io/v2/argoproj/argocd/manifests/v2.4.6: proxyconnect tcp: dial tcp 10.224.2.11:9119: i/o timeout
Any idea how to debug this, please?
I had a similar issue with the v2.4.7 version of install.yaml. It seems to indicate that the argocd server pod can't communicate with the argo repo server pod. For me the fix was to delete/disable all the network policies; after a few minutes everything started working.
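Before deleting policies outright, it may help to look at what is actually in place. A minimal sketch, assuming the `argocd` namespace used elsewhere in this thread and a kubectl context that already points at the affected cluster; `show_argocd_netpols` and `describe_argocd_netpol` are hypothetical helper names, not part of any tool:

```shell
# Hypothetical helper: list NetworkPolicy objects in the Argo CD namespace.
# The default namespace "argocd" is an assumption taken from this thread.
show_argocd_netpols() {
    ns="${1:-argocd}"
    kubectl get networkpolicy -n "$ns"
}

# Hypothetical helper: print one policy in full so its pod selector and
# ingress rules can be compared with the pods that are timing out.
describe_argocd_netpol() {
    ns="${2:-argocd}"
    kubectl describe networkpolicy "$1" -n "$ns"
}
```

Running `show_argocd_netpols` and then `describe_argocd_netpol <name>` on the repo-server policy should show which pods are allowed to reach it, which is a narrower test than removing every policy.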
Hi, I'm getting something similar when adding a GitLab repo with argocd 2.4.9, with either SSH or HTTPS.
argocd repo add git@gitlab.com:/account-name/yaml-holder.git --ssh-private-key-path ./argocd_id_rsa
FATA[0132] rpc error: code = Unknown desc = error testing repository connectivity: dial tcp 172.65.251.78:22: connect: connection timed out
argocd repo add https://gitlab.com/account-name/yaml-holder.git --username myuser --password mypassword
FATA[0015] rpc error: code = Unknown desc = error testing repository connectivity: Get "https://gitlab.com/account-name//yaml-holder.git/info/refs?service=git-upload-pack": dial tcp 172.65.251.78:443: i/o timeout (Client.Timeout exceeded while awaiting headers)
I am also able to clone the repo from inside the pod...
Any ideas?
Having this issue as well, and it definitely points to an outright inability for ArgoCD to reach its own repo server.
From an argocd app sync we receive:
ComparisonError: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 172.xx.xx.xx:8081: i/o timeout"
Meanwhile, k logs deploy/argocd-repo-server doesn't show any evidence at all that the repo server was actually reachable or received any RPC request.
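One way to narrow this down is to check whether the repo server Service has any ready endpoints behind it. An empty endpoint list and a blocked pod-to-pod path would both leave the repo server log silent. A rough sketch: the Service name `argocd-repo-server` and port 8081 come from the errors in this thread, while the `argocd` namespace default and the function name `check_repo_server_path` are assumptions:

```shell
# Hypothetical helper: show the pieces that sit between the caller and
# the repo server. Nothing here changes cluster state.
check_repo_server_path() {
    ns="${1:-argocd}"
    # ClusterIP and port the other components dial (8081 in the error above).
    kubectl get service argocd-repo-server -n "$ns" -o wide
    # Pod IPs behind the Service; an empty list means no ready repo-server pod.
    kubectl get endpoints argocd-repo-server -n "$ns"
    # Node placement of every pod, to spot failures that only occur across nodes.
    kubectl get pods -n "$ns" -o wide
}
```

If the endpoints are populated and the timeout only shows up when the caller and the repo server land on different nodes, that points at the CNI or node-level firewalling instead of Argo CD itself.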
I ran into this issue with Argo CD deployed to an EKS cluster managed by Terraform. The problem was that I had configured the managed node group defaults in Terraform with attach_cluster_primary_security_group = false. This meant that the cluster security group was not attached to the nodes. The only rule in the cluster security group was a self-referential rule allowing all traffic.
Attaching the EKS-created cluster security group to the nodes by modifying the Terraform configuration resolved the issue for me.
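For anyone checking the same thing on EKS, the cluster security group ID and the instances that carry it can be read with the AWS CLI. This is a sketch only: the function names and the cluster-name argument are placeholders, and it assumes the AWS CLI is already configured for the right account and region.

```shell
# Hypothetical helper: print the EKS-created cluster security group ID.
# $1 is the EKS cluster name (placeholder, supply your own).
cluster_sg_id() {
    aws eks describe-cluster --name "$1" \
        --query 'cluster.resourcesVpcConfig.clusterSecurityGroupId' \
        --output text
}

# Hypothetical helper: list the EC2 instance IDs that have the given
# security group attached. $1 is a security group ID.
instances_with_sg() {
    aws ec2 describe-instances \
        --filters "Name=instance.group-id,Values=$1" \
        --query 'Reservations[].Instances[].InstanceId' \
        --output text
}
```

Something like `instances_with_sg "$(cluster_sg_id my-cluster)"` should list the worker nodes; if they are missing from the output, the group is not attached to them.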
Any updates? Mine still doesn't work even with all NetworkPolicy objects disabled.
DavidKittleSEL's solution does not fit a production scenario, but on a test cluster it's fine. I also deleted the pods in the argocd namespace afterwards. Maybe deleting the application-controller pod is enough (see https://github.com/argoproj/argo-cd/issues/10666#issuecomment-1277424370). Then after a few minutes, everything settled.
I installed argocd like this
kubectl create ns argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
This installed v2.11.3+3f344d5 for me and with this version I'm having this issue.
Is this still an issue in newer versions? I see this is for v2.4.6, but that's currently not supported (and quite old). Does this exist in v2.9+?
See my edited comment
Having similar issues, and seeing that it happens when the repo-server scales down (HPA). With notifications enabled on the Unknown status, Argo CD will report it and you can compare it to the scaling actions.
Have the same error with v2.9.18+151ee6a on minikube (podman)
argocd repo add https://github.com/argoproj/argocd-example-apps.git
FATA[0020] rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp: lookup argocd-repo-server: i/o timeout"
UPD1
I am using minikube on podman and just realized that my installation is not healthy:
Failed to inspect image "redis:7.0.15-alpine": rpc error: code = Unknown desc = short-name "redis:7.0.15-alpine" did not resolve to an alias and no unqualified-search registries are defined in "/etc/containers/registries.conf"
This is because podman doesn't resolve short names. See this.
UPD2: Go to the deployments and change the Redis image to a fully qualified name:
image: docker.io/redis:7.0.15-alpine
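The same change can probably be made without editing the manifest by hand. A sketch, assuming the Deployment is named `argocd-redis` and its container `redis` (neither name appears in this thread, so confirm with `kubectl get deploy -n argocd` first); `qualify_image` and `pin_redis_image` are made-up helper names:

```shell
# Prefix docker.io/ when an image reference has no registry host.
# A first path component containing "." or ":" (or equal to "localhost")
# is treated as a registry host and the reference is left alone.
qualify_image() {
    case "$1" in
        */*) first="${1%%/*}" ;;
        *)   first="" ;;
    esac
    case "$first" in
        *.*|*:*|localhost) printf '%s\n' "$1" ;;
        *)                 printf 'docker.io/%s\n' "$1" ;;
    esac
}

# Point the Redis container at the fully qualified image.
# Deployment and container names are assumptions, see the note above.
pin_redis_image() {
    ns="${2:-argocd}"
    kubectl set image deployment/argocd-redis "redis=$(qualify_image "$1")" -n "$ns"
}
```

With these definitions, `pin_redis_image redis:7.0.15-alpine` would set the image to the `docker.io/redis:7.0.15-alpine` value shown above, while an already qualified reference such as `quay.io/argoproj/argocd:v2.4.6` passes through `qualify_image` unchanged.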
I fixed it: The problem was that I was running Flannel CNI. I tried Calico for the CNI now and everything works like a charm. To install Calico, the best and easiest way seems to be helm:
from: https://docs.tigera.io/calico/latest/getting-started/kubernetes/helm
helm repo add projectcalico https://docs.tigera.io/calico/charts
kubectl create namespace tigera-operator
helm install calico projectcalico/tigera-operator --version v3.28.0 --namespace tigera-operator
Then it takes about a minute until everything is running and ready.
It should be mentioned in the argocd requirements/readme that it doesn't work with Flannel
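To see which CNI a cluster is actually running before swapping it out, the pod list is usually enough. A sketch: the function name is made up, and matching pod names against a few well-known CNI names is only a heuristic.

```shell
# Hypothetical helper: list pods whose lines mention a well-known CNI.
list_cni_pods() {
    # -A lists pods in all namespaces; grep keeps only lines naming a CNI.
    kubectl get pods -A -o wide | grep -Ei 'flannel|calico|cilium|weave'
}
```

An empty result does not prove that no CNI is installed, since some distributions run the CNI inside another component instead of as separate pods.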
I am getting the same error on a k3s multinode cluster (v1.30.3+k3s1) and using argocd version v2.12.3+6b9cd82. So far restarting pods has had no effect. I don't really want to be deleting network policies either.
I probably don't understand the problem but grep does not see any mention of "argocd-redis-ha-haproxy" or "haproxy" in the install file. That is in reference to the error:
Unable to load data: error getting cached app managed resources: dial tcp: lookup argocd-redis-ha-haproxy on 10.43.0.10:53: no such host
I also can't see any service with such a name, so I'm not sure why it is trying to look up that specific name, or where that name would be created in the first place.
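The errors posted in this thread look alike but fail at different layers, so it may help to sort them before picking a fix. A rough sketch that only matches substrings of the messages quoted above; the function name and the category labels are made up for illustration:

```shell
# Hypothetical helper: map an error message to the layer that failed.
# Patterns are checked in order, so the proxy case wins over plain TCP.
classify_argocd_net_error() {
    case "$1" in
        *proxyconnect*)             echo "proxy" ;;            # HTTP(S) proxy unreachable
        *"no such host"*)           echo "dns-missing-name" ;; # DNS answered, name unknown
        *"lookup "*"i/o timeout"*)  echo "dns-unreachable" ;;  # DNS server never answered
        *"dial tcp "*"i/o timeout"*|*"connection timed out"*)
                                    echo "tcp-blocked" ;;      # packets dropped on the way
        *)                          echo "unknown" ;;
    esac
}
```

By this sorting, the image pull failure is a proxy problem, the `argocd-redis-ha-haproxy` lookup is a name that does not exist in cluster DNS, the `lookup argocd-repo-server: i/o timeout` case is cluster DNS not answering, and the remaining timeouts are dropped TCP traffic (network policies, security groups, or the CNI).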
Describe the bug
Hi All,
I am trying to deploy an application into a fresh argocd installation and it does not work. I cannot create an app from the UI/CLI or through kubectl apply. Everything fails with the following error:
Pod stats:
kubectl logs:
argocd-server
argocd-repo-server different app deployed:
CLI logs from app deployed through kubectl:
app.yaml:
But from inside the pod I can easily clone the repo without any credentials, so there is no networking issue:
Basically I cannot even create any app inside argocd because of the above error.
To Reproduce
Follow the installation step by step from the following argocd "getting started" documentation: https://argo-cd.readthedocs.io/en/stable/getting_started/
Expected behavior
It should add at least the example repository from the documentation and start working but instead I can see the error mentioned above.
Version
Logs
Please note, just in case, that I can connect to any pod from the argocd deployments and directly clone the above-mentioned xxx repository manually without any issues. There is no network policy that forbids access.
I have successfully re-deployed the whole of argocd in EKS. I have also tried to inject the env variables (_ARGOCD_REPO_SERVER_LOGLEVEL=debug && ARGOCD_REPO_SERVERLOGLEVEL=debug) to get more insight into the issue, but without any result. The same error appears.