Open praveenperera opened 2 years ago
I also ran into this, the liveness probe didn't seem to help.
Failed to query provider "https://argocd.mysite.com/api/dex": Get "http://argo-cd-argocd-dex-server:5556/api/dex/.well-known/openid-configuration": dial tcp 172.20.233.60:5556: connect: connection refused
@sarasensible it's surprising that the liveness probe didn't work. Double check the configuration?
Yeah for some reason my Argo CD chart isn't picking up any changes I make to the dex configuration, so it's possible this could work and it's just not deploying. Very confusing.
Update: confirmed this was an issue with how I was deploying - dumb mistake recorded for all time in https://github.com/helm/helm/issues/10880#issuecomment-1106722961
I got this as well on OpenShift, specifically when the cluster has been stopped and then restarted:
time="2022-07-22T09:08:02Z" level=info msg="config refresh tokens rotation enabled: true"
failed to initialize server: server: Failed to open connector openshift: failed to open connector: failed to create connector openshift: failed to query OpenShift endpoint Get "https://kubernetes.default.svc/.well-known/oauth-authorization-server": dial tcp: i/o timeout
Restarting the dex pod fixes it. Ah yes, there is no liveness check on the Deployment; that will help!
Also reproduced it :(
This happened again for me on upgrade to chart version 5.20.4 (app version 2.6.1), because somewhere along the line extraVolumes and extraVolumeMounts were renamed to volumes and volumeMounts in the Helm chart. Fixed by renaming the fields in my Helm config.
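For readers hitting the same rename, a values override after the change might look roughly like the sketch below. This is an illustration only: the placement under a `dex:` key, the volume name `google-oidc`, and the Secret name `argocd-google-oidc` are assumptions, not taken from this thread, so check them against the values.yaml of the chart version you actually deploy.

```yaml
# Hypothetical Helm values override (names below are placeholders).
dex:
  # Previously extraVolumes, per the comment above.
  volumes:
    - name: google-oidc              # assumed volume name
      secret:
        secretName: argocd-google-oidc   # assumed Secret holding the service account JSON
  # Previously extraVolumeMounts, per the comment above.
  volumeMounts:
    - name: google-oidc
      mountPath: /tmp/oidc           # assumed mount path
      readOnly: true
```

If the old field names are left in place, the override can be ignored without an error from Helm, which would match the symptom described here.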
Same problem, happened twice in the last month, running on the latest argo cd version (currently 2.6.2) with the latest helm chart. Fixed by restarting Dex. Besides these two times the google login works perfectly.
Last log message from dex:
failed to initialize server: server: Failed to open connector google: failed to open connector: failed to create connector google: failed to get provider: Get "https://accounts.google.com/.well-known/openid-configuration": dial tcp: lookup accounts.google.com on IP_ADDRESS: read udp IP_ADDRESS:PORT->IP_ADDRESS:PORT: read: connection refused
same issue here in 2023 but with GitHub auth. still relevant.
Providing the dex config in the following manner, then updating the argocd-cm ConfigMap and restarting the argocd-dex-server-xyz123 pod, worked like a charm for me.
dex.config: |
  connectors:
  - config:
      issuer: https://accounts.google.com
      redirectURI: https://argocd.example.com/api/dex/callback
      clientID: abc-xys.apps.googleusercontent.com
      clientSecret: abc-XYZ_123
      serviceAccountFilePath: /tmp/oidc/googleAuth.json
      adminEmail: name@example.com
    type: oidc
    id: google
    name: Google
Observed this behavior after upgrading from K8S 1.26 to 1.27. All pods were restarted by the upgrade process but after the upgrade the dex deployment had to be rolled again for Google SSO to work.
I've been using this for a while and have noticed that it is reproducible every time a spot node comes up and this pod gets scheduled there. Every time, I get:
Failed to query provider "https://argocd.<masked>.europe-west3-gcloud.internal.<masked>.io/api/dex": Get "https://argocd-dex-server:5556/api/dex/.well-known/openid-configuration": dial TCP <PRIVATE_IP>:5556: connect: connection refused
Each time, I had to restart this service, after which everything works like a charm, just as it did when I was using on-demand nodes instead of spot.
I can confirm this issue is reproduced on spot nodes in GCP with google SSO
Same here using the standard Helm chart. Killing the dex-server pod fixes the problem.
We have the same issue in GCP preemptible (spot) instances with helm chart version 6.10.2.
I had the same issue. My solution was the same as proposed, plus a startupProbe on /healthz. Here is the kustomize patch (if you do not use Helm):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-dex-server
spec:
  template:
    spec:
      containers:
      - name: dex
        startupProbe:
          failureThreshold: 3
          httpGet:
            path: /api/dex/healthz
            port: 5556
            scheme: HTTPS
          initialDelaySeconds: 30
          periodSeconds: 30
          successThreshold: 1
          timeoutSeconds: 5
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /api/dex/.well-known/openid-configuration
            port: 5556
            scheme: HTTPS
          initialDelaySeconds: 30
          periodSeconds: 30
          successThreshold: 1
          timeoutSeconds: 5
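To show how a patch like the one above could be wired in, here is a minimal kustomization.yaml sketch. It assumes the patch is saved next to it as `dex-probes.yaml` (a hypothetical file name), that Argo CD is installed from the upstream `install.yaml` manifest, and that it lives in the `argocd` namespace; adjust all three to your setup.

```yaml
# kustomization.yaml (sketch; file name, namespace, and manifest source are assumptions)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: argocd

resources:
  # Upstream install manifest; pin a specific release tag instead of "stable" if you prefer.
  - https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

patches:
  # Strategic merge patch; matched to the Deployment by kind and metadata.name.
  - path: dex-probes.yaml
```

Rendering with `kubectl kustomize .` before applying makes it easy to confirm that both probes ended up on the `dex` container.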
Same issue here with argocd 2.10.9 in EKS
Looks like the same issue with chart 7.6.12
We had the same issue with 7.6.12, but the fix was to comment out the dex version in our values.yaml override, which let the chart upgrade dex to a newer version.
Checklist:
argocd version
Describe the bug
The dex server randomly stops responding. When you try to log in with Google, you get this error on the web page:
In the pod
To Reproduce
It happens randomly, but set up Argo CD with Google auth and wait.
Version
v2.3.1+b65c169, but I've also seen it in older versions.
Restarting the Argo CD dex workload fixes it, but the problem appears again after some time. My current fix has been to set up a liveness probe. I can open a PR.