argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

Dex SSO failing with "http: server gave HTTP response to HTTPS client" #9998

Open ab-arao opened 1 year ago

ab-arao commented 1 year ago

Checklist:

Describe the bug

We receive the following error message when trying to log in to argocd with SSO:

Failed to query provider "https://<server>/api/dex": Get "https://argocd-dex-server:5556/api/dex/.well-known/openid-configuration": http: server gave HTTP response to HTTPS client

I'm wondering if it could be related to any of the following recent changes:

We are using the bundled dex instance.

I've tried changing the oidc.tls.insecure.skip.verify setting to true as suggested but haven't seen any improvement. I'm considering pinning all the container images to 2.4.4 in the cluster but wanted to see if anyone else was seeing this or had thoughts first.

To Reproduce

We see this error whenever we click on the SSO login button in the UI.

Expected behavior

Login should occur.

Version

We do not have the default admin login turned on and because our CLI login uses SSO, I'm not able to get the actual output.

However our manifests are configured to use the latest version of argocd and so I would think we are on 2.4.5 or 2.4.6.

Updated version info after getting SSO working again with the changes below

$ argocd version --grpc-web
argocd: v2.1.6+a346cf9.dirty
  BuildDate: 2021-11-01T02:06:48Z
  GitCommit: a346cf933e10d872eae26bff8e58c5e7ac40db25
  GitTreeState: dirty
  GoVersion: go1.17.2
  Compiler: gc
  Platform: darwin/amd64
argocd-server: v2.4.0+7d31d61
  BuildDate: 2022-07-14T21:08:57Z
  GitCommit: 7d31d612ef2309a4d6dfef3a47e565af4742586f
  GitTreeState: clean
  GoVersion: go1.18.4
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v4.5.5 2022-05-20T20:25:40Z
  Helm Version: v3.9.0+g7ceeda6
  Kubectl Version: v0.24.2
  Jsonnet Version: v0.18.0

Logs

Unable to find any relevant logs.

crenshaw-dev commented 1 year ago

I bet it was https://github.com/argoproj/argo-cd/security/advisories/GHSA-7943-82jg-wmw5.

https://github.com/argoproj/argo-cd/issues/9424 wasn't (and won't be) backported to 2.4.

Have you set server.dex.server in argocd-cmd-params-cm?

ab-arao commented 1 year ago

Thanks very much for having a look @crenshaw-dev

The argocd-cmd-params-cm configmap is empty, so no. Could you let me know what that's supposed to be set to? Based on the code I'm guessing the "http" url of the dex server instead of the "https" one in the error message, like the following:

data:
  server.dex.server: "http://argocd-dex-server:5556/"

I did try setting dex.server.disable.tls: true in argocd-cm but it sounds like you are saying that setting is a red herring. I also did try editing the argo-server deployment to use v2.4.4 but the UI didn't load.

ab-arao commented 1 year ago

Thanks to @crenshaw-dev, we appear to be up and running again after doing the following:

crenshaw-dev commented 1 year ago

@ab-arao It's interesting that you had to set that. It should be the default: https://github.com/argoproj/argo-cd/blob/release-2.4/common/common.go#L17

The OIDC client in Argo CD's "session manager" includes a URL rewriter - each OIDC request to https://argocd.example.com/api/dex/* actually ends up being sent to the address at server.dex.server (or, the fallback default linked above).

Since the scheme in that rewritten URL somehow ended up being HTTPS, and Dex currently only serves HTTP, you got the above error. How the heck that scheme got rewritten to be https instead of http, I have no idea.
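The rewriting described above can be pictured with a small sketch. This is an illustration in Python, not Argo CD's Go implementation: the function name rewrite_to_dex is hypothetical, and the sketch assumes that only the scheme and host are swapped while the path and query are kept.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_to_dex(request_url: str, dex_server_addr: str) -> str:
    """Redirect a request aimed at the public issuer URL to the Dex address.

    Hypothetical helper: scheme and host come from dex_server_addr,
    path and query are kept from the original request.
    """
    req = urlsplit(request_url)
    dex = urlsplit(dex_server_addr)
    return urlunsplit((dex.scheme, dex.netloc, req.path, req.query, req.fragment))


issuer_request = "https://argocd.example.com/api/dex/.well-known/openid-configuration"

# With an http address, the request reaches Dex over plain HTTP.
print(rewrite_to_dex(issuer_request, "http://argocd-dex-server:5556"))

# With an https address, the result has the same shape as the URL in the
# error at the top of this issue; a Dex that only serves HTTP cannot answer it.
print(rewrite_to_dex(issuer_request, "https://argocd-dex-server:5556"))
```

Under these assumptions, the scheme of the outgoing request is decided entirely by the Dex address, which is why an unexpected https there would produce the reported error.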

At any rate, glad you got it working!

ab-arao commented 1 year ago

@crenshaw-dev

This started happening again just now. The command params configmap still shows the http dex server:

$ kubectl describe configmap argocd-cmd-params-cm -n argocd
Name:         argocd-cmd-params-cm
Namespace:    argocd
Labels:       app=argocd-config
              app.kubernetes.io/instance=argocd
              app.kubernetes.io/name=argocd-cmd-params-cm
              app.kubernetes.io/part-of=argocd
              env=internal
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","kind":"ConfigMap","metadata":{"annotations":{},"labels":{"app":"argocd-config","app.kubernetes.io/instance":"argocd","...

Data
====
server.dex.server:
----
http://argocd-dex-server:5556/
Events:  <none>

I'm trying to identify if something else has changed today or over the weekend.

crenshaw-dev commented 1 year ago

This is very very weird. I've seen one other case where the Dex URL seemed to change with no reason (in that case, the hostname changed as well, not just the scheme).

The URL is set when the dex round tripper is created, which happens when the session manager is created, which happens when the server is created by the CLI. The CLI gets the Dex address from the --dex-server arg or env var, which is not set by default, and otherwise falls back to http://argocd-dex-server:5556.
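As a rough sketch of that lookup order, assuming the flag wins over the environment variable and the default is used last: the function name resolve_dex_server is hypothetical, and the environment variable name ARGOCD_SERVER_DEX_SERVER is an assumption that should be checked against your argocd-server Deployment.

```python
import os
from typing import Mapping, Optional

# Fallback address named in this thread.
DEFAULT_DEX_SERVER_ADDR = "http://argocd-dex-server:5556"


def resolve_dex_server(flag_value: Optional[str], env: Mapping[str, str]) -> str:
    """Pick the Dex address: explicit flag, then env var, then the default."""
    if flag_value:
        return flag_value
    # Assumed env var name; an empty value is treated the same as unset.
    return env.get("ARGOCD_SERVER_DEX_SERVER") or DEFAULT_DEX_SERVER_ADDR


# Nothing configured: the default applies.
print(resolve_dex_server(None, {}))

# Value as it would arrive from argocd-cmd-params-cm via the environment.
print(resolve_dex_server(None, {"ARGOCD_SERVER_DEX_SERVER": "http://argocd-dex-server:5556/"}))

# An explicit flag (hypothetical address) overrides whatever the process env holds.
print(resolve_dex_server("https://dex.internal:5556", os.environ))
```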

As far as I can tell, nothing mutates the rewrite URL.

Are you getting the error with the UI each time, or are you now seeing it on the CLI?

ab-arao commented 1 year ago

Thanks for your fast response again @crenshaw-dev, really appreciate it.

We are getting the error in the UI as well as the CLI:

$ argocd login <server> --sso --grpc-web
FATA[0000] Failed to query provider "https://<server>/api/dex": 400 Bad Request: Client sent an HTTP request to an HTTPS server.

ab-arao commented 1 year ago

The function I keep returning to when trying to step through the code here is this: https://github.com/argoproj/argo-cd/blob/a2a6d9d6947a3c5b8894f372766bb762c2a588ed/util/dex/dex.go#L126

That function is referenced in several places. I can't tell whether some condition in it could result in the string being returned with https:// prepended to it.

The condition would be the result of there being inconsistent access to the command line parameters config map or something like that.

crenshaw-dev commented 1 year ago

My bad, some of my links were to the latest code. Here's the file for the version you're running: https://github.com/argoproj/argo-cd/blob/v2.4.0/util/dex/dex.go

The http/https switch is new for 2.5. It should always be http in your version.
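For readers trying to picture such a switch, here is a rough sketch of how a scheme could be chosen for an address that lacks one. It is an assumption for illustration, not a copy of the Argo CD source: the function name dex_address_with_scheme and the disable_tls parameter are hypothetical.

```python
def dex_address_with_scheme(addr: str, disable_tls: bool) -> str:
    """Return addr unchanged if it already names a scheme, else prepend one.

    Hypothetical logic: plain HTTP when TLS is disabled, HTTPS otherwise.
    """
    if "://" in addr:
        return addr
    return ("http://" if disable_tls else "https://") + addr


# An explicit scheme is respected regardless of the TLS switch.
print(dex_address_with_scheme("http://argocd-dex-server:5556", disable_tls=False))

# A bare host:port gets its scheme from the switch.
print(dex_address_with_scheme("argocd-dex-server:5556", disable_tls=True))
print(dex_address_with_scheme("argocd-dex-server:5556", disable_tls=False))
```

In a version without such a switch, a sketch like this would not apply and the address would be used exactly as configured.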

ab-arao commented 1 year ago

I turned on SSO debugging by setting the ARGOCD_SSO_DEBUG: 1 environment variable in the argocd-server deployment manifest, and now I'm seeing the following in the logs:

time="2022-07-18T20:40:32Z" level=info msg="Initializing OIDC provider (issuer: https://<server>/api/dex)"
time="2022-07-18T20:40:32Z" level=info msg="GET /api/dex/.well-known/openid-configuration HTTP/1.1\r\nHost: <server>\r\n\r\n"
time="2022-07-18T20:40:32Z" level=info msg="HTTP/1.0 400 Bad Request\r\nConnection: close\r\n\r\nClient sent an HTTP request to an HTTPS server.\n"

Based on searching that initial log message, it looks like the provider is failing to initialize as a result of this line failing: https://github.com/argoproj/argo-cd/blob/62a6c7ae550a724445ac26c027391a7f26ce63af/util/oidc/provider.go#L62

This file talks about lazy-loading the provider configuration in order to avoid a chicken-and-egg problem, which it seems like might be happening here.

Is it possible to edit the argocd-dex-server deployment to pass additional flags to rundex, so that the provider gets the correct configuration for the dex server?

Or would setting the ARGOCD_DEX_SERVER_DISABLE_TLS environment variable help?

ab-arao commented 1 year ago

If I change the url in the argocd-cm ConfigMap to not have the protocol, i.e. just <server> instead of https://<server> I now get the following error message when I click the SSO login button:

Failed to query provider "<server>/api/dex": Get "http://argocd-dex-server:5556/<server>/api/dex/.well-known/openid-configuration": EOF

In contrast, before when url was set to https://<server> I got the previous error:

Failed to query provider "https://<server>/api/dex": 400 Bad Request: Client sent an HTTP request to an HTTPS server.

And setting the url to http://<server> gives the following error:

Failed to query provider "http://<server>/api/dex": 400 Bad Request: Client sent an HTTP request to an HTTPS server.

The settings being passed into https://github.com/argoproj/argo-cd/blob/62a6c7ae550a724445ac26c027391a7f26ce63af/util/oidc/provider.go#L63 seem to be the problem here.
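The first of the three cases above (url without a protocol) is consistent with how URL parsers treat a scheme-less string: the host name ends up in the path. The sketch below uses Python's urllib.parse as a stand-in for Go's net/url, which is an assumption about equivalent behavior for this input; argocd.example.com stands in for <server>, and rewritten_discovery_url is a hypothetical helper. It does not explain the 400 responses in the other two cases.

```python
from urllib.parse import urlsplit, urlunsplit

DEX_ADDR = "http://argocd-dex-server:5556"
DISCOVERY_PATH = "/api/dex/.well-known/openid-configuration"


def rewritten_discovery_url(argocd_url: str) -> str:
    """Build the discovery URL from the configured url, then point it at Dex."""
    req = urlsplit(argocd_url.rstrip("/") + DISCOVERY_PATH)
    dex = urlsplit(DEX_ADDR)
    # Only scheme and host are replaced; whatever parsed as the path is kept.
    return urlunsplit((dex.scheme, dex.netloc, req.path, req.query, req.fragment))


for configured in ("https://argocd.example.com", "argocd.example.com"):
    parts = urlsplit(configured + DISCOVERY_PATH)
    # Without a scheme, the parser sees no host at all, only a relative path.
    print(f"{configured!r}: host={parts.netloc!r} path={parts.path!r}")
    print("  ->", rewritten_discovery_url(configured))
```

With the scheme-less value, the rewritten URL has the same shape as the one in the EOF error: the Dex address followed by the server name as a path segment.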

ab-arao commented 1 year ago

Continuing to play with configuration permutations, I tried setting both the url in argocd-cm to https://<server> and server.dex.server in argocd-cmd-params-cm to https://argocd-dex-server:5556.

This resulted in me being able to log in using SSO, except there was some strange behavior:

What is even more interesting is that now this login process works regardless of whether the server.dex.server in argocd-cmd-params-cm is set to http or https.

The login works through the cli as well.

I notice that we are now on a different version of argocd server:

$ argocd version
argocd: v2.1.6+a346cf9.dirty
  BuildDate: 2021-11-01T02:06:48Z
  GitCommit: a346cf933e10d872eae26bff8e58c5e7ac40db25
  GitTreeState: dirty
  GoVersion: go1.17.2
  Compiler: gc
  Platform: darwin/amd64
argocd-server: v2.4.0+708906d
  BuildDate: 2022-07-16T16:32:43Z
  GitCommit: 708906d063cf737f37421c6ac6111b0b1dd5123f
  GitTreeState: clean
  GoVersion: go1.18.4
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v4.5.5 2022-05-20T20:25:40Z
  Helm Version: v3.9.0+g7ceeda6
  Kubectl Version: v0.24.2
  Jsonnet Version: v0.18.0

Now we are on v2.4.0+708906d whereas when the initial issue came up we were on v2.4.0+7d31d61.

We are in a better state than at the start of the day because we can log in again, but I'd appreciate any guidance on how we can restore the normal SSO login flow and keep it stable.

crenshaw-dev commented 1 year ago

"What is even more interesting is that now this login process works regardless of whether the server.dex.server in argocd-cmd-params-cm is set to http or https."

Yeah, that makes no sense to me, because the OIDC client should be using whatever scheme is set in server.dex.server.

Apologies for the slow response, coming back to this tomorrow.

ab-arao commented 1 year ago

We updated again and I had to run through the restart dance to restore the janky login.

Current version:

➜  ~ argocd version
argocd: v2.1.6+a346cf9.dirty
  BuildDate: 2021-11-01T02:06:48Z
  GitCommit: a346cf933e10d872eae26bff8e58c5e7ac40db25
  GitTreeState: dirty
  GoVersion: go1.17.2
  Compiler: gc
  Platform: darwin/amd64
argocd-server: v2.4.0+e343d3b
  BuildDate: 2022-07-22T15:58:46Z
  GitCommit: e343d3bb15e35917e1c935375078a342a90e63cd
  GitTreeState: clean
  GoVersion: go1.18.4
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v4.5.5 2022-05-20T20:25:40Z
  Helm Version: v3.9.0+g7ceeda6
  Kubectl Version: v0.24.2
  Jsonnet Version: v0.18.0

What appears to work is the following:

  1. edit server.dex.server in argocd-cmd-params-cm to be whichever of http or https it isn't already
  2. kubectl rollout restart deployment argocd-server -n argocd

Then it is possible to log in with the following steps as before:

  1. click login
  2. wait to get redirected back to the page https://<server>/auth/callback?code=<code>&state=<state>
  3. then navigate to https://<server> in that same browser as the cookie for argocd.token does get set properly

crenshaw-dev commented 1 year ago

I don't follow the last two. What does it mean to navigate to https://? Looks like you're already redirected to an https:// URL.

ab-arao commented 1 year ago

@crenshaw-dev sorry, the formatting got messed up for some reason.

Basically the redirect does not result in the argocd ui being loaded. Instead we see a page that just shows the jwt token from our IdP.

However, the redirect does add the correct cookie, so if we change the URL in the browser to the argocd homepage, we are logged in.

tonedefdev commented 1 year ago

I use the official argo-cd Helm chart to deploy the application and recently started seeing the same error reported here after upgrading to v2.4.18 from v2.4.14. To fix the issue I had to set this in the Helm chart:

argo-cd:
    params:
      dexserver.disable.tls: true

This ensured all future deployments of the Helm chart forced dex to use the http endpoint instead of https endpoint by setting the ConfigMap values for argocd-cmd-params-cm to the following:

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
data:
  # ... data omitted for brevity ...
  dexserver.disable.tls: "true"
  server.dex.server: http://release-name-argocd-dex-server:5556
  server.dex.server.strict.tls: "false"
  # ... data omitted for brevity ...

I hope this helps someone if they stumble upon this issue and are using the official Helm chart.

crenshaw-dev commented 1 year ago

If you are using Argo CD <v2.5.0, then @tonedefdev's solution above is acceptable. But please make a note that when you upgrade to >=v2.5.0, you should remove these settings to take advantage of HTTPS communication between the API server and Dex.

If you are using Argo CD >=v2.5.0, please do not disable TLS as a fix for this error message. TLS support was added because it is a very important security mechanism for a very sensitive communication channel.

This error message occurs for one of two reasons:

  1. "http: server gave HTTP response to HTTPS client": The API server expected to communicate via HTTPS, but Dex expected to communicate via HTTP.
  2. "Client sent an HTTP request to an HTTPS server" (the form seen earlier in this thread): The API server expected to communicate via HTTP, but Dex expected to communicate via HTTPS.

This configuration mismatch is a solvable problem, and it should be solved by identifying the mismatch rather than hacked around by making things less secure.
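As a first step toward identifying the mismatch, the direction can be read off the message text. The sketch below is a hypothetical helper, not part of Argo CD; it matches only the two message fragments that appear in this thread and makes no claim about other wordings.

```python
def diagnose(error_text: str) -> str:
    """Map an SSO error message to the direction of the HTTP/HTTPS mismatch."""
    text = error_text.lower()
    if "server gave http response to https client" in text:
        # The caller started a TLS handshake; the peer replied in plain HTTP.
        return "client used HTTPS, server speaks plain HTTP"
    if "client sent an http request to an https server" in text:
        # The caller sent plain HTTP; the peer expected a TLS handshake.
        return "client used plain HTTP, server expects HTTPS"
    return "no scheme mismatch recognized"


ui_error = (
    'Get "https://argocd-dex-server:5556/api/dex/.well-known/openid-configuration": '
    "http: server gave HTTP response to HTTPS client"
)
cli_error = "400 Bad Request: Client sent an HTTP request to an HTTPS server."

print(diagnose(ui_error))
print(diagnose(cli_error))
```

Knowing the direction narrows the search to one side: either the address the API server uses for Dex, or the TLS setting of the server that answered.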

A common configuration mismatch is when folks manually set global.image.tag in the argo-helm argo-cd chart. This may cause the Argo CD image to not match the manifests.

There are a thousand other possible ways to produce a mismatch, so I can't enumerate the solutions here. @tonedefdev found a valid solution was manually disabling TLS. This is valid, because their version of Argo CD does not support TLS communication between the API server and Dex. But if they were using 2.5.0+, they would need to dig deeper to find the config mismatch.

Drezir commented 1 week ago

I also have this problem.

I have Keycloak behind an ingress with cert-manager and Let's Encrypt. When I access the argocd web UI and click on Keycloak, I get this message.

I even tried to publish the argocd-dex server behind an HTTPS ingress with Let's Encrypt and configure it in the argocd config map, but it does not work.