Open kingdonb opened 1 year ago
Add to this, I think this page is out of date:
https://docs.gitops.weave.works/docs/configuration/tls/
I don't find serverTLS
among the values that I can choose from here:
https://github.com/weaveworks/weave-gitops-enterprise/blob/main/charts/mccp/values.yaml
(A separate issue is that the repo for weave-gitops-enterprise chart is not public, so I'm not sure how users are supposed to discover what other values they can input there, if they need something that isn't specifically explained in a particular doc.)
I tried enabling ingress as it was explained, and I got persistent errors from the backend which is apparently still serving TLS
Reading enough different sources, I am eventually able to figure out that I need
values:
tls:
enabled: false
I finally landed at an enabled ingress which reaches the backend successfully and is doing TLS termination, but the error now is from clicking on the OIDC login button:
{"message":"oidc provider not configured","code":400}
Aha. I was thinking there must be something missing, (this is another bit that's evidently not covered in the doc.)
I'll see what I can figure out by poking around in values.yaml, I see the docs have prompted me to set up GitHub so it knows how to find Weave GitOps callback, but I don't see where I tell Weave GitOps which oauth provider or how to contact it.
I eventually got it working through sheer force of will. I figured out eventually that the staticClients information in Dex's configuration should match what Weave GitOps Enterprise's clientCredentialsSecret
has listed under clientID and clientSecret (not clientId
as the values.yaml had indicated, that was very difficult to figure everything out.)
I'm going to do my best to forget what I learned here so I can still viably test the workflow after improving the docs, because there was definitely a bit missing and I had to make a few flying leaps to get to a working config.
Most surprising of all, when I created a team in my GitHub org, and added myself to it, and Dex's configuration was finally correct, I got an error stating that I was in the correct org but not any teams. Only when I added a sub-team to the team I had created, did Dex let me in. I am not sure why the top-level team did not count as a team, but I did eventually get everything connected, (Can see what it looks like when RBAC is disconnected - perhaps this should be a stop on the tour, to highlight how RBAC and Dex work - I got locked out of Kubernetes even though I was able to get logged into Weave GitOps, because I thought that I could just change the team name, but that was mismatched vs what had been configured in RBAC!)
Finally, at the end of all this, I think that I should be able to use Dex to access the cluster directly, with whatever roles I've been given via RBAC... but I have unfortunately no idea how to do that, or where to look next for more details about that.
I did get RBAC connected via Dex today though, so ๐ this is really cool, and I feel like I solved a massive puzzle box!
Some breaking changes and some improvements to the OIDC workflow seem to have landed this release (0.16)
Today I found my upgraded WG instances in crashloop, because I had disabled the admin user creation but enabled the OIDC. IDK if that is a bug to report, but when I replaced the admin user (which was not possible to have in place at the same time as OIDC configuration, in the prior release) everything started working again!
I think I can talk a bit more broadly about the OIDC user experience now, after doing the full setup for WGE and seeing what that's like with a multi-cluster configuration (the rbac
module in clusters/bases
and how that's used to enable WGE admin access across clusters) I'm spending some time with the WG OSS now, to get a feel for how that experience is like when you are a basic user and your platform team provisioned you just one cluster, with WG OSS installed on it.
I didn't want to spend a bunch of time mucking around in the Dex installation guide before I could say I've done all that. But now I have done all that. I'm not comfortable taking the lead on this right now, but if anyone is working on this doc and has questions, I can definitely help with reviews or otherwise!
@kingdonb would you mind pasting the logs from the crash-looping pod? I have OIDC configured on my personal cluster and the upgrade went through just fine.
@makkes Do you have an admin user defined on your WG instance alongside of OIDC? Those were the conditions (not having any wego-admin user defined) that led my instance to crash looping, I only had it set up this way because before, it seemed that I could not get OIDC working while the admin user was also defined. I guessed they fixed it, and just undid what I had done. I don't think there is any bug, but I'll see if I can reproduce the crash, as I've lost the logs you're looking for now and they might still be relevant while this is at least this fresh in my memory...
Here you go:
(โ|howard-space:flux-system):~ (macos-homebrew *|u=)$ k logs -f ww-gitops-weave-gitops-7f55bd8fcd-m9tng
2023-02-06T22:50:11.619Z INFO gitops cmd/cmd.go:130 Version {"version": "v0.16.0", "git-commit": "8d8a6165", "branch": "HEAD", "buildtime": "2023-02-01_12:33:34"}
I0206 22:50:12.672255 1 request.go:682] Waited for 1.012906257s due to client-side throttling, not priority and fairness, request: GET:https://10.108.113.47:443/apis/authorization.k8s.io/v1?timeout=32s
Error: could not initialise authentication server: could not create auth server: could not get secret for cluster user, secrets "cluster-user-auth" not found
Usage:
[flags]
Flags:
--auth-methods strings Which auth methods to use, valid values are user-account,oidc (default [user-account,oidc])
--custom-oidc-scopes strings Customise the requested scopes for then OIDC authentication flow - openid will always be requested (default [openid,offline_access,email,groups])
--enable-metrics Starts the metrics listener
-h, --help help for this command
--host string UI host (default "0.0.0.0")
--insecure do not attempt to read TLS certificates
--log-level string log level (default "info")
--metrics-address string If the metrics listener is enabled, bind to this address (default ":2112")
--mtls disable enforce mTLS
--notification-controller-address string the address of the notification-controller running in the cluster
--oidc-client-id string The client ID for the OpenID Connect client
--oidc-client-secret string The client secret to use with OpenID Connect issuer
--oidc-groups-claim string JWT claim to use as the user's group. If the claim is present it must be an array of strings (default "groups")
--oidc-issuer-url string The URL of the OpenID Connect issuer
--oidc-redirect-url string The OAuth2 redirect URL
--oidc-secret-name string Name of the secret that contains OIDC configuration (default "oidc-auth")
--oidc-token-duration duration The duration of the ID token. It should be set in the format: number + time unit (s,m,h) e.g., 20m (default 1h0m0s)
--oidc-username-claim string JWT claim to use as the user name. By default email, which is expected to be a unique identifier of the end user. Admins can choose other claims, such as sub or name, depending on their provider (default "email")
--path string Path url
--port string UI port (default "9001")
--tls-cert-file string filename for the TLS certificate, in-memory generated if omitted
--tls-private-key-file string filename for the TLS key, in-memory generated if omitted
--use-k8s-cached-clients Enables the use of cached clients
Error: could not initialise authentication server: could not create auth server: could not get secret for cluster user, secrets "cluster-user-auth" not found
Again, I only disabled the admin user because it couldn't be used together with OIDC at the time, if I remember correctly. I can imagine a user who would want to disable the admin user, but if I set values.adminUser.create: false
this is the result.
Maybe that's a bug?
Aah, I see what happened there and the chart docs/values aren't explicit about it: If you want to exclusively rely on OIDC you need to disable the Secret-based authentication by setting this flag on the weave-gitops Deployment:
--auth-methods=oidc
You achieve this by supplying that flag through the additionalArgs
Helm chart value:
kind: HelmRelease
[...]
spec:
values:
additionalArgs:
- --auth-methods=oidc
This effectively disables the Secret-based authentication and should lead to your Pod to come up.
That definitely worked, but all that being necessary was unexpected and as you say slightly undocumented. I'm not sure how it should work, but I'd expect this additionalArg to get added automatically when I set adminUser.create: false
because that setting change sets the expectation that there is no admin user, or password authentication at all (for me at least.)
If there isn't any way to set that expectation on the behavior clearly for the user, then I guess I'd expect there to be one. On top of that, if password authentication is disabled I'd expect to see the username and password prompts disappear as well.
There is a workaround as you mentioned so at least I don't have to set an admin password anymore, and now we're underway with a refresh of the Dex + OIDC docs, so this all can be addressed by that effort! Thanks for this ๐
I agree with all that you're saying. Unfortunately it's a little more complicated and I'd love to strip the chart of some of its complexity soon. I'll keep that on my radar and create a follow-up issue for it this week.
When you set adminUser.create: false
then this does not necessarily mean "I want to completely disable user/password authentication" because it can also mean "I'm managing the cluster-user-auth Secret myself". The bool flag doesn't give you the chance to specify your expectation around that, though, so as it is implemented now the assumption is it means the latter.
I would like to avoid adding another boolean flag for disable username/password authentication and rather make the whole authentication configuration much more comprehensible.
You can get a glimpse of how complex the chart has gotten in this PR that merely makes the Secret name configurable. ๐
cc #3389
Starting from this doc: https://docs.gitops.weave.works/docs/guides/setting-up-dex/
The docs for Weave GitOps covering Dex and OIDC have some unexplained parts:
https://github.com/weaveworks/weave-gitops/blob/7855a6b1e30efb52f932bcae43b525cdc958c90b/website/versioned_docs/version-0.14.1/guides/setting-up-dex.md#L113
In particular I noted that there is a secret under
staticClients
and the wholestaticClients
section is never really explained.I went to Dex looking for guidance about what this is, and all I could come up with was: https://github.com/dexidp/helm-charts/blob/b5e75c51c0a09edcf90bb2ab81baed4ef5c6a7b9/charts/dex/values.yaml#L50-L52
So I kept digging and I found this example dev config, where there's a similar config with a secret embedded in it: https://github.com/dexidp/dex/blob/a857160fe91d1ab35ce37352b83e77fc922701cf/examples/config-dev.yaml#L106-L119
At least I'm assuming that's meant to be secret, but there's no specific instructions about it; should users set this to something specific to their installation, or is it meant to be randomized? The secret handling in general is kind of hands-off in these docs, I think with good reason, because otherwise we're going to find ourselves re-explaining how to properly handle a secret in every doc that uses secrets. But in general I think this is also another topic we'll need to address, (the docs shouldn't gloss over things without making a callout to the appropriate reference doc that explains the topic of "proper secret handling")
I'm just going to go ahead and follow the instructions, assuming that part shouldn't be modified, because I guess that Weave GitOps has a hardcoded secret in it for dex, but I'm sure that's not true, if there's a doc that covers this part and I missed it, please whack me with a cluebat ๐ ๐