cloudfoundry / cf-crd-explorations

Apache License 2.0

Explore: check OIDC support in AKS and EKS #45

Closed gcapizzi closed 3 years ago

gcapizzi commented 3 years ago

Background

As we're betting on OpenID Connect (OIDC) as our authentication method, we want to make sure it is supported by all major vendors. This is not the case for Google Kubernetes Engine (GKE), but it should be for Microsoft Azure Kubernetes Service (AKS) and Amazon Elastic Kubernetes Service (EKS).

Questions

gcapizzi commented 3 years ago

Amazon Elastic Kubernetes Service (EKS)

OpenID Connect seems very well supported. Providers can easily be associated with existing clusters. Also, both IAM and Cognito can federate external OIDC providers.

Notably, it looks like Amazon does not sell an OIDC provider itself.

gcapizzi commented 3 years ago

Microsoft Azure Kubernetes Service (AKS)

It looks like the only integration available is with Azure AD (Active Directory), which supports federation.

Microsoft also distributes a kubectl plugin called kubelogin (not the same as this kubelogin), which implements a bunch of weird (I suppose Active Directory-specific) flows.

gcapizzi commented 3 years ago

Reopening as I think it would be valuable to actually try our oidc-login plugin (which now supports all flows) against properly configured EKS and AKS clusters.

gcapizzi commented 3 years ago

Blocking until we can get hold of an Amazon and an Azure account.

gcapizzi commented 3 years ago

Amazon EKS

I was able to create an EKS cluster, configure it to use Google as its OpenID Connect identity provider and log in using kubelogin.

I've been struggling to add nodes to the cluster in order to install the CF CRDs and test cf-shim, but I believe this is proof enough that EKS has working OIDC support.

gcapizzi commented 3 years ago

Microsoft AKS, part I

I was able to create an AKS cluster authenticated via Azure AD (AAD).

AAD seems to be the only authentication option for AKS clusters, and it should be OIDC compliant. Federation is available but only for SAML apparently, not for third-party OIDC providers.

I haven't been able to log in to Azure with a standard OIDC flow. The recommended way to log in is via the az CLI, with a command like this:

az aks get-credentials --resource-group cf-on-k8s-test --name cf-on-k8s-test

This does something I had never seen before: it adds a user to $KUBECONFIG, but without any tokens. Instead, it adds an auth-provider field that looks like this:

apiVersion: v1
[...]
users:
- name: clusterUser_cf-on-k8s-test_cf-on-k8s-test
  user:
    auth-provider:
      config:
        apiserver-id: XXX
        client-id: XXX
        config-mode: '1'

Then, once I try to run a kubectl command, I somehow get prompted with this:

❯ kubectl get nodes
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code XXX to authenticate.

I am not sure how this works, but after following the instructions auth-provider.config gets the access-token and refresh-token fields. Edit: this behaviour is literally baked into client-go and thus kubectl!

gcapizzi commented 3 years ago

Microsoft AKS, part II

I have managed to get ID tokens via both cf oidc-login and the OpenID Connect Playground! Bad news is they don't seem to work 😕

❯ kubectl --token "XXX" get nodes
error: You must be logged in to the server (Unauthorized)

Using the access-token produced by az aks get-credentials works fine:

❯ kubectl --token "YYY" get nodes
aks-nodepool1-21605491-vmss000000   Ready    agent   112m   v1.19.11
aks-nodepool1-21605491-vmss000001   Ready    agent   112m   v1.19.11
aks-nodepool1-21605491-vmss000002   Ready    agent   112m   v1.19.11

Decoding the two tokens, it looks like the one produced by az is much richer. The payload looks like this:

{
  "aud": "xxx",
  "iss": "https://sts.windows.net/XXX/",
  "iat": 123,
  "nbf": 123,
  "exp": 123,
  "acr": "1",
  "aio": "XXX",
  "altsecid": "XXX",
  "amr": [
    "pwd"
  ],
  "appid": "XXX",
  "appidacr": "0",
  "email": "gcapizzi@vmware.com",
  "family_name": "Capizzi",
  "given_name": "gcapizzi@vmware.com",
  "groups": [
    "XXX",
    "XXX"
  ],
  "idp": "https://sts.windows.net/XXX/",
  "ipaddr": "34.142.48.29",
  "name": "gcapizzi@vmware.com Capizzi",
  "oid": "XXX",
  "puid": "XXX",
  "rh": "XXX",
  "scp": "user.read",
  "sub": "XXX",
  "tid": "XXX",
  "unique_name": "gcapizzi@vmware.com",
  "uti": "XXX",
  "ver": "1.0",
  "wids": [
    "XXX",
    "XXX"
  ]
}

This is what the cf oidc-login token payload looks like instead:

{
  "aud": "XXX",
  "iss": "https://sts.windows.net/XXX/",
  "iat": 123,
  "nbf": 123,
  "exp": 123,
  "amr": [
    "pwd"
  ],
  "email": "gcapizzi@vmware.com",
  "family_name": "Capizzi",
  "given_name": "gcapizzi@vmware.com",
  "idp": "https://sts.windows.net/XXX/",
  "ipaddr": "151.55.16.85",
  "name": "gcapizzi@vmware.com Capizzi",
  "oid": "XXX",
  "rh": "XXX",
  "sub": "XXX",
  "tid": "XXX",
  "unique_name": "gcapizzi@vmware.com",
  "uti": "XXX",
  "ver": "1.0"
}

The following fields are missing in our token: acr, aio, altsecid, appid, appidacr, groups, puid, scp, wids.

The values of fields we have in common look relatively similar. Maybe we just need to ask for more claims? None of the above are explicitly mentioned in the spec, except for acr.
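The comparison above can be reproduced with a few lines of code. What follows is only a minimal sketch: it decodes the payload segment of two JWTs without verifying signatures and lists the claim names that the first one has and the second one lacks. The helper names and the throwaway unsigned tokens are made up for illustration; real tokens from az and cf oidc-login would be passed in instead.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT. The signature is NOT verified."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def missing_claims(reference: dict, candidate: dict) -> list:
    """Claim names present in `reference` but absent from `candidate`."""
    return sorted(set(reference) - set(candidate))


def make_unsigned_jwt(claims: dict) -> str:
    """Hypothetical helper: build an unsigned demo token, for this example only."""
    def b64(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{b64({'alg': 'none'})}.{b64(claims)}."


# Stand-ins for the two real tokens (placeholder values, not real data).
az_token = make_unsigned_jwt({"aud": "XXX", "acr": "1", "groups": ["XXX"], "sub": "XXX"})
our_token = make_unsigned_jwt({"aud": "XXX", "sub": "XXX"})

print(missing_claims(decode_jwt_payload(az_token), decode_jwt_payload(our_token)))
```

Note that this only compares claim names. Two tokens can carry the same claims and still differ in values that matter to the API server, such as aud.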

gcapizzi commented 3 years ago

Microsoft AKS, part III

I have tried adding the following value for the claims parameter to the authentication URL:

{
  "id_token": {
    "acr": null,
    "aio": null,
    "altsecid": null,
    "appid": null,
    "appidacr": null,
    "groups": null,
    "puid": null,
    "scp": null,
    "wids": null
  }
}

The authentication URL looks like this:

https://login.windows.net/XXX/oauth2/authorize?claims=%7B%22id_token%22%3A%7B%22acr%22%3Anull%2C%22aio%22%3Anull%2C%22altsecid%22%3Anull%2C%22appid%22%3Anull%2C%22appidacr%22%3Anull%2C%22groups%22%3Anull%2C%22puid%22%3Anull%2C%22scp%22%3Anull%2C%22wids%22%3Anull%7D%7D&client_id=XXX&redirect_uri=http%3A%2F%2Flocalhost%3A5555%2Fcallback&response_type=code&scope=openid+profile+email&state=I+wish+to+wash+my+irish+wristwatch
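For reference, here is a rough sketch of how a URL like the one above can be assembled: the claims object is serialised as compact JSON and then URL-encoded together with the other query parameters. This is not necessarily how cf oidc-login builds its request; the function name and argument layout are assumptions, and the XXX values are the same placeholders used above.

```python
import json
from urllib.parse import urlencode

# The claims that were missing from our ID token.
REQUESTED_CLAIMS = [
    "acr", "aio", "altsecid", "appid", "appidacr",
    "groups", "puid", "scp", "wids",
]


def build_authorization_url(endpoint, client_id, redirect_uri, state, claim_names):
    """Build an authorization-code request carrying an OIDC `claims` parameter."""
    # A null value asks for the claim without any further constraints.
    claims = {"id_token": {name: None for name in claim_names}}
    params = {
        # Compact separators keep the encoded parameter as short as possible.
        "claims": json.dumps(claims, separators=(",", ":")),
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": "openid profile email",
        "state": state,
    }
    # urlencode percent-encodes each value and turns spaces into '+'.
    return endpoint + "?" + urlencode(params)


url = build_authorization_url(
    "https://login.windows.net/XXX/oauth2/authorize",
    "XXX",
    "http://localhost:5555/callback",
    "I wish to wash my irish wristwatch",
    REQUESTED_CLAIMS,
)
print(url)
```

Whether the identity provider honours the claims parameter is up to the provider, so a well-formed request is no guarantee that the extra claims show up in the token.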

Still no luck. 😕 This alone sounds like a good reason to support a way for the cf CLI to borrow its tokens from the $KUBECONFIG. Something like:

cf get-kube-tokens --user oidc

which goes and grabs the tokens under users[].user.auth-provider.config.(access|refresh)-token. It could also make sense to support --(auth|refresh)-token options a bit like kubectl does:

cf oidc-login --auth-token="XXX" --refresh-token="YYY"
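To make the token lookup behind the proposed cf get-kube-tokens a bit more concrete, here is a minimal sketch. It assumes the kubeconfig has been exported as JSON (for example with kubectl config view --raw -o json) so that no YAML parser is needed, and that the tokens sit under users[].user.auth-provider.config as described above. The function name and the sample data are hypothetical.

```python
import json


def find_auth_provider_tokens(kubeconfig: dict, user_name: str) -> dict:
    """Return the access and refresh tokens stored for `user_name`, if any."""
    for entry in kubeconfig.get("users") or []:
        if entry.get("name") != user_name:
            continue
        user = entry.get("user") or {}
        config = (user.get("auth-provider") or {}).get("config") or {}
        # Either value is None until the user has completed the login flow.
        return {
            "access-token": config.get("access-token"),
            "refresh-token": config.get("refresh-token"),
        }
    raise KeyError(f"no user named {user_name!r} in kubeconfig")


# Sample data shaped like the AKS entry above, after the device login.
sample = json.loads("""
{
  "apiVersion": "v1",
  "users": [
    {
      "name": "clusterUser_cf-on-k8s-test_cf-on-k8s-test",
      "user": {
        "auth-provider": {
          "config": {
            "apiserver-id": "XXX",
            "client-id": "XXX",
            "config-mode": "1",
            "access-token": "YYY",
            "refresh-token": "ZZZ"
          }
        }
      }
    }
  ]
}
""")

tokens = find_auth_provider_tokens(sample, "clusterUser_cf-on-k8s-test_cf-on-k8s-test")
print(tokens)
```

Returning None for a missing token, rather than failing, lets the caller tell "user not found" apart from "user found but not logged in yet".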

Let's not forget that az aks get-credentials doesn't directly inject tokens into $KUBECONFIG though! The flow would have to look something like this:

az aks get-credentials [...]
kubectl get nodes # actually gets the user through the authentication flow
cf get-kube-tokens

gcapizzi commented 3 years ago

Microsoft AKS, part IV

I believe the problem with the tokens we are generating is due to the application I have created not being tied to the cluster I have created.

It looks like, by default, AKS clusters get managed AAD. With managed AAD, Azure automatically creates and manages a client application associated with the cluster and invisible to the end user. The client ID of this application is not visible, so there is no way to construct a correct authorization request.

I tried to use these instructions to create a separate identity to associate with my cluster, but I wasn't able to configure basic settings like the redirect URL, so I must be missing something. Also, the clientSecretUrl would return an error when visited.

This document suggests that Managed identities are a kind of Service principal that "eliminate the need for developers to manage credentials" and "can be granted access and permissions, but cannot be updated or modified directly". Another type of Service principal is Applications, which I've played with already and which provide all the familiar settings, including redirect URLs. I think my last hope is trying to create a cluster that uses an Application as its Service principal instead of a Managed identity, following this document.

gcapizzi commented 3 years ago

Microsoft AKS, part V

I created a cluster associated with an Application, but still no luck. Even if this had worked, it definitely wouldn't have been the default setup.

This leads me to the conclusion that, while we might rely on all our clusters being authenticated via OIDC (or maybe even OAuth 2.0?), we can't rely on all Identity Providers to support standard authentication flows. AKS is one of these cases: authentication happens via OIDC, the tokens are OIDC tokens, but the way users get their token has nothing to do with the standard flows.

One thing that all these providers do have in common is that they produce a valid $KUBECONFIG. The range of configurations is wide though: some directly put the tokens there, some put client certificates, some use client-go credential plugins, and some use something else entirely!
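As a starting point for that kind of survey, here is a small sketch that labels a single kubeconfig user entry by the credential fields it contains. The field names checked are the ones known from client-go's kubeconfig user type (token, tokenFile, client-certificate, client-certificate-data, exec, auth-provider, username, password). The list may well be incomplete, and the labels and function name are invented for this example.

```python
def classify_user_auth(user: dict) -> list:
    """Label the authentication mechanisms configured in one `user` block."""
    methods = []
    if user.get("token") or user.get("tokenFile"):
        methods.append("bearer-token")
    if user.get("client-certificate") or user.get("client-certificate-data"):
        methods.append("client-certificate")
    if user.get("exec"):
        # client-go credential plugin: an external command prints the credentials.
        methods.append("exec-plugin")
    if user.get("auth-provider"):
        # Provider-specific logic built into client-go, as seen with AKS above.
        methods.append("auth-provider")
    if user.get("username") and user.get("password"):
        methods.append("basic-auth")
    # Anything unrecognised is reported explicitly rather than silently ignored.
    return methods or ["unknown"]


# Hypothetical user blocks, one per style mentioned above.
examples = {
    "static token": {"token": "XXX"},
    "client cert": {"client-certificate-data": "XXX", "client-key-data": "XXX"},
    "credential plugin": {"exec": {"command": "some-plugin"}},
    "aks": {"auth-provider": {"config": {"client-id": "XXX"}}},
    "something else": {},
}

for label, user in examples.items():
    print(label, classify_user_auth(user))
```

A user block can carry more than one kind of credential, which is why the function returns a list rather than a single label.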

We should explore all these configurations and evaluate how feasible it is to leverage $KUBECONFIG. I couldn't find any documentation about all the authentication configuration options available, which makes this even more of a challenge.