kcp-dev / helm-charts

Helm chart repo for KCP
Apache License 2.0

`admin.kubeconfig` x509: certificate signed by unknown authority #29

Closed mjudeikis closed 5 months ago

mjudeikis commented 1 year ago

I'm deploying KCP using a slightly modified chart version (https://github.com/mjudeikis/helm-charts/tree/alliasing). Most of the changes work around current limitations of the chart, such as:

  1. Ability to alias DNS names
  2. Better control of ClusterIssuers for Let's Encrypt
  3. Ability to add an initContainer for KCP to work around DigitalOcean storage limitations
  4. Ability to override DNS names in self-signed certs

I will try to upstream the delta later, but it should not change the issue itself, as those parts are not modified.

The values.yaml file looks like below. It uses an Nginx passthrough Ingress (replicating the OpenShift router passthrough) and a DNS in-place resolver for Let's Encrypt.

externalHostname: "kcp.faros.sh"
kcp:
  hostAliases:
    enabled: true
    values:
    - hostnames:
      - kcp.faros.sh
      ip: 127.0.0.1
  volumeClassName: "do-block-storage-xfs"
  storagePermissionsInitContainer: true
  memoryLimit: 4Gi
  memoryRequest: 1Gi
  tokenAuth:
    enabled: true
    fileName: auth-token.csv
    config: |
        user-1-token,user-1,1111-1111-1111-1111,"team-1"
        xxxxxxxxxxxxx,admin,5555-5555-5555-5555,"system:kcp:admin"
kcpFrontProxy:
  openshiftRoute:
    enabled: false
  ingress:
    enabled: true
    annotations:
      kubernetes.io/ingress.class: "nginx"
      acme.cert-manager.io/http01-edit-in-place: "true"
      nginx.ingress.kubernetes.io/backend-protocol: HTTPS
      nginx.ingress.kubernetes.io/secure-backends: "true"
    secret: kcp-front-proxy-cert
  certificate:
    issuer: kcp-letsencrypt-prod
oidc:
  enabled: true
  issuerUrl: https://dex.faros.sh
  clientId: faros
  groupClaim: groups
  usernameClaim: email
  usernamePrefix: faros-sso-
  groupsPrefix: faros-sso-
certificates:
  dnsNames:
  - kcp
  - localhost
  - kcp.faros.sh
etcd:
  memoryLimit: 1Gi
  memoryRequest: 1Gi
  cpuRequest: 100m

It uses Let's Encrypt for the FrontProxy kcp.faros.sh certificates. The current certificate flow is: FrontProxy (Let's Encrypt) -> Shard (self-signed KCP-CA, managed by cert-manager)

From inside the KCP shard pod:

/data $ kubectl ws tree
Error: Get "https://kcp.faros.sh:443/clusters/root/apis/tenancy.kcp.io/v1alpha1/workspaces": x509: certificate signed by unknown authority

This fails because of the URL and certificate-authority-data mismatch:

/data $ cat $KUBECONFIG
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUREVENDQWZXZ0F3SUJBZ0lRRldnWm5EOFhmckdZaHNxTlZxRkNTREFOQmdrcWhraUc5dzBCQVFzRkFEQVIKTVE4d0RRWURWUVFERXdaclkzQXRZMkV3SGhjTk1qTXdNakl4TVRZd05UTXlXaGNOTWpRd01qSXhNVFl3TlRNeQpXakFBTUlJQklqQU5CZ2txaGtpRzl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUF1ZzB2cG1SZjRvN3VqYm9LCjlORVlYbzRQZ09WT1BSQ0tTK0ZqZHpOL09DWldYRWtZcGVRU3NVMGxmdTFjcFlOT0lZVmZaOERDQTJSdXRWWnQKQ1kyNEN3aTJvVjNDVGlvb3A4Y2NHYldzU
mdBbHo4c2pISkp6eE9ZL09LaEVIc2VLMXRTZHlCclo0eElSc0xnMAptMVNlbGdMMXNnRG94bmxOdklrSE9vNlAyTlhMS3lHNGhOQWpBdk51Wmhlb2dmY2FqWUhnc2tySlJYOU90VmJxCkdnNEx3cVNrOU9zaDczeG0rZ3c3Tlgxb1NtZHlGZXI4d2VHbUIyb3NHUkRSbHF4eEovSU1jL3YwR2czRHV5alMKVU10VnhQUUtabUExMml6SEZNamcwODJ6MUIzM1BXcG1sVDB2cFlEaEdUSXRWa1lYRGdpeXp2Vk1VTGlnUmhCMwpRaFBjQVFJREFRQUJvM0l3Y0RBVEJnTlZIU1VFRERBS0JnZ3JCZ0VGQlFjREFUQU1CZ05WSFJNQkFmOEVBakFBCk1COEdBMVVkSXdRWU1CYUFGRHFzbEVXZVFKVi9LM1RFQTdGOH
BzKzgvdVZsTUNvR0ExVWRFUUVCL3dRZ01CNkMKQTJ0amNJSUpiRzlqWVd4b2IzTjBnZ3hyWTNBdVptRnliM011YzJnd0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQgpBRExUMXMzYml3cUpJNlUzR2tEd2xIUVNTK3dFbk45WFc5NDNDbCsyMFVZemh4ME1lWHFaN3RobXd0cHdCalArCkVYMDJ4NFVQc0tPM0JaMnlPM2VmU2owQW1CT0Z1MDZOc3NsVFltQzlVaXpKZUZnL0dPbnZ3YVlWVVViV1NxVWYKVWhxUlROZlRQcEJTQis2cDI4czZCNDZrcUQwNnk0VEVpWFc4UllpU0FjWnV5WW9PVk53Z0wyL2EyZlJpM2VtSQpWMVZKVUFTU1l1NHYvQXBPWms2OUUvVnlVK29nQmxNT2p4QzFCZ0Z6c0xraUcyU1J
5d0FrcXR4TUJVN1BBdlFDCkVUMUIvVU5yZVVkeUNNL29XOHJ5aG1tTWFTYStEcjZlWm0vUVNGWjZJMDAwbXhxYlRhUTAwNnMzcDN5SFNIUnYKZWQ4YXdLYWYySmsxeXpQdGh4TlgycVk9Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0KLS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURFekNDQWZ1Z0F3SUJBZ0lRUWVXbElCRDBPQkNXQURaQklCK0oyekFOQmdrcWhraUc5dzBCQVFzRkFEQVYKTVJNd0VRWURWUVFERXdwclkzQXRjR3RwTFdOaE1CNFhEVEl6TURJeU1URXpORE14T1ZvWERUSXpNRFV5TWpFegpORE14T1Zvd0VURVBNQTBHQTFVRUF4TUdhMk53TFdOaE1JSUJJakFOQmdrcWhr
aUc5dzBCQVFFRkFBT0NBUThBCk1JSUJDZ0tDQVFFQThaU3ZVTWJEcU40QUt0VmZ6Y3pxOXppVmdjSGpaUitnRGViRXp3d0xZTVlIWndRMm5zek4KVnJPaThIUG50dDNLaTJwU0xNUlMxd2E3UjZzVjJHbjIxc2d1eDhSUVJLRHlhVGdQdEEwdUcwQmdSYVFxbFF5bgpVc3JrckthajRpWWRKajZ0elBaSTVEaGdGS2JCQ09jYzUzdDI2TTNFbmxsc3oyVE5sOWZuY2R3QWpmS2dTODBqCjBQMTFoVWhTZ2lVRnlodlNmdFdlb2MxNUIwOEt5VjJERlV6U2ZiNDkwZFdHOGpTaUZJZXk0b1BxNTc3OUdETGoKZE5OMlNRSHhpTjh4N0h2QThTS1hrcHlWN2Z6K3UybnpMcHVUQkJqYUNqY0ZWaFNWenZtaGRVTXZKT
GVPQmE0WQpvSFp3MEhOaW9sWHpnNUMvNFg2YTVWUWVJNHNPYmd1K2RRSURBUUFCbzJNd1lUQU9CZ05WSFE4QkFmOEVCQU1DCkFxUXdEd1lEVlIwVEFRSC9CQVV3QXdFQi96QWRCZ05WSFE0RUZnUVVPcXlVUlo1QWxYOHJkTVFEc1h5bXo3eisKNVdVd0h3WURWUjBqQkJnd0ZvQVV1TTJOWmRDb0l5WnRqVGlxUk9yem5tT3I2WkF3RFFZSktvWklodmNOQVFFTApCUUFEZ2dFQkFKMGZtMVY5YWcxZWpuRVFVOUhhSmNleTNIVzNRSFV2dXRKMGNkSmx5UEJ1aHJnZDUzdUlnTmI2Ck5EcXVLaXBvTDAzOVllcjZ4OG5sdTNzci9kbkNVQm9DcVo0RkF4R2diZlpVOUlHLzNUTWZFYmxlZkM3T0RUdU0KOUt0Um
NQZnAzY3QrRFp2UXpnczJFZCswZlpQMExwaVcxREFrN0ZMbkNqKzdmdW54Ly80Mmp6WlVzSCtlMTlwSQo4UnA3d3NEN3NYUERaRzU1eXdiZHIzVzZKS28zWTYxRTVRNXA1bDFueEJqWG1QR1Q0WXRsdlR5bDZqbHg5WjEyClVmaGU1RHptMmpUcXlKVDlDVkZSQzlxalRTcUVZNm9KVU9CeTdzSmxWWGtQVEZmSnhidFVxcGY1bDIrUGZLWDUKTXYwRWVBbFovMFJhNlJ1MHRVa1VXV0FTQUVOODVLUT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=                                                                                                                   
    server: https://kcp.faros.sh:443 

admin.kubeconfig is generated with the FrontProxy URL and the shard certificate, which breaks the certificate trust chain, as the KCP-CA is not in the FrontProxy's CA trust.

For this to work, the shard's admin.kubeconfig should either point to the shard itself instead of going via the FrontProxy, OR it should have Let's Encrypt (or whichever other CA authority is used in the FP deployment) in certificate-authority-data.
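The failure mode can be reproduced locally with nothing but openssl (hypothetical file names, not part of the chart): a certificate from one CA never verifies against an unrelated trust anchor, which is exactly what happens when the client trusts the kcp-ca but the endpoint serves a Let's Encrypt chain.

```shell
# Create two unrelated self-signed CAs, standing in for kcp-ca and
# the front-proxy CA (file names here are made up for the demo):
openssl req -x509 -newkey rsa:2048 -nodes -keyout kcp-ca.key -out kcp-ca.crt \
  -subj "/CN=kcp-ca" -days 1 2>/dev/null
openssl req -x509 -newkey rsa:2048 -nodes -keyout other-ca.key -out other-ca.crt \
  -subj "/CN=other-ca" -days 1 2>/dev/null

# Verifying a cert against the wrong trust anchor fails -- the same class
# of error as kubectl's "certificate signed by unknown authority":
openssl verify -CAfile kcp-ca.crt other-ca.crt || echo "verification failed as expected"
```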

Workaround:

  1. Create a second KCP Service object to map port 443 to 6443. The port must be 443 so the DNS override works.

    k get svc -n kcp kcp-internal -o yaml
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: kcp
      name: kcp-internal
      namespace: kcp
    spec:
      ports:
      - name: kcp
        port: 443
        protocol: TCP
        targetPort: 6443
      - name: virtual-workspaces
        port: 444
        protocol: TCP
        targetPort: 6444
      selector:
        app: kcp
      sessionAffinity: None
      type: ClusterIP
    status:
      loadBalancer: {}
  2. Create a DNS alias for the KCP pod in the pod spec:

    hostAliases:
    - hostnames:
      - kcp.faros.sh
      ip: 10.245.126.57 # the Service IP address

With this, traffic intended for the kcp.faros.sh frontProxy goes directly to the shard endpoint, and things just work. But this is a workaround for the fact that the admin.kubeconfig certificate and URL do not match.

hardys commented 1 year ago

I think the issue here is that admin.kubeconfig is generated for every shard by the KCP server - now that we're doing sharded deployments that won't make sense, because we'll end up with N admin.kubeconfig files, each with a shard-specific kcp-admin bearer token.

I suspect the long term fix here is to remove the kubeconfig generation from KCP completely, but as a stopgap we may be able to have that point to the shard BaseURL instead of ExternalAddress.

It's currently possible to manually create a kubeconfig with the front-proxy CA and credentials of your choice (e.g. client cert, token, OIDC, etc.), but we don't currently document how to do that. We could add that to the README in this repo; another alternative would be to create an additional kubeconfig via the chart (although in that case it should be optional and default to off, since in most non-dev cases the admin client won't run in the deployed KCP containers).
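A hand-written kubeconfig along those lines could look roughly like the sketch below (not produced by the chart; the token and context names are placeholders - with a publicly trusted CA like Let's Encrypt, certificate-authority-data can be omitted entirely, since the system roots then suffice):

```yaml
apiVersion: v1
kind: Config
clusters:
- name: kcp
  cluster:
    server: https://kcp.faros.sh:443/clusters/root
    # CA that signed the front-proxy certificate; omit if it is
    # publicly trusted (e.g. Let's Encrypt).
    # certificate-authority-data: <base64 of the front-proxy CA>
contexts:
- name: kcp
  context:
    cluster: kcp
    user: kcp-admin
current-context: kcp
users:
- name: kcp-admin
  user:
    # placeholder: e.g. the admin token from kcp.tokenAuth.config above
    token: xxxxxxxxxxxxx
```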

sttts commented 1 year ago

I suspect the long term fix here is to remove the kubeconfig generation from KCP completely, but as a stopgap we may be able to have that point to the shard BaseURL instead of ExternalAddress.

We won't delete that. Those kubeconfigs are meant as shard configs, and have system:master membership. They are just not supposed to be exposed beyond one shard. They are intentionally not working with the proxy either.

Creating an admin kubeconfig out of band is the right approach IMO. There is only so much you can do in-place from within a chart.

mjudeikis commented 1 year ago

They are just not supposed to be exposed beyond one shard

But they are currently pointing to the frontProxy. Should we start exposing shards as public URLs (shard1.eu.faros.sh, shard2.eu.faros.sh) and creating kubeconfigs for those? These would not be frontProxy routes but more like public endpoints for the global FrontProxy?

I don't see how "global scale" is possible with all shards being in the same network? Or is the idea that all shards are in the same private network of some sort?

sttts commented 1 year ago

There should not be a need to talk to shards directly, with the exception of the virtual workspace URL (part of Shard.spec; can be behind some load balancer). Hence, I would not encourage exposing shards directly.

Or is the idea that all shards are in the same private network of some sort?

Yes, that's the idea.

hardys commented 1 year ago

We won't delete that. Those kubeconfigs are meant as shard configs, and have system:master membership. They are just not supposed to be exposed beyond one shard. They are intentionally not working with the proxy either.

Ack, I was thinking that in a production environment you'd want to avoid non-expiring admin bearer tokens (even if they are only for internal use), and instead generate client certs which can be managed outside of KCP. I guess we can make the internal bearer-token stuff optional in future if needed.

Creating an admin kubeconfig out of band is the right approach IMO. There is only so much you can do in-place from within a chart.

I can work on a PR which shows how to do that in the README.

mjudeikis commented 5 months ago

This is more about certificate bundles now. We need to extend the cert bundles to be more usable so users can add their own certs (including Let's Encrypt) if they want to.

mjudeikis commented 5 months ago

Will raise a separate issue and close this one.