siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0

Konnectivity server & agent #9395

Open maxpain opened 1 month ago

maxpain commented 1 month ago

Hello.

It would be cool to have built-in support for the Konnectivity agent and server in Talos Linux. Use case: I deployed my cluster on Hetzner, but my control plane nodes are on Cloud VMs, and the worker nodes are on Bare Metal. I installed Cilium with native routing without encapsulation since I have an L2 network between bare metal nodes; however, the control plane nodes are in a different network. I don't want to install Cilium or any CNI on the control plane nodes, but kube-apiserver requires access to the cluster network (admission webhooks, etc.), and I have to use something like "Konnectivity".

[image attachment]

Or is there some elegant way to implement this in Talos Linux without Konnectivity?
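
(For context on why the API server needs cluster network access: admission webhooks are usually registered against an in-cluster Service, and the API server has to dial that Service/pod IP directly. A minimal illustrative registration, with all names and values hypothetical:)

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-webhook            # hypothetical name
webhooks:
  - name: validate.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: None
    clientConfig:
      service:
        namespace: example             # hypothetical namespace
        name: example-webhook-svc      # resolves to a ClusterIP the apiserver must reach
        port: 443
        path: /validate
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]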

maxpain commented 1 month ago

This could also potentially simplify "Kubernetes as a Service" implementations, in which control plane nodes are completely hidden and managed (like GKE).

maxpain commented 1 month ago

I was also wondering if KubeSpan could be helpful here. Maybe we can just route the service CIDR from the control plane to the worker nodes via WireGuard?

maxpain commented 1 month ago

Related

https://github.com/cilium/cilium/issues/22898 https://github.com/cilium/cilium/issues/32810

smira commented 1 month ago

I was also wondering if KubeSpan could be helpful here. Maybe we can just route the service CIDR from the control plane to the worker nodes via WireGuard?

KubeSpan today works on top of node IPs, so the CNI does the pod -> node translation.

KubeSpan has a setting to route pod IPs, but it is not enabled by default.

KubeSpan never worked for Service IPs, as that requires some form of kube-proxy which would translate service IPs to pod IPs.
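
For reference, the opt-in setting mentioned above lives in the KubeSpan section of the machine config. A minimal sketch, assuming the current field name from the Talos docs (and note this covers pod IPs only, not Service IPs):

machine:
  network:
    kubespan:
      enabled: true
      # Off by default: advertise pod CIDRs over the WireGuard mesh in
      # addition to node IPs.
      advertiseKubernetesNetworks: true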

maxpain commented 1 month ago

KubeSpan never worked for Service IPs, as that requires some form of kube-proxy which would translate service IPs to pod IPs.

It should be done on the worker node side, since the CNI is installed there. Theoretically, we just need to route the service CIDR from the control plane nodes to the worker nodes somehow.
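
One way to express that, as a sketch only: a static route on the control plane nodes pointing the service CIDR at a worker node whose CNI/kube-proxy can translate service IPs to pod IPs. The interface name, CIDR, and gateway IP below are assumptions for illustration:

machine:
  network:
    interfaces:
      - interface: eth0              # assumption: primary interface
        routes:
          - network: 10.96.0.0/12    # assumption: cluster service CIDR
            gateway: 10.200.0.10     # assumption: a reachable worker node IP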

rothgar commented 1 month ago

Does Konnectivity route the service IPs properly? In a hosted platform (EKS, GKE) I'm not sure if admission webhooks run on the control plane nodes. I think they usually run at some other endpoint (e.g. a VPC endpoint) that all of the nodes in the network can reach.

Would SideroLink with Omni solve this problem? I don't think so, but I don't know the full details of how k8s services interact with SideroLink.

gecube commented 1 month ago

@rothgar

Does Konnectivity route the service IPs properly?

Generally speaking (in a very similar config), yes, it routes correctly.

maxpain commented 1 month ago

I personally replaced Konnectivity with Tailscale, which is better, but if someone reads this and still needs Konnectivity, I managed to deploy it on Talos:

Control plane machine config patch; we run konnectivity-server as a static pod on the control plane nodes:

cluster:
  proxy:
    disabled: true

  apiServer:
    extraArgs:
      egress-selector-config-file: /var/lib/egress-selector.yaml

    extraVolumes:
      - hostPath: /var/lib/egress-selector.yaml
        mountPath: /var/lib/egress-selector.yaml
        readonly: true
      - hostPath: /etc/kubernetes/konnectivity-server
        mountPath: /etc/kubernetes/konnectivity-server
        readonly: false

machine:
  files:
    - content: |
        apiVersion: apiserver.k8s.io/v1beta1
        kind: EgressSelectorConfiguration
        egressSelections:
        - name: cluster
          connection:
            proxyProtocol: GRPC
            transport:
              uds:
                udsName: /etc/kubernetes/konnectivity-server/konnectivity-server.socket
      path: /var/lib/egress-selector.yaml
      permissions: 0o444
      op: create

  pods:
    - apiVersion: v1
      kind: Pod
      metadata:
        name: konnectivity-server
      spec:
        priorityClassName: system-cluster-critical
        securityContext:
          runAsGroup: 65534
          runAsNonRoot: false
          runAsUser: 0
          fsGroup: 0
        hostNetwork: true
        initContainers:
          - name: init
            image: busybox
            command:
              - "sh"
              - "-c"
              - "touch /etc/kubernetes/konnectivity-server/konnectivity-server.socket && chmod -R 777 /etc/kubernetes/konnectivity-server"
            volumeMounts:
              - name: konnectivity-uds
                mountPath: /etc/kubernetes/konnectivity-server
        containers:
          - name: konnectivity-server-container
            image: registry.k8s.io/kas-network-proxy/proxy-server:v0.30.3
            command: ["/proxy-server"]
            args: [
                "--logtostderr=true",
                "--uds-name=/etc/kubernetes/konnectivity-server/konnectivity-server.socket",
                # "--delete-existing-uds-file",
                "--cluster-cert=/system/secrets/kubernetes/kube-apiserver/apiserver.crt",
                "--cluster-key=/system/secrets/kubernetes/kube-apiserver/apiserver.key",
                "--mode=grpc",
                "--server-port=0",
                "--agent-port=8132",
                "--admin-port=8133",
                "--health-port=8134",
                "--agent-namespace=kube-system",
                "--agent-service-account=konnectivity-agent",
                "--kubeconfig=/system/secrets/kubernetes/kube-scheduler/kubeconfig",
                "--authentication-audience=system:konnectivity-server",
              ]
            # There is no CNI on the control plane nodes, so we point the server
            # at the local API server endpoint (KubePrism on localhost:7445).
            env:
              - name: KUBERNETES_SERVICE_HOST
                value: localhost
              - name: KUBERNETES_SERVICE_PORT
                value: "7445"
            livenessProbe:
              httpGet:
                scheme: HTTP
                host: 127.0.0.1
                port: 8134
                path: /healthz
              initialDelaySeconds: 30
              timeoutSeconds: 60
            ports:
              - name: agentport
                containerPort: 8132
                hostPort: 8132
              - name: adminport
                containerPort: 8133
                hostPort: 8133
              - name: healthport
                containerPort: 8134
                hostPort: 8134
            volumeMounts:
              - name: secrets
                mountPath: /system/secrets/kubernetes
                readOnly: true
              - name: konnectivity-uds
                mountPath: /etc/kubernetes/konnectivity-server
                readOnly: false
        volumes:
          - name: secrets
            hostPath:
              path: /system/secrets/kubernetes
          - name: konnectivity-uds
            hostPath:
              path: /etc/kubernetes/konnectivity-server
              type: DirectoryOrCreate
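
(To try this, the patch above can be applied with something like talosctl patch machineconfig --patch @konnectivity-server-patch.yaml --nodes <controlplane-ip>; the file name is illustrative, and the exact workflow depends on how you manage your machine configs.)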

konnectivity-agent DaemonSet to be deployed on the worker nodes:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    k8s-app: konnectivity-agent
  namespace: kube-system
  name: konnectivity-agent
spec:
  selector:
    matchLabels:
      k8s-app: konnectivity-agent
  template:
    metadata:
      labels:
        k8s-app: konnectivity-agent
    spec:
      hostNetwork: true
      priorityClassName: system-cluster-critical
      tolerations:
        - key: "CriticalAddonsOnly"
          operator: "Exists"
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node-role.kubernetes.io/control-plane
                    operator: DoesNotExist
      containers:
        - image: registry.k8s.io/kas-network-proxy/proxy-agent:v0.30.3
          name: konnectivity-agent
          command: ["/proxy-agent"]
          args: [
              "--logtostderr=true",
              "--ca-cert=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt",
              "--proxy-server-host=10.200.0.5", # I used Hetzner Cloud Load Balancer, which proxies requests to the Konnectivity server
              "--proxy-server-port=8132",
              "--admin-server-port=8133",
              "--health-server-port=8134",
              "--service-account-token-path=/var/run/secrets/tokens/konnectivity-agent-token",
            ]
          volumeMounts:
            - mountPath: /var/run/secrets/tokens
              name: konnectivity-agent-token
          livenessProbe:
            httpGet:
              port: 8134
              path: /healthz
            initialDelaySeconds: 15
            timeoutSeconds: 15
      serviceAccountName: konnectivity-agent
      volumes:
        - name: konnectivity-agent-token
          projected:
            sources:
              - serviceAccountToken:
                  path: konnectivity-agent-token
                  audience: system:konnectivity-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:konnectivity-server
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: system:konnectivity-server
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: konnectivity-agent
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
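
(The agent manifests above can be applied with kubectl apply against the workload cluster. A quick sanity check is that API-server-initiated calls such as kubectl logs and kubectl exec still work, since with the egress selector in place those requests are tunneled through konnectivity-server to the agents.)
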
rothgar commented 1 month ago

Thanks for sharing. Was there a reason you switched to tailscale instead of using Konnectivity? Would you mind adding these examples to our documentation in case someone wants to use it in the future?

maxpain commented 4 weeks ago

Was there a reason you switched to tailscale instead of using Konnectivity?

Yes. In my case, the control plane nodes (Hetzner Cloud) could still reach the kubelets on the worker nodes (Hetzner Bare Metal) via a Hetzner vSwitch. The problem is that these are two different L2 subnets with a gateway between them, so the control plane nodes couldn't reach the pod and service networks on the vSwitch side without hitting that gateway (I use Cilium Native Routing, so the network has to be flat).

So I needed something to help the control plane nodes reach only the pod/service subnets, not the kubelets.

Tailscale is the perfect solution because it just routes those subnets over a P2P mesh, without the hassle with manifests and certificates that Konnectivity brings (this is why I used the kube-scheduler kubeconfig and the kube-apiserver cert/key as a workaround). You also need a load balancer in front of the Konnectivity server.

Also, I sometimes got "Agent is not available" errors from the Konnectivity server when trying to view pod logs.
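
For completeness, a rough sketch of the Tailscale approach described above, assuming the Tailscale system extension for Talos configured via ExtensionServiceConfig documents; the TS_* variable names follow Tailscale's containerboot conventions and, like the CIDRs and auth key, are assumptions to check against the extension docs. Workers advertise the pod/service subnets, control plane nodes accept the routes (the advertised routes also need to be approved in the Tailscale admin console):

# Worker nodes: advertise pod and service CIDRs over the tailnet.
apiVersion: v1alpha1
kind: ExtensionServiceConfig
name: tailscale
environment:
  - TS_AUTHKEY=tskey-auth-...                # assumption: pre-generated auth key
  - TS_ROUTES=10.244.0.0/16,10.96.0.0/12     # assumption: pod and service CIDRs
---
# Control plane nodes: accept the routes advertised by the workers.
apiVersion: v1alpha1
kind: ExtensionServiceConfig
name: tailscale
environment:
  - TS_AUTHKEY=tskey-auth-...
  - TS_EXTRA_ARGS=--accept-routes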

maxpain commented 4 weeks ago

Would you mind adding these examples to our documentation in case someone wants to use it in the future?

What can we do about those workarounds with the kubeconfig, certs, and keys?

rothgar commented 3 weeks ago

Until we have a way to natively handle those secrets in Talos, it probably doesn't make sense to write docs for it. The manual way of writing secrets to disk and mounting them is not something we want to recommend.