k0sproject / k0sctl

A bootstrapping and management tool for k0s clusters.

Control Plane High Availability leads to kube-router CrashLoopBackOff #767

Open gilly42 opened 4 days ago

gilly42 commented 4 days ago

Hello,

I have the following problem; maybe I'm doing something wrong.

When I set up an HAProxy (Debian 12 | 2.6.12-1+deb12u1) and a k0s cluster (Debian 12 | v1.30.4+k0s via k0sctl) as described in Control Plane High Availability and run it without adding the 'externalAddress', everything works great; I can access the control plane(s) via the HAProxy, all good.

As soon as I add an 'externalAddress' to the config, the newly created kube-routers start to go into a CrashLoopBackOff. According to the logs, they can no longer reach 10.96.0.1:443:

...Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0": dial tcp 10.96.0.1:443: connect: connection refused

As far as I understand the output of nft list ruleset correctly, adding the 'externalAddress' removes some rules that would otherwise handle the traffic to 10.96.0.1:443.
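
A quick way to compare the two states is to filter the ruleset for the service address on a worker node (a rough sketch, assuming the default 10.96.0.1 ClusterIP; on Debian 12 kube-proxy's iptables rules end up in the nft ruleset via the nft backend):

# show any nftables rules that reference the kubernetes service ClusterIP
nft list ruleset | grep -B2 -A2 '10.96.0.1'

# the same rules in iptables-save form; kube-proxy annotates them with the service name
iptables-save -t nat | grep 'default/kubernetes'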

I assume this is the reason why kube-router does not work anymore. My question: is this a bug, or am I doing something wrong? I don't see any further information in the manual.

Attachments: ruleset_with_external.txt, ruleset_without_external.txt

twz123 commented 2 days ago

Hi! Could you please share your k0sctl config file? Also, how did you set up the external load balancer for the externalAddress?

gilly42 commented 2 days ago

Sure, I tested different systems and environments. To debug, I just took some fresh VMs from Hetzner (which I throw away afterwards). On Hetzner I used Debian 12 systems; on other systems I used Ubuntu 22.04.

(The ruleset files above are from the Ubuntu system, but they are the same on the Debian systems.)

k0sctl.yml

apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: k0s-cluster
spec:
  hosts:
  - ssh:
      address: 188.245.164.154
      user: root
      port: 22
      keyPath: ~/.ssh/ctl
    role: controller
  - ssh:
      address: 116.202.26.102
      user: root
      port: 22
      keyPath: ~/.ssh/ctl
    role: controller
  - ssh:
      address: 91.107.193.216
      user: root
      port: 22
      keyPath: ~/.ssh/ctl
    role: controller
  - ssh:
      address: 188.245.165.94
      user: root
      port: 22
      keyPath: ~/.ssh/ctl
    role: worker
  - ssh:
      address: 88.198.150.114
      user: root
      port: 22
      keyPath: ~/.ssh/ctl
    role: worker
  k0s:
    config:
      spec:
        api:
          externalAddress: 188.245.165.100
          sans:
            - 188.245.165.100

HAProxy installation (on the Debian 12 systems; I also tested version 3 on the Ubuntu systems, same behavior):

apt-get update
apt-get install haproxy=2.6.*

haproxy config:

global
  log /dev/log  local0
  log /dev/log  local1 notice
  chroot /var/lib/haproxy
  stats socket /run/haproxy/admin.sock mode 660 level admin
  stats timeout 30s
  user haproxy
  group haproxy
  daemon

  # Default SSL material locations
  ca-base /etc/ssl/certs
  crt-base /etc/ssl/private

  # See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
  ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
  ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
  ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
  log   global
  mode  http
  option    httplog
  option    dontlognull
  timeout connect 5000
  timeout client  50000
  timeout server  50000
  errorfile 400 /etc/haproxy/errors/400.http
  errorfile 403 /etc/haproxy/errors/403.http
  errorfile 408 /etc/haproxy/errors/408.http
  errorfile 500 /etc/haproxy/errors/500.http
  errorfile 502 /etc/haproxy/errors/502.http
  errorfile 503 /etc/haproxy/errors/503.http
  errorfile 504 /etc/haproxy/errors/504.http

frontend kubeAPI
    bind :6443
    mode tcp
    default_backend kubeAPI_backend

frontend konnectivity
    bind :8132
    mode tcp
    default_backend konnectivity_backend

frontend controllerJoinAPI
    bind :9443
    mode tcp
    default_backend controllerJoinAPI_backend

backend kubeAPI_backend
    mode tcp
    server k0s-controller1 188.245.164.154:6443 check check-ssl verify none
    server k0s-controller2 116.202.26.102:6443 check check-ssl verify none
    server k0s-controller3 91.107.193.216:6443 check check-ssl verify none

backend konnectivity_backend
    mode tcp
    server k0s-controller1 188.245.164.154:8132 check check-ssl verify none
    server k0s-controller2 116.202.26.102:8132 check check-ssl verify none
    server k0s-controller3 91.107.193.216:8132 check check-ssl verify none

backend controllerJoinAPI_backend
    mode tcp
    server k0s-controller1 188.245.164.154:9443 check check-ssl verify none
    server k0s-controller2 116.202.26.102:9443 check check-ssl verify none
    server k0s-controller3 91.107.193.216:9443 check check-ssl verify none

listen stats
   bind *:9000
   mode http
   stats enable
   stats uri /

I also tried Calico as the CNI provider with more or less the same result. I think the change to the iptables rules is the problem; do you have any idea?

I'm playing with the idea of just adding the HA IP to the sans section, without the externalAddress. This seems to work, and etcd has the other CPs as members. Is there any advantage to setting the externalAddress?
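
In other words, roughly the k0s section of the config above with the externalAddress line dropped:

  k0s:
    config:
      spec:
        api:
          sans:
            - 188.245.165.100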

jnummelin commented 1 day ago

dial tcp 10.96.0.1:443: connect: connection refused

This pretty much implies that kube-proxy has not been able to properly create the rules for the kubernetes.default svc. So there are a couple of things to check.
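
For example, something along these lines (a rough sketch, not an exhaustive list; k0s kc is the k0s kubectl alias):

# does the kubernetes service still have its ClusterIP and a healthy endpoint?
k0s kc get svc,ep kubernetes -o yaml

# which API server address was kube-proxy configured to use?
k0s kc -n kube-system get cm kube-proxy -o yaml

# is kube-proxy itself logging errors? (assumes the DaemonSet is named kube-proxy)
k0s kc -n kube-system logs ds/kube-proxy --tail=100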

The major benefit of using externalAddress is that k0s configures all the needed system components (kubelets, kube-proxy, ...) to connect to that address. Thus you get failover capability. If it is not set, and you are using k0sctl, all those components would connect to the API via only one of the hosts, so if that host goes down, all those components go down with it.

jnummelin commented 1 day ago

One thing worth noting is that k0s also has a feature called NLLB that creates a local LB (of sorts) on all the workers and configures kubelet, kube-proxy, etc. to connect to the API via it --> failover capability without having to set up HAProxy. Of course this does NOT solve failover for external, i.e. user, access.
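
For reference, a minimal sketch of enabling it via the cluster config (field names as in the k0s nodeLocalLoadBalancing spec section; note that per the k0s docs it cannot be combined with spec.api.externalAddress):

spec:
  network:
    nodeLocalLoadBalancing:
      enabled: true
      type: EnvoyProxy   # EnvoyProxy is the default type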

gilly42 commented 1 day ago

The endpoint and the kubeconfig.conf both have the HA IP on port 6443; this looks right to me. On the first control plane I get this:

k0s kc get svc,ep kubernetes -o yaml

root@k0s-cp:~# k0s kc get svc,ep kubernetes -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    creationTimestamp: "2024-09-27T13:10:53Z"
    labels:
      component: apiserver
      provider: kubernetes
    name: kubernetes
    namespace: default
    resourceVersion: "195"
    uid: 27a2d483-4df7-4f99-9561-f6d8903430cb
  spec:
    clusterIP: 10.96.0.1
    clusterIPs:
    - 10.96.0.1
    internalTrafficPolicy: Cluster
    ipFamilies:
    - IPv4
    ipFamilyPolicy: SingleStack
    ports:
    - name: https
      port: 443
      protocol: TCP
      targetPort: 6443
    sessionAffinity: None
    type: ClusterIP
  status:
    loadBalancer: {}
- apiVersion: v1
  kind: Endpoints
  metadata:
    creationTimestamp: "2024-09-27T13:10:53Z"
    labels:
      endpointslice.kubernetes.io/skip-mirror: "true"
    name: kubernetes
    namespace: default
    resourceVersion: "12255"
    uid: 09b6c97f-860f-418d-9d56-43f4960269f2
  subsets:
  - addresses:
    - ip: 188.245.165.100
    ports:
    - name: https
      port: 6443
      protocol: TCP
kind: List
metadata:
  resourceVersion: ""

k0s kc -n kube-system get cm kube-proxy -o yaml

apiVersion: v1
data:
  config.conf: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    clientConnection:
      acceptContentTypes: ""
      burst: 0
      contentType: ""
      kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
      qps: 0
    clusterCIDR: 10.244.0.0/16
    configSyncPeriod: 0s
    featureGates:
    mode: "iptables"
    conntrack:
      maxPerCore: 0
      min: null
      tcpCloseWaitTimeout: null
      tcpEstablishedTimeout: null
    detectLocalMode: ""
    enableProfiling: false
    healthzBindAddress: ""
    hostnameOverride: ""
    iptables: {"syncPeriod":"0s","minSyncPeriod":"0s"}
    ipvs: {"syncPeriod":"0s","minSyncPeriod":"0s","tcpTimeout":"0s","tcpFinTimeout":"0s","udpTimeout":"0s"}
    kind: KubeProxyConfiguration
    metricsBindAddress: 0.0.0.0:10249
    nodePortAddresses: null
    oomScoreAdj: null
    portRange: ""
    showHiddenMetricsForVersion: ""
    udpIdleTimeout: 0s
    winkernel:
      enableDSR: false
      networkName: ""
      sourceVip: ""
  kubeconfig.conf: |-
    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        server: https://188.245.165.100:6443
      name: default
    contexts:
    - context:
        cluster: default
        namespace: default
        user: default
      name: default
    current-context: default
    users:
    - name: default
      user:
        tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
kind: ConfigMap
metadata:
  annotations:
    k0s.k0sproject.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"config.conf":"apiVersion: kubeproxy.config.k8s.io/v1alpha1\nbindAddress: 0.0.0.0\nclientConnection:\n  acceptContentTypes: \"\"\n  burst: 0\n  contentType: \"\"\n  kubeconfig: /var/lib/kube-proxy/kubeconfig.conf\n  qps: 0\nclusterCIDR: 10.244.0.0/16\nconfigSyncPeriod: 0s\nfeatureGates:\nmode: \"iptables\"\nconntrack:\n  maxPerCore: 0\n  min: null\n  tcpCloseWaitTimeout: null\n  tcpEstablishedTimeout: null\ndetectLocalMode: \"\"\nenableProfiling: false\nhealthzBindAddress: \"\"\nhostnameOverride: \"\"\niptables: {\"syncPeriod\":\"0s\",\"minSyncPeriod\":\"0s\"}\nipvs: {\"syncPeriod\":\"0s\",\"minSyncPeriod\":\"0s\",\"tcpTimeout\":\"0s\",\"tcpFinTimeout\":\"0s\",\"udpTimeout\":\"0s\"}\nkind: KubeProxyConfiguration\nmetricsBindAddress: 0.0.0.0:10249\nnodePortAddresses: null\noomScoreAdj: null\nportRange: \"\"\nshowHiddenMetricsForVersion: \"\"\nudpIdleTimeout: 0s\nwinkernel:\n  enableDSR: false\n  networkName: \"\"\n  sourceVip: \"\"","kubeconfig.conf":"apiVersion: v1\nkind: Config\nclusters:\n- cluster:\n    certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt\n    server: https://188.245.165.100:6443\n  name: default\ncontexts:\n- context:\n    cluster: default\n    namespace: default\n    user: default\n  name: default\ncurrent-context: default\nusers:\n- name: default\n  user:\n    tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token"},"kind":"ConfigMap","metadata":{"labels":{"app":"kube-proxy"},"name":"kube-proxy","namespace":"kube-system"}}
    k0s.k0sproject.io/stack-checksum: 37c3ad792a0c4d89ee234cb32f5cf3f1
  creationTimestamp: "2024-09-27T13:11:05Z"
  labels:
    app: kube-proxy
    k0s.k0sproject.io/stack: kubeproxy
  name: kube-proxy
  namespace: kube-system
  resourceVersion: "10986"
  uid: ab2e2a08-397f-4886-8567-a5b1dad8ecb4

If I add the externalAddress, kube-proxy doesn't get restarted; only the konnectivity-agent, coredns and kube-router restart, so the kube-proxy logs are still the ones from the setup without externalAddress.

gilly42 commented 1 day ago

The kube-proxy log on a fresh cluster with externalAddress looks like this:

2024-09-27T13:47:35.473478688Z stderr F I0927 13:47:35.473340       1 server.go:511] "Using lenient decoding as strict decoding failed" err="strict decoding error: unknown field \"udpIdleTimeout\""
2024-09-27T13:47:35.485984293Z stderr F I0927 13:47:35.485860       1 server.go:1062] "Successfully retrieved node IP(s)" IPs=["188.245.165.94"]
2024-09-27T13:47:35.499843183Z stderr F I0927 13:47:35.499701       1 server.go:659] "kube-proxy running in dual-stack mode" primary ipFamily="IPv4"
2024-09-27T13:47:35.499860856Z stderr F I0927 13:47:35.499742       1 server_linux.go:165] "Using iptables Proxier"
2024-09-27T13:47:35.501936595Z stderr F I0927 13:47:35.501818       1 server_linux.go:511] "Detect-local-mode set to ClusterCIDR, but no cluster CIDR for family" ipFamily="IPv6"
2024-09-27T13:47:35.501945972Z stderr F I0927 13:47:35.501833       1 server_linux.go:528] "Defaulting to no-op detect-local"
2024-09-27T13:47:35.50194971Z stderr F I0927 13:47:35.501848       1 proxier.go:243] "Setting route_localnet=1 to allow node-ports on localhost; to change this either disable iptables.localhostNodePorts (--iptables-localhost-nodeports) or set nodePortAddresses (--nodeport-addresses) to filter loopback addresses"
2024-09-27T13:47:35.502087148Z stderr F I0927 13:47:35.501972       1 server.go:872] "Version info" version="v1.30.4"
2024-09-27T13:47:35.502106103Z stderr F I0927 13:47:35.501986       1 server.go:874] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
2024-09-27T13:47:35.502806886Z stderr F I0927 13:47:35.502724       1 config.go:192] "Starting service config controller"
2024-09-27T13:47:35.502820672Z stderr F I0927 13:47:35.502740       1 shared_informer.go:313] Waiting for caches to sync for service config
2024-09-27T13:47:35.50282518Z stderr F I0927 13:47:35.502758       1 config.go:101] "Starting endpoint slice config controller"
2024-09-27T13:47:35.502829508Z stderr F I0927 13:47:35.502761       1 shared_informer.go:313] Waiting for caches to sync for endpoint slice config
2024-09-27T13:47:35.5031876Z stderr F I0927 13:47:35.503112       1 config.go:319] "Starting node config controller"
2024-09-27T13:47:35.50319832Z stderr F I0927 13:47:35.503122       1 shared_informer.go:313] Waiting for caches to sync for node config
2024-09-27T13:47:35.603810207Z stderr F I0927 13:47:35.603684       1 shared_informer.go:320] Caches are synced for node config
2024-09-27T13:47:35.603920253Z stderr F I0927 13:47:35.603742       1 shared_informer.go:320] Caches are synced for service config
2024-09-27T13:47:35.604064303Z stderr F I0927 13:47:35.603757       1 shared_informer.go:320] Caches are synced for endpoint slice config

It looks the same if I create a new cluster without externalAddress.