siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev

Possible talosctl upgrade-k8s issues with wireguard interfaces present #6816

Closed · junaru closed this issue 1 year ago

junaru commented 1 year ago

Hello,

Recent versions of talosctl upgrade-k8s may be choosing the wrong IPs for nodes if custom WireGuard interfaces are present.

I've been upgrading the cluster from v0.14 to the latest stable release via the upgrade paths detailed in https://www.talos.dev/v1.3/talos-guides/upgrading-talos/#supported-upgrade-paths

The cluster consists of four nodes, two control plane and two worker nodes, with KubeSpan (probably irrelevant) and a network configuration where they get a public IPv4 address via DHCP on eth0 and have a statically defined WireGuard interface named storage0 for communication with the storage network.

Network config of one of the control plane nodes:

       network:
            hostname: master # Used to statically set the hostname for the machine.
            interfaces:
                - interface: eth0 # The interface name.
                  addresses:
                    - 46.166.167.93/28
                  routes:
                    - network: 0.0.0.0/0 # The route's network (destination).
                      gateway: 46.166.167.94 # The route's gateway (if empty, creates link scope route).
                      metric: 1024 # The optional metric for the route.
                  mtu: 0 # The interface's MTU.
                - interface: storage0 # The interface name.
                  addresses:
                    - 10.255.0.3/24
                  mtu: 1500 # The interface's MTU.
                  wireguard:
                    privateKey: <redacted> # Specifies a private key configuration (base64 encoded).
                    listenPort: 51111 # Specifies a device's listening port.
                    peers:
                        - publicKey: <redacted> # Specifies the public key of this peer.
                          endpoint: 46.166.167.84:51111 # Specifies the endpoint of this peer entry.
                          persistentKeepaliveInterval: 10s # Specifies the persistent keepalive interval for this peer.
                          allowedIPs:
                            - 10.255.0.100/32
                        - publicKey: <redacted> # Specifies the public key of this peer.
                          endpoint: 212.117.19.200:51111 # Specifies the endpoint of this peer entry.
                          persistentKeepaliveInterval: 10s # Specifies the persistent keepalive interval for this peer.
                          allowedIPs:
                            - 10.255.0.101/32
            nameservers:
                - 9.9.9.9
                - 1.1.1.1
            kubespan:
                enabled: true # Enable the KubeSpan feature.

The upgrade path taken (retyped, with talosctl version up front):

1.0.0 talosctl-darwin-arm64 --talosconfig ~/talosconfig -n <nodes> upgrade --image ghcr.io/siderolabs/installer:v1.0.0 -f
1.0.0 talosctl-darwin-arm64 --talosconfig ~/talosconfig -n <nodes> upgrade-k8s --to 1.23.5
1.0.6 talosctl-darwin-arm64 --talosconfig ~/talosconfig -n <nodes> upgrade --image ghcr.io/siderolabs/installer:v1.0.6 -f
1.0.6 talosctl-darwin-arm64 --talosconfig ~/talosconfig -n <nodes> upgrade-k8s --to 1.23.6
1.1.2 talosctl-darwin-arm64 --talosconfig ~/talosconfig -n <nodes> upgrade --image ghcr.io/siderolabs/installer:v1.1.2 -f
1.1.2 talosctl-darwin-arm64 --talosconfig ~/talosconfig -n <nodes> upgrade-k8s --to 1.24.3
1.1.2 talosctl-darwin-arm64 --talosconfig ~/talosconfig -n <nodes> upgrade --image ghcr.io/siderolabs/installer:v1.1.2 -f
1.1.2 talosctl-darwin-arm64 --talosconfig ~/talosconfig -n <nodes> upgrade-k8s --to 1.24.3
1.2.8 talosctl-darwin-arm64 --talosconfig ~/talosconfig -n <nodes> upgrade --image ghcr.io/siderolabs/installer:v1.2.8 -f

And finally the call that broke k8s:

1.2.8 ./talosctl-darwin-arm64 --talosconfig ~/talosconfig -n master.debeselis.dom0.lt upgrade-k8s --to 1.25.5

automatically detected the lowest Kubernetes version 1.24.3
checking for resource APIs to be deprecated in version 1.25.5
WARNING: found resources which are going to be deprecated/migrated in the version 1.25.5
RESOURCE                                  COUNT
endpointslices.v1beta1.discovery.k8s.io   12
events.v1beta1.events.k8s.io              536
podsecuritypolicies.v1beta1.policy        1

discovered controlplane nodes ["**10.255.0.3**" "**10.255.0.4**"]
discovered worker nodes ["**10.255.0.1**" "**10.255.0.2**"]
updating "kube-apiserver" to version "1.25.5"
 > "**10.255.0.3**": starting update
 > update kube-apiserver: v1.24.3 -> 1.25.5
 > "**10.255.0.3**": machine configuration patched
 > "**10.255.0.3**": waiting for kube-apiserver pod update
failed updating service "kube-apiserver": error updating node "**10.255.0.3**": 4 error(s) occurred:
    config version mismatch: got "1", expected "2"
    Get "https://cp.debeselis.dom0.lt:6443/api/v1/namespaces/kube-system/pods?labelSelector=k8s-app+%3D+kube-apiserver": dial tcp 87.247.71.192:6443: connect: connection refused
    Get "https://cp.debeselis.dom0.lt:6443/api/v1/namespaces/kube-system/pods?labelSelector=k8s-app+%3D+kube-apiserver": dial tcp 46.166.167.93:6443: connect: connection refused
    timeout

Notice the IPs in the above output: they are all WireGuard IPs bound to the storage0 interfaces on the nodes. My guess is that talosctl misidentified them as the public ones and pushed them somewhere deeper into the stack.
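
In hindsight, this is how I could have cross-checked what each node advertises before running the upgrade (the kubectl call is the standard one; the Talos NodeAddress resource name is my assumption and may differ between releases):

# INTERNAL-IP column shows what each kubelet registered with the API server
kubectl get nodes -o wide
# Addresses Talos itself tracks for a given node
talosctl --talosconfig ~/talosconfig -n master.debeselis.dom0.lt get nodeaddresses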

After this I immediately lost access to the control plane. All nodes are still accessible via talosctl (I can query the machine config, reboot, etc.), but anything involving kubectl fails.
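
With kubectl gone, the only inspection I can still do is through talosctl directly; a rough sketch of that (the container ID below is a placeholder to be filled in from the listing, not a real name):

# List CRI containers on a control plane node, including the static control plane pods
talosctl --talosconfig ~/talosconfig -n master.debeselis.dom0.lt containers -k
# Then tail the kube-apiserver logs using a container ID reported above
talosctl --talosconfig ~/talosconfig -n master.debeselis.dom0.lt logs -k <kube-apiserver-container-id>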

Just before the upgrade-k8s --to 1.25.5 call, the cluster was functioning normally:

# kubectl get nodes
NAME       STATUS                        ROLES           AGE    VERSION
master     Ready                         control-plane   411d   v1.24.3
mistress   Ready                         control-plane   411d   v1.25.5
node1      Ready                         <none>          411d   v1.24.3
node2      NotReady,SchedulingDisabled   <none>          410d   v1.24.3

A dry run also completed successfully just before the upgrade, and the IPs it reported were the WireGuard ones from storage0. In previous upgrades, public IPv4 addresses were present in output like the one below.

# ./talosctl-darwin-arm64 --talosconfig ~/talosconfig -n master.debeselis.dom0.lt upgrade-k8s --to 1.25.5 --dry-run
automatically detected the lowest Kubernetes version 1.24.3
checking for resource APIs to be deprecated in version 1.25.5
WARNING: found resources which are going to be deprecated/migrated in the version 1.25.5
RESOURCE                                  COUNT
endpointslices.v1beta1.discovery.k8s.io   12
events.v1beta1.events.k8s.io              528
podsecuritypolicies.v1beta1.policy        1

discovered controlplane nodes ["10.255.0.3" "10.255.0.4"]
discovered worker nodes ["10.255.0.1" "10.255.0.2"]
updating "kube-apiserver" to version "1.25.5"
 > "10.255.0.3": starting update
 > update kube-apiserver: v1.24.3 -> 1.25.5
 > skipped in dry-run
 > "10.255.0.4": starting update
updating "kube-controller-manager" to version "1.25.5"
 > "10.255.0.3": starting update
 > update kube-controller-manager: v1.24.3 -> 1.25.5
 > skipped in dry-run
 > "10.255.0.4": starting update
updating "kube-scheduler" to version "1.25.5"
 > "10.255.0.3": starting update
 > update kube-scheduler: v1.24.3 -> 1.25.5
 > skipped in dry-run
 > "10.255.0.4": starting update
updating daemonset "kube-proxy" to version "1.25.5"
skipped in dry-run
updating kubelet to version "1.25.5"
 > "10.255.0.3": starting update
 > update kubelet: 1.24.3 -> 1.25.5
 > skipped in dry-run
 > "10.255.0.4": starting update
 > "10.255.0.1": starting update
 > update kubelet: 1.24.3 -> 1.25.5
 > skipped in dry-run
 > "10.255.0.2": starting update
 > update kubelet: 1.24.3 -> 1.25.5
 > skipped in dry-run
updating manifests
 > processing manifest Secret bootstrap-token-ga0uvk
 < apply skipped in dry run
 > processing manifest ClusterRoleBinding system-bootstrap-approve-node-client-csr
 < apply skipped in dry run
 > processing manifest ClusterRoleBinding system-bootstrap-node-bootstrapper
 < apply skipped in dry run
 > processing manifest ClusterRoleBinding system-bootstrap-node-renewal
 < apply skipped in dry run
 > processing manifest ClusterRoleBinding system:default-sa
 < apply skipped in dry run
 > processing manifest ClusterRole psp:privileged
 < apply skipped in dry run
 > processing manifest ClusterRoleBinding psp:privileged
 < apply skipped in dry run
 > processing manifest PodSecurityPolicy privileged
 < apply skipped in dry run
 > processing manifest ClusterRole flannel
 < apply skipped in dry run
 > processing manifest ClusterRoleBinding flannel
 < apply skipped in dry run
 > processing manifest ServiceAccount flannel
 < apply skipped in dry run
 > processing manifest ConfigMap kube-flannel-cfg
 < apply skipped in dry run, diff:
  strings.Join({
    "apiVersion: v1\ndata:\n  cni-conf.json: |\n    {\n      \"name\": \"cbr",
    "0\",\n      \"cniVersion\": \"",
-   "0.3.1",
+   "1.0.0",
    "\",\n      \"plugins\": [\n        {\n          \"type\": \"flannel\",\n   ",
    "       \"delegate\": {\n            \"hairpinMode\": true,\n          ",
    ... // 341 identical bytes
  }, "")

 > processing manifest DaemonSet kube-flannel
 < apply skipped in dry run, diff:
  strings.Join({
    ... // 733 identical bytes
    "     fieldRef:\n              apiVersion: v1\n              fieldP",
    "ath: status.podIP\n        image: ghcr.io/siderolabs/flannel:v0.1",
-   "8.1",
+   "9.2",
    "\n        imagePullPolicy: IfNotPresent\n        name: kube-flanne",
    "l\n        resources: {}\n        securityContext:\n          capab",
    ... // 473 identical bytes
    "f.json\n        - /etc/cni/net.d/10-flannel.conflist\n        comm",
    "and:\n        - cp\n        image: ghcr.io/siderolabs/flannel:v0.1",
-   "8.1",
+   "9.2",
    "\n        imagePullPolicy: IfNotPresent\n        name: install-con",
    "fig\n        resources: {}\n        terminationMessagePath: /dev/t",
    ... // 164 identical bytes
    "lannel/\n          name: flannel-cfg\n      - command:\n        - /",
    "install-cni.sh\n        image: ghcr.io/siderolabs/install-cni:v1.",
-   "1.0-2-gcb03a5d",
+   "2.0-2-gf14175f",
    "\n        imagePullPolicy: IfNotPresent\n        name: install-cni",
    "\n        resources: {}\n        terminationMessagePath: /dev/term",
    ... // 1125 identical bytes
  }, "")

 > processing manifest ServiceAccount kube-proxy
 < apply skipped in dry run
 > processing manifest ClusterRoleBinding kube-proxy
 < apply skipped in dry run
 > processing manifest ServiceAccount coredns
 < apply skipped in dry run
 > processing manifest ClusterRoleBinding system:coredns
 < apply skipped in dry run
 > processing manifest ClusterRole system:coredns
 < apply skipped in dry run
 > processing manifest ConfigMap coredns
 < apply skipped in dry run
 > processing manifest Deployment coredns
 < apply skipped in dry run
 > processing manifest Service kube-dns
 < apply skipped in dry run
 > processing manifest ConfigMap kubeconfig-in-cluster
 < apply skipped in dry run

Not sure if it's WireGuard-related, but it's the only part of the upgrade-k8s output that looks weird.

There's also the fact that I'm running two control plane nodes, so split-brain issues could be a contributing factor.
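
As a sanity check on that theory, I believe etcd can still be queried through talosctl even with the API server down (assuming the etcd subcommands behave the same in this version):

# Both control plane nodes should report the same two-member list
talosctl --talosconfig ~/talosconfig -n master.debeselis.dom0.lt etcd members
# And the etcd service itself should be healthy on each node
talosctl --talosconfig ~/talosconfig -n master.debeselis.dom0.lt service etcd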

Are there any troubleshooting steps/guides I could follow to try to recover this? The cluster is pretty much a home test lab, but it's been running for a year, and its purpose was exactly to see how k8s/Talos could break and how we can recover from that.

Any insight would be highly appreciated, thank you!

node_boot_log.txt

smira commented 1 year ago

The IPs shouldn't matter that much, as they should be the INTERNAL-IP values from kubectl get nodes -o wide.

What happened most probably, and that's what upgrade-k8s tried to warn you about, is that you still have Pod Security Policy enabled.

See https://www.talos.dev/v1.2/talos-guides/upgrading-talos/#podsecuritypolicy-removal; you just need to flip the value, and kube-apiserver should start back.
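
Something along these lines should do it; untested here, so double-check the field path against the linked guide for your Talos version (node name taken from your output above):

# Set disablePodSecurityPolicy on each control plane node's machine config
talosctl --talosconfig ~/talosconfig -n master.debeselis.dom0.lt patch machineconfig \
  --patch '[{"op": "add", "path": "/cluster/apiServer/disablePodSecurityPolicy", "value": true}]'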

junaru commented 1 year ago

you just need to flip the value, and kube-apiserver should start back.

It worked! You just made my day!

Assumed "This setting defaulted to true since Talos v1.0.0 release." meant the default value was applied even if the key was undefined so didn't even try toggling that.

Please feel free to delete this issue as it contains a wall of misinformation at this point.

Thank you again and have a great weekend!