siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev

Bootstrapping on Raspberry Pi 4 Model B fails at step 14 #3033

Closed: Sonlis closed this issue 11 months ago

Sonlis commented 3 years ago

Bug Report

Description

I followed the instructions in the Talos docs (https://www.talos.dev/docs/v0.8/single-board-computers/rpi_4/#updating-the-eeprom) up to the "bootstrapping the node" part. I used the interactive flag; the only changes I made are the cluster name and the hostname. At step 14 of the bootstrap, phase startEverything, the bootstrap displays "Health check failed: timed is not ready" and stops there. I also tried unchecking DHCP and setting up a static IP on my own, matching it with the control plane address, but that did not fix it.

Logs

[459.046528] service[networkd]: Started task networkd (PID 3060) for container networkd
[459.067572] service[routerd]: Started task routerd (PID 3064) for container routerd
[463.465130] service[networkd]: Health check successful
[463.477263] service[timed]: Running pre state
[463.678220] service[trustd]: waiting for service "timed" to be "up"
[463.732484] service[cri]: waiting for service "etcd" to be "up"
[463.732484] service[apid]: waiting for service "timed" to be "up"
[463.750343] service[kubelet]: waiting for service "cri" to be "up", service "timed" to be "up"
[463.765099] service[etcd]: waiting for service "timed" to be "up"
[464.360878] unpacking talos/timed (sha256:ec708ceab993ed2671774c30bf2582638429ee5533d11c32aa35a3eb46b67a5f)
[465.593584] service[timed]: Started task timed (PID 3133) for container timed
[465.983903] service[timed]: Health check failed: timed is not ready
[1490.104010] service[timed]: Error running Containerd(timed), going to restart forever: task "timed" failed: exit code 1

Environment

Sonlis commented 3 years ago

I'm closing the issue since not changing the hostname fixed it; I will investigate this path further. Sorry for opening an issue too quickly.

andrewrynhard commented 3 years ago

> I'm closing the issue since not changing the hostname fixed it; I will investigate this path further. Sorry for opening an issue too quickly.

Hmm, we should support changing the hostname. Sounds like a bug, @Unix4ever?

Unix4ever commented 3 years ago

We should, right. It was working fine when I was testing it last time; maybe there's some regression...

Unix4ever commented 3 years ago

I've tried to reproduce it locally on QEMU, but no luck: it worked fine. So I guess there's some additional factor, which we don't know yet, that makes it fail. I wonder if you can grab the timed logs; we may find something helpful there:

talosctl logs timed -n <node ip> -f

Sonlis commented 3 years ago

I have tried to reproduce it by only changing the IP address: I get the same problem. Sadly, I cannot get the logs, as the node won't even bootstrap, so I get: error fetching logs: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 192.168.0.129:50000: connect: connection refused". Last time I wrote the timestamps down by hand; I will edit the original post to include them.

Also a note: because I am on a Mac, I had to replace this command from "Updating the EEPROM": sudo mkfs.fat -I /dev/mmcblk0 with sudo diskutil eraseDisk FAT32 RPI2 MBRFormat /dev/disk2. I don't think it has anything to do with the issue, just letting you know that I had to change one step from the getting started guide.

PS: there are also 2 errors that slipped past me the first time, as it goes by quickly. But there you go:

smira commented 3 years ago

@Sonlis, one way to debug the issue (as we're flying "blind" until apid is running, which is what makes talosctl logs work) is to enable the debug flag in the config to output all the logs to the console. This might produce lots of logs, but it should at least give some clue about what is making timed crash: https://www.talos.dev/docs/v0.8/reference/configuration/#config

Set debug: true at the top level of the config.
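
For reference, a minimal sketch of where the flag sits in a v1alpha1 machine config (field placement per the linked configuration reference; the elided sections stand in for the rest of the generated config):

version: v1alpha1
debug: true # mirror all service logs to the console
machine:
  # ...
cluster:
  # ...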

Sonlis commented 3 years ago

@smira I have enabled this option, but it does not give more logs.

Uploading the config through YAML instead of the interactive installer works, though. So maybe there is an error in the interactive installer?
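
For context, applying a prepared YAML config to a node still in maintenance mode looks roughly like this (a sketch of the standard talosctl workflow; the file name is a placeholder):

talosctl apply-config --insecure -n <node ip> --file controlplane.yaml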

smira commented 3 years ago

Interesting point, if that is something with the interactive installer.

Unix4ever commented 3 years ago

I wonder why setting the hostname breaks timed. By the way, what hostname did you try to set? Maybe I can try using the same settings on my RPi 4 4GB.

I assume that when you tried configuring Talos one more time without the hostname, you re-flashed the SD card, is that correct? Does it reproduce each time you dd the image to the SD card and run the interactive installer?
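
For reference, outside the interactive installer the hostname is set via the machine config (a minimal fragment; the value is a placeholder):

machine:
  network:
    hostname: talos-rpi4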

Sonlis commented 3 years ago

I tried Talos master, if I recall correctly. Every time it failed, I re-formatted the SD card, re-installed the EEPROM update, and dd'd the image to the SD card, then went through the interactive installer.
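
The flashing step referred to here is roughly the following (a sketch; the image name and device node vary by release and platform, so check the SBC docs for the exact asset):

xz -dc metal-rpi_4-arm64.img.xz | sudo dd of=/dev/mmcblk0 conv=fsync bs=4M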

ossfellow commented 3 years ago

I have used the interactive mode, as well as the YAML config file, many times, and I've always set the hostname, without an issue! So I'm not saying there couldn't be an issue, but if upwards of 200-300 tests is considered good enough, then I can confirm it cannot be an issue related simply to setting the hostname.

@Sonlis, you don't need to update the EEPROM every time; it's done only once and saved in the EEPROM (which is on the RPi 4B board), unless you need to flash it again (e.g. for an upgrade). Additionally, you'd see the health check failure a couple of times, because some components (API server, bootkube, etc.) might not be up yet at a given stage. By stage 14, you should be able to run some queries against the cluster/node using talosctl, by the way.
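
For example, the kinds of queries meant here (both commands also show up later in this thread):

talosctl -n <node ip> service
talosctl -n <node ip> dmesg -f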

xavier83 commented 2 years ago

It is failing for me as well, on the first try, in a loop, regardless of whether I set the hostname or not.

Logs:

192.168.13.124: user: warning: [2022-08-06T20:06:35.921770935Z]: [talos] phase userSetup (14/19): 1 tasks(s)
192.168.13.124: user: warning: [2022-08-06T20:06:35.928221935Z]: [talos] task writeUserFiles (1/1): starting
192.168.13.124: user: warning: [2022-08-06T20:06:35.934537935Z]: [talos] task writeUserFiles (1/1): done, 6.355885ms
192.168.13.124: user: warning: [2022-08-06T20:06:35.941479935Z]: [talos] phase userSetup (14/19): done, 19.720024ms
192.168.13.124: user: warning: [2022-08-06T20:06:35.948252935Z]: [talos] phase lvm (15/19): 1 tasks(s)
192.168.13.124: user: warning: [2022-08-06T20:06:35.953864935Z]: [talos] task activateLogicalVolumes (1/1): starting
192.168.13.124: user: warning: [2022-08-06T20:06:36.250460935Z]: [talos] task activateLogicalVolumes (1/1): done, 296.607628ms
192.168.13.124: user: warning: [2022-08-06T20:06:36.260225935Z]: [talos] phase lvm (15/19): done, 311.956832ms
192.168.13.124: user: warning: [2022-08-06T20:06:36.267449935Z]: [talos] phase startEverything (16/19): 1 tasks(s)
192.168.13.124: user: warning: [2022-08-06T20:06:36.274537935Z]: [talos] task startAllServices (1/1): starting
192.168.13.124: user: warning: [2022-08-06T20:06:36.281333935Z]: [talos] task startAllServices (1/1): waiting for 8 services
192.168.13.124: user: warning: [2022-08-06T20:06:36.289656935Z]: [talos] service[cri](Waiting): Waiting for network
192.168.13.124: user: warning: [2022-08-06T20:06:36.297119935Z]: [talos] service[trustd](Waiting): Waiting for service "containerd" to be "up", time sync, network
192.168.13.124: user: warning: [2022-08-06T20:06:36.309805935Z]: [talos] service[etcd](Waiting): Waiting for service "cri" to be "up", time sync, network
192.168.13.124: user: warning: [2022-08-06T20:06:36.321133935Z]: [talos] service[cri](Preparing): Running pre state
192.168.13.124: user: warning: [2022-08-06T20:06:36.328701935Z]: [talos] service[trustd](Preparing): Running pre state
192.168.13.124: user: warning: [2022-08-06T20:06:36.336504935Z]: [talos] service[cri](Preparing): Creating service runner
192.168.13.124: user: warning: [2022-08-06T20:06:36.344227935Z]: [talos] task startAllServices (1/1): service "apid" to be "up", service "containerd" to be "up", service "cri" to be "up", service "etcd" to be "up", service "kubelet" to be "up", service "machined" to be "up", service "trustd" to be "up", service "udevd" to be "up"
192.168.13.124: user: warning: [2022-08-06T20:06:36.371825935Z]: [talos] service[trustd](Preparing): Creating service runner
192.168.13.124: user: warning: [2022-08-06T20:06:36.398611935Z]: [talos] service[cri](Running): Process Process(["/bin/containerd" "--address" "/run/containerd/containerd.sock" "--config" "/etc/cri/containerd.toml"]) started with PID 3403
192.168.13.124: user: warning: [2022-08-06T20:06:36.515276935Z]: [talos] service[kubelet](Waiting): Waiting for service "cri" to be "up"
192.168.13.124: user: warning: [2022-08-06T20:06:36.559197935Z]: [talos] service[trustd](Running): Started task trustd (PID 3437) for container trustd
192.168.13.124: user: warning: [2022-08-06T20:06:37.321681935Z]: [talos] service[etcd](Waiting): Waiting for service "cri" to be "up"
192.168.13.124: user: warning: [2022-08-06T20:06:37.350441935Z]: [talos] service[cri](Running): Health check successful
192.168.13.124: user: warning: [2022-08-06T20:06:37.358056935Z]: [talos] service[kubelet](Preparing): Running pre state
192.168.13.124: user: warning: [2022-08-06T20:06:37.366271935Z]: [talos] service[etcd](Preparing): Running pre state
192.168.13.124: user: warning: [2022-08-06T20:06:37.407044935Z]: [talos] service[trustd](Running): Health check successful
192.168.13.124: user: warning: [2022-08-06T20:06:39.866347935Z]: [talos] service[kubelet](Preparing): Creating service runner
192.168.13.124: user: warning: [2022-08-06T20:06:43.695280935Z]: [talos] service[etcd](Preparing): Creating service runner
192.168.13.124: user: warning: [2022-08-06T20:06:51.337853935Z]: [talos] task startAllServices (1/1): service "etcd" to be "up", service "kubelet" to be "up"
192.168.13.124: user: warning: [2022-08-06T20:06:56.669706935Z]: [talos] service[etcd](Running): Started task etcd (PID 3521) for container etcd
192.168.13.124: user: warning: [2022-08-06T20:06:56.738848935Z]: [talos] service[kubelet](Running): Started task kubelet (PID 3520) for container kubelet
192.168.13.124: user: warning: [2022-08-06T20:06:56.749759935Z]: [talos] service[etcd](Waiting): Error running Containerd(etcd), going to restart forever: task "etcd" failed: exit code 1
192.168.13.124: user: warning: [2022-08-06T20:06:56.806735935Z]: [talos] service[kubelet](Waiting): Error running Containerd(kubelet), going to restart forever: task "kubelet" failed: exit code 1
192.168.13.124: user: warning: [2022-08-06T20:07:00.826991935Z]: [talos] kubernetes endpoint watch error {"component": "controller-runtime", "controller": "k8s.EndpointController", "error": "failed to list *v1.Endpoints: Get \"https://192.168.13.124:6443/api/v1/namespaces/default/endpoints?fieldSelector=metadata.name%3Dkubernetes&limit=500&resourceVersion=0\": dial tcp 192.168.13.124:6443: connect: connection refused"}
192.168.13.124: user: warning: [2022-08-06T20:07:01.941108935Z]: [talos] service[etcd](Running): Started task etcd (PID 3627) for container etcd
192.168.13.124: user: warning: [2022-08-06T20:07:02.012542935Z]: [talos] service[etcd](Waiting): Error running Containerd(etcd), going to restart forever: task "etcd" failed: exit code 1
192.168.13.124: user: warning: [2022-08-06T20:07:02.026898935Z]: [talos] service[kubelet](Running): Started task kubelet (PID 3649) for container kubelet
192.168.13.124: user: warning: [2022-08-06T20:07:02.095119935Z]: [talos] service[kubelet](Waiting): Error running Containerd(kubelet), going to restart forever: task "kubelet" failed: exit code 1
192.168.13.124: user: warning: [2022-08-06T20:07:06.337799935Z]: [talos] task startAllServices (1/1): service "etcd" to be "up", service "kubelet" to be "up"
192.168.13.124: user: warning: [2022-08-06T20:07:09.136223935Z]: [talos] service[etcd](Running): Started task etcd (PID 3768) for container etcd
192.168.13.124: user: warning: [2022-08-06T20:07:09.186552935Z]: [talos] service[kubelet](Running): Started task kubelet (PID 3769) for container kubelet
192.168.13.124: user: warning: [2022-08-06T20:07:10.642716935Z]: [talos] service[etcd](Waiting): Error running Containerd(etcd), going to restart forever: task "etcd" failed: exit code 1
192.168.13.124: user: warning: [2022-08-06T20:07:10.656891935Z]: [talos] service[kubelet](Waiting): Error running Containerd(kubelet), going to restart forever: task "kubelet" failed: exit code 1
192.168.13.124: user: warning: [2022-08-06T20:07:17.227499935Z]: [talos] service[etcd](Running): Started task etcd (PID 3892) for container etcd
192.168.13.124: user: warning: [2022-08-06T20:07:17.270486935Z]: [talos] service[kubelet](Running): Started task kubelet (PID 3893) for container kubelet
192.168.13.124: user: warning: [2022-08-06T20:07:18.719038935Z]: [talos] service[kubelet](Waiting): Error running Containerd(kubelet), going to restart forever: task "kubelet" failed: exit code 1
192.168.13.124: user: warning: [2022-08-06T20:07:18.733752935Z]: [talos] service[etcd](Waiting): Error running Containerd(etcd), going to restart forever: task "etcd" failed: exit code 1

xavier83 commented 2 years ago

On the RPi 4 it looks like etcd might be an amd64 binary instead of arm64. I am running Talos v1.1.2 on an RPi 4B.

talosctl -n <node ip> logs etcd
<node ip>: exec /usr/local/bin/etcd: exec format error
<node ip>: exec /usr/local/bin/etcd: exec format error
<node ip>: exec /usr/local/bin/etcd: exec format error
.
.
.

frezbo commented 2 years ago

> On the RPi 4 it looks like etcd might be an amd64 binary instead of arm64. I am running Talos v1.1.2 on an RPi 4B.

It's the job of the CRI to pull the right image for the right arch; it might be that containerd failed to detect the right arch. Which release image did you use to flash the Pi? Also, could you try resetting the node and see if this still happens: talosctl reset --system-labels-to-wipe=EPHEMERAL
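
As a sanity check from a workstation (a sketch, assuming the Docker CLI is available), you can inspect the image index to confirm an arm64 variant is published for the etcd tag Talos uses; each entry under "manifests" lists a platform with its os and architecture:

docker manifest inspect gcr.io/etcd-development/etcd:v3.5.4

The tag comes from the talosctl images output shown below.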

xavier83 commented 2 years ago

I've flashed the RPi with this latest release image, v1.1.2. Also, on trying to reset I get this error:

$talosctl reset --system-labels-to-wipe=EPHEMERAL -n <node ip>
error executing reset: 1 error occurred:
        * <node ip>: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp <node ip>:50000: i/o timeout"

even kubelet throws a similar error

$talosctl -n <node ip> logs -f etcd
<node ip>: exec /usr/local/bin/etcd: exec format error
<node ip>: exec /usr/local/bin/etcd: exec format error
<node ip>: exec /usr/local/bin/etcd: exec format error
.
.
.
$talosctl images
ghcr.io/siderolabs/flannel:v0.18.1
ghcr.io/siderolabs/install-cni:v1.1.0-2-gcb03a5d
docker.io/coredns/coredns:1.9.3
gcr.io/etcd-development/etcd:v3.5.4
k8s.gcr.io/kube-apiserver:v1.24.3
k8s.gcr.io/kube-controller-manager:v1.24.3
k8s.gcr.io/kube-scheduler:v1.24.3
k8s.gcr.io/kube-proxy:v1.24.3
ghcr.io/siderolabs/kubelet:v1.24.3
ghcr.io/siderolabs/installer:v1.1.2
k8s.gcr.io/pause:3.6
frezbo commented 2 years ago

Seems the install is completely broken; could you re-flash and try again?

xavier83 commented 2 years ago

Tried reflashing with the previous stable release, v1.1.1, and that seems to boot etcd up fine.

xavier83 commented 2 years ago

The control plane endpoints never seem to come up, though.

$talosctl -n <node ip> service
NODE             SERVICE      STATE     HEALTH   LAST CHANGE   LAST EVENT
<node ip>   apid         Running   OK       6m56s ago     Health check successful
<node ip>   containerd   Running   OK       7m2s ago      Health check successful
<node ip>   cri          Running   OK       6m32s ago     Health check successful
<node ip>   etcd         Running   OK       6m4s ago      Health check successful
<node ip>   kubelet      Running   OK       5m53s ago     Health check successful
<node ip>   machined     Running   ?        7m9s ago      Service started as goroutine
<node ip>   trustd       Running   OK       6m32s ago     Health check successful
<node ip>   udevd        Running   OK       6m34s ago     Health check successful
$talosctl dmesg -f -n <node ip>
...
<node ip>: user: warning: [2022-08-07T06:38:36.358257301Z]: [talos] service[kubelet](Running): Started task kubelet (PID 3496) for container kubelet
<node ip>: user: warning: [2022-08-07T06:38:36.369711301Z]: [talos] cleaning up static pod "/etc/kubernetes/manifests/talos-kube-apiserver.yaml" {"component": "controller-runtime", "controller": "k8s.KubeletStaticPodController"}
<node ip>: user: warning: [2022-08-07T06:38:36.388575301Z]: [talos] cleaning up static pod "/etc/kubernetes/manifests/talos-kube-controller-manager.yaml" {"component": "controller-runtime", "controller": "k8s.KubeletStaticPodController"}
<node ip>: user: warning: [2022-08-07T06:38:36.407953301Z]: [talos] cleaning up static pod "/etc/kubernetes/manifests/talos-kube-scheduler.yaml" {"component": "controller-runtime", "controller": "k8s.KubeletStaticPodController"}
<node ip>: user: warning: [2022-08-07T06:38:36.461887301Z]: [talos] service[etcd](Running): Started task etcd (PID 3528) for container etcd
<node ip>: user: warning: [2022-08-07T06:38:43.851543301Z]: [talos] service[etcd](Running): Health check successful
<node ip>: user: warning: [2022-08-07T06:38:43.872347301Z]: [talos] writing static pod "/etc/kubernetes/manifests/talos-kube-apiserver.yaml" {"component": "controller-runtime", "controller": "k8s.KubeletStaticPodController"}
<node ip>: user: warning: [2022-08-07T06:38:43.904317301Z]: [talos] writing static pod "/etc/kubernetes/manifests/talos-kube-controller-manager.yaml" {"component": "controller-runtime", "controller": "k8s.KubeletStaticPodController"}
<node ip>: user: warning: [2022-08-07T06:38:43.925992301Z]: [talos] writing static pod "/etc/kubernetes/manifests/talos-kube-scheduler.yaml" {"component": "controller-runtime", "controller": "k8s.KubeletStaticPodController"}
<node ip>: user: warning: [2022-08-07T06:38:44.460787301Z]: [talos] task startAllServices (1/1): service "kubelet" to be "up"
<node ip>: user: warning: [2022-08-07T06:38:47.208012301Z]: [talos] controller failed {"component": "controller-runtime", "controller": "k8s.ManifestApplyController", "error": "error creating mapping for object /v1/Secret/bootstrap-token-yvhv5k: Get \"https://localhost:6443/api?timeout=32s\": dial tcp [::1]:6443: connect: connection refused"}
<node ip>: user: warning: [2022-08-07T06:38:47.575138301Z]: [talos] controller failed {"component": "controller-runtime", "controller": "k8s.KubeletStaticPodController", "error": "error refreshing pod status: error fetching pod status: Get \"https://127.0.0.1:10250/pods/?timeout=30s\": dial tcp 127.0.0.1:10250: connect: connection refused"}
<node ip>: user: warning: [2022-08-07T06:38:51.177581301Z]: [talos] controller failed {"component": "controller-runtime", "controller": "k8s.ManifestApplyController", "error": "error creating mapping for object /v1/Secret/bootstrap-token-yvhv5k: Get \"https://localhost:6443/api?timeout=32s\": dial tcp [::1]:6443: connect: connection refused"}
<node ip>: user: warning: [2022-08-07T06:38:51.656212301Z]: [talos] controller failed {"component": "controller-runtime", "controller": "k8s.ManifestApplyController", "error": "error creating mapping for object /v1/Secret/bootstrap-token-yvhv5k: Get \"https://localhost:6443/api?timeout=32s\": dial tcp [::1]:6443: connect: connection refused"}
<node ip>: user: warning: [2022-08-07T06:38:54.881584301Z]: [talos] service[kubelet](Running): Health check successful
<node ip>: user: warning: [2022-08-07T06:38:54.890774301Z]: [talos] task startAllServices (1/1): done, 40.493796873s
<node ip>: user: warning: [2022-08-07T06:38:54.899622301Z]: [talos] phase startEverything (16/19): done, 40.510250763s
<node ip>: user: warning: [2022-08-07T06:38:54.908654301Z]: [talos] phase labelMaster (17/19): 1 tasks(s)
<node ip>: user: warning: [2022-08-07T06:38:54.916424301Z]: [talos] task labelNodeAsMaster (1/1): starting
<node ip>: user: warning: [2022-08-07T06:38:54.932942301Z]: [talos] retrying error: Get "https://<node ip>:6443/api/v1/nodes/talos-control-1?timeout=30s": dial tcp <node ip>:6443: connect: connection refused
<node ip>: user: warning: [2022-08-07T06:38:54.972581301Z]: [talos] controller failed {"component": "controller-runtime", "controller": "k8s.ManifestApplyController", "error": "error creating mapping for object /v1/Secret/bootstrap-token-yvhv5k: Get \"https://localhost:6443/api?timeout=32s\": dial tcp [::1]:6443: connect: connection refused"}
<node ip>: user: warning: [2022-08-07T06:38:58.634897301Z]: [talos] kubernetes endpoint watch error {"component": "controller-runtime", "controller": "k8s.EndpointController", "error": "failed to list *v1.Endpoints: Get \"https://<node ip>:6443/api/v1/namespaces/default/endpoints?fieldSelector=metadata.name%3Dkubernetes&limit=500&resourceVersion=0\": dial tcp <node ip>:6443: connect: connection refused"}
<node ip>: user: warning: [2022-08-07T06:39:05.422844301Z]: [talos] controller failed {"component": "controller-runtime", "controller": "k8s.KubeletStaticPodController", "error": "error refreshing pod status: error fetching pod status: an error on the server (\"Authorization error (user=apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)\") has prevented the request from succeeding"}
<node ip>: user: warning: [2022-08-07T06:39:10.272273301Z]: [talos] controller failed {"component": "controller-runtime", "controller": "k8s.ManifestApplyController", "error": "error creating mapping for object /v1/Secret/bootstrap-token-yvhv5k: Get \"https://localhost:6443/api?timeout=32s\": dial tcp [::1]:6443: connect: connection refused"}
<node ip>: user: warning: [2022-08-07T06:39:21.419420301Z]: [talos] controller failed {"component": "controller-runtime", "controller": "k8s.KubeletStaticPodController", "error": "error refreshing pod status: error fetching pod status: an error on the server (\"Authorization error (user=apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)\") has prevented the request from succeeding"}
frezbo commented 2 years ago

What does talosctl containers -k show, and also talosctl logs -k <id of kube-apiserver>? It seems there is probably some issue with the generated config. It would be nice if you could provide the commands used for generating the config, and whether any modifications were made to the generated config. It's probably easier to join our Slack for a quicker conversation.

xavier83 commented 2 years ago

$talosctl containers -k -n <node ip>
NODE             NAMESPACE   ID                                                                               IMAGE                                        PID    STATUS
<node ip>   k8s.io      kube-system/kube-apiserver-talos-control-1                                       k8s.gcr.io/pause:3.6                         5210   SANDBOX_READY
<node ip>   k8s.io      kube-system/kube-apiserver-talos-control-1                                       k8s.gcr.io/pause:3.6                         3754   SANDBOX_READY
<node ip>   k8s.io      └─ kube-system/kube-apiserver-talos-control-1:kube-apiserver                     k8s.gcr.io/kube-apiserver:v1.24.3            0      CONTAINER_EXITED
<node ip>   k8s.io      └─ kube-system/kube-apiserver-talos-control-1:kube-apiserver                     k8s.gcr.io/kube-apiserver:v1.24.3            0      CONTAINER_EXITED
<node ip>   k8s.io      └─ kube-system/kube-apiserver-talos-control-1:kube-apiserver                     k8s.gcr.io/kube-apiserver:v1.24.3            0      CONTAINER_EXITED
<node ip>   k8s.io      └─ kube-system/kube-apiserver-talos-control-1:kube-apiserver                     k8s.gcr.io/kube-apiserver:v1.24.3            4716   CONTAINER_RUNNING
<node ip>   k8s.io      kube-system/kube-controller-manager-talos-control-1                              k8s.gcr.io/pause:3.6                         3766   SANDBOX_READY
<node ip>   k8s.io      └─ kube-system/kube-controller-manager-talos-control-1:kube-controller-manager   k8s.gcr.io/kube-controller-manager:v1.24.3   0      CONTAINER_CREATED
<node ip>   k8s.io      kube-system/kube-scheduler-talos-control-1                                       k8s.gcr.io/pause:3.6                         3691   SANDBOX_READY
<node ip>   k8s.io      └─ kube-system/kube-scheduler-talos-control-1:kube-scheduler                     k8s.gcr.io/kube-scheduler:v1.24.3            0      CONTAINER_CREATED

There seem to be some cert auth errors in the apiserver logs:

<node ip>: 2022-08-07T07:23:42.768472247Z stderr F I0807 07:23:42.768132       1 trace.go:205] Trace[1337185868]: "GuaranteedUpdate etcd3" type:*v1.Endpoints (07-Aug-2022 07:23:35.749) (total time: 7018ms):
<node ip>: 2022-08-07T07:23:42.768512987Z stderr F Trace[1337185868]: [7.018171569s] [7.018171569s] END
<node ip>: 2022-08-07T07:23:42.768697579Z stderr F E0807 07:23:42.768221       1 controller.go:240] unable to sync kubernetes service: etcdserver: request timed out
<node ip>: 2022-08-07T07:23:42.787298715Z stderr F {"level":"warn","ts":"2022-08-07T07:23:42.786Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x40012f1c00/127.0.0.1:2379","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: request timed out"}
<node ip>: 2022-08-07T07:23:42.869611976Z stderr F E0807 07:23:42.869193       1 authentication.go:63] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2022-08-07T07:23:42Z is after 2022-08-07T06:48:55Z, verifying certificate SN=45300978456042199912394603477883528136, SKID=, AKID=A7:A5:53:55:17:52:0E:D5:19:D0:F1:9C:40:00:56:A7:F3:EB:0C:2A failed: x509: certificate has expired or is not yet valid: current time 2022-08-07T07:23:42Z is after 2022-08-07T06:48:55Z]"
<node ip>: 2022-08-07T07:23:45.874121452Z stderr F E0807 07:23:45.873750       1 authentication.go:63] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2022-08-07T07:23:45Z is after 2022-08-07T06:48:55Z, verifying certificate SN=45300978456042199912394603477883528136, SKID=, AKID=A7:A5:53:55:17:52:0E:D5:19:D0:F1:9C:40:00:56:A7:F3:EB:0C:2A failed: x509: certificate has expired or is not yet valid: current time 2022-08-07T07:23:45Z is after 2022-08-07T06:48:55Z]"
<node ip>: 2022-08-07T07:23:48.878216372Z stderr F E0807 07:23:48.877901       1 authentication.go:63] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2022-08-07T07:23:48Z is after 2022-08-07T06:48:55Z, verifying certificate SN=45300978456042199912394603477883528136, SKID=, AKID=A7:A5:53:55:17:52:0E:D5:19:D0:F1:9C:40:00:56:A7:F3:EB:0C:2A failed: x509: certificate has expired or is not yet valid: current time 2022-08-07T07:23:48Z is after 2022-08-07T06:48:55Z]"
<node ip>: 2022-08-07T07:23:50.27423494Z stderr F I0807 07:23:50.272924       1 trace.go:205] Trace[589616113]: "Create" url:/api/v1/namespaces/default/events,user-agent:kube-apiserver/v1.24.3 (linux/arm64) kubernetes/aef86a9,audit-id:2f49aad3-880d-45db-ad8f-4da5008056f9,client:::1,accept:application/vnd.kubernetes.protobuf, */*,protocol:HTTP/2.0 (07-Aug-2022 07:23:35.779) (total time: 14493ms):
<node ip>: 2022-08-07T07:23:50.274591272Z stderr F Trace[589616113]: ---"Object stored in database" 14492ms (07:23:50.272)
<node ip>: 2022-08-07T07:23:50.274678512Z stderr F Trace[589616113]: [14.493497064s] [14.493497064s] END
<node ip>: 2022-08-07T07:23:50.281010143Z stderr F I0807 07:23:50.280563       1 trace.go:205] Trace[634576640]: "Get" url:/api/v1/namespaces/kube-system,user-agent:kube-apiserver/v1.24.3 (linux/arm64) kubernetes/aef86a9,audit-id:5a48d524-a671-4150-8af2-df4ce1d5b963,client:::1,accept:application/vnd.kubernetes.protobuf, */*,protocol:HTTP/2.0 (07-Aug-2022 07:23:44.207) (total time: 6072ms):
<node ip>: 2022-08-07T07:23:50.281157605Z stderr F Trace[634576640]: ---"About to write a response" 6072ms (07:23:50.280)
<node ip>: 2022-08-07T07:23:50.281194568Z stderr F Trace[634576640]: [6.072439239s] [6.072439239s] END
<node ip>: 2022-08-07T07:23:50.284519124Z stderr F I0807 07:23:50.282428       1 trace.go:205] Trace[316781015]: "GuaranteedUpdate etcd3" type:*core.RangeAllocation (07-Aug-2022 07:23:35.756) (total time: 14525ms):
<node ip>: 2022-08-07T07:23:50.284667141Z stderr F Trace[316781015]: ---"initial value restored" 11265ms (07:23:47.022)
<node ip>: 2022-08-07T07:23:50.284704641Z stderr F Trace[316781015]: ---"Transaction committed" 3259ms (07:23:50.281)
<node ip>: 2022-08-07T07:23:50.284735752Z stderr F Trace[316781015]: [14.525267756s] [14.525267756s] END
<node ip>: 2022-08-07T07:23:50.28835275Z stderr F I0807 07:23:50.287828       1 trace.go:205] Trace[1027068449]: "Get" url:/api/v1/namespaces/default,user-agent:kube-apiserver/v1.24.3 (linux/arm64) kubernetes/aef86a9,audit-id:8a977d3f-477e-485f-96e0-48081fd335cf,client:::1,accept:application/vnd.kubernetes.protobuf, */*,protocol:HTTP/2.0 (07-Aug-2022 07:23:42.774) (total time: 7513ms):
<node ip>: 2022-08-07T07:23:50.288475583Z stderr F Trace[1027068449]: ---"About to write a response" 7513ms (07:23:50.287)
<node ip>: 2022-08-07T07:23:50.288509898Z stderr F Trace[1027068449]: [7.51331667s] [7.51331667s] END
<node ip>: 2022-08-07T07:23:51.883484025Z stderr F E0807 07:23:51.883060       1 authentication.go:63] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2022-08-07T07:23:51Z is after 2022-08-07T06:48:55Z, verifying certificate SN=45300978456042199912394603477883528136, SKID=, AKID=A7:A5:53:55:17:52:0E:D5:19:D0:F1:9C:40:00:56:A7:F3:EB:0C:2A failed: x509: certificate has expired or is not yet valid: current time 2022-08-07T07:23:51Z is after 2022-08-07T06:48:55Z]"
<node ip>: 2022-08-07T07:23:54.88813353Z stderr F E0807 07:23:54.887667       1 authentication.go:63] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2022-08-07T07:23:54Z is after 2022-08-07T06:48:55Z, verifying certificate SN=45300978456042199912394603477883528136, SKID=, AKID=A7:A5:53:55:17:52:0E:D5:19:D0:F1:9C:40:00:56:A7:F3:EB:0C:2A failed: x509: certificate has expired or is not yet valid: current time 2022-08-07T07:23:54Z is after 2022-08-07T06:48:55Z]"
<node ip>: 2022-08-07T07:23:55.139120185Z stderr F I0807 07:23:55.132366       1 trace.go:205] Trace[2003737638]: "Get" url:/api/v1/namespaces/kube-public,user-agent:kube-apiserver/v1.24.3 (linux/arm64) kubernetes/aef86a9,audit-id:4e9aeda1-5dcd-4653-95db-8941ebed3f50,client:::1,accept:application/vnd.kubernetes.protobuf, */*,protocol:HTTP/2.0 (07-Aug-2022 07:23:50.331) (total time: 4800ms):
<node ip>: 2022-08-07T07:23:55.139555183Z stderr F Trace[2003737638]: ---"About to write a response" 4800ms (07:23:55.131)
<node ip>: 2022-08-07T07:23:55.139607534Z stderr F Trace[2003737638]: [4.800269031s] [4.800269031s] END
<node ip>: 2022-08-07T07:23:55.139640201Z stderr F I0807 07:23:55.134312       1 trace.go:205] Trace[1009811277]: "GuaranteedUpdate etcd3" type:*v1.Endpoints (07-Aug-2022 07:23:50.298) (total time: 4835ms):
<node ip>: 2022-08-07T07:23:55.139669645Z stderr F Trace[1009811277]: ---"Transaction committed" 4825ms (07:23:55.134)
<node ip>: 2022-08-07T07:23:55.139698163Z stderr F Trace[1009811277]: [4.835685056s] [4.835685056s] END
<node ip>: 2022-08-07T07:23:55.149183166Z stderr F W0807 07:23:55.148861       1 lease.go:234] Resetting endpoints for master service "kubernetes" to [<node ip>]
<node ip>: 2022-08-07T07:23:55.170344826Z stderr F I0807 07:23:55.169896       1 controller.go:611] quota admission added evaluator for: endpoints
<node ip>: 2022-08-07T07:23:56.758182247Z stderr F I0807 07:23:56.757883       1 trace.go:205] Trace[1721911544]: "Get" url:/apis/discovery.k8s.io/v1/namespaces/default/endpointslices/kubernetes,user-agent:kube-apiserver/v1.24.3 (linux/arm64) kubernetes/aef86a9,audit-id:9562c830-6433-46d6-bfab-bc1116d9cbe5,client:::1,accept:application/vnd.kubernetes.protobuf, */*,protocol:HTTP/2.0 (07-Aug-2022 07:23:55.188) (total time: 1568ms):
<node ip>: 2022-08-07T07:23:56.758286117Z stderr F Trace[1721911544]: [1.568944953s] [1.568944953s] END
<node ip>: 2022-08-07T07:23:56.769886645Z stderr F I0807 07:23:56.769562       1 controller.go:611] quota admission added evaluator for: endpointslices.discovery.k8s.io
<node ip>: 2022-08-07T07:23:57.89256846Z stderr F E0807 07:23:57.892171       1 authentication.go:63] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2022-08-07T07:23:57Z is after 2022-08-07T06:48:55Z, verifying certificate SN=45300978456042199912394603477883528136, SKID=, AKID=A7:A5:53:55:17:52:0E:D5:19:D0:F1:9C:40:00:56:A7:F3:EB:0C:2A failed: x509: certificate has expired or is not yet valid: current time 2022-08-07T07:23:57Z is after 2022-08-07T06:48:55Z]"
<node ip>: 2022-08-07T07:24:00.897959327Z stderr F E0807 07:24:00.897563       1 authentication.go:63] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2022-08-07T07:24:00Z is after 2022-08-07T06:48:55Z, verifying certificate SN=45300978456042199912394603477883528136, SKID=, AKID=A7:A5:53:55:17:52:0E:D5:19:D0:F1:9C:40:00:56:A7:F3:EB:0C:2A failed: x509: certificate has expired or is not yet valid: current time 2022-08-07T07:24:00Z is after 2022-08-07T06:48:55Z]"
<node ip>: 2022-08-07T07:24:03.903286729Z stderr F E0807 07:24:03.902816       1 authentication.go:63] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2022-08-07T07:24:03Z is after 2022-08-07T06:48:55Z, verifying certificate SN=45300978456042199912394603477883528136, SKID=, AKID=A7:A5:53:55:17:52:0E:D5:19:D0:F1:9C:40:00:56:A7:F3:EB:0C:2A failed: x509: certificate has expired or is not yet valid: current time 2022-08-07T07:24:03Z is after 2022-08-07T06:48:55Z]"
<node ip>: 2022-08-07T07:24:06.908549388Z stderr F E0807 07:24:06.908120       1 authentication.go:63] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2022-08-07T07:24:06Z is after 2022-08-07T06:48:55Z, verifying certificate SN=45300978456042199912394603477883528136, SKID=, AKID=A7:A5:53:55:17:52:0E:D5:19:D0:F1:9C:40:00:56:A7:F3:EB:0C:2A failed: x509: certificate has expired or is not yet valid: current time 2022-08-07T07:24:06Z is after 2022-08-07T06:48:55Z]"
<node ip>: 2022-08-07T07:24:09.912516571Z stderr F E0807 07:24:09.912231       1 authentication.go:63] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2022-08-07T07:24:09Z is after 2022-08-07T06:48:55Z, verifying certificate SN=45300978456042199912394603477883528136, SKID=, AKID=A7:A5:53:55:17:52:0E:D5:19:D0:F1:9C:40:00:56:A7:F3:EB:0C:2A failed: x509: certificate has expired or is not yet valid: current time 2022-08-07T07:24:09Z is after 2022-08-07T06:48:55Z]"
<node ip>: 2022-08-07T07:24:11.706112652Z stderr F I0807 07:24:11.704310       1 trace.go:205] Trace[2007212016]: "Get" url:/api/v1/namespaces/default,user-agent:kube-apiserver/v1.24.3 (linux/arm64) kubernetes/aef86a9,audit-id:0bdfc782-d77a-4114-9ba4-67307510b0d4,client:::1,accept:application/vnd.kubernetes.protobuf, */*,protocol:HTTP/2.0 (07-Aug-2022 07:24:06.789) (total time: 4914ms):
<node ip>: 2022-08-07T07:24:11.706747259Z stderr F Trace[2007212016]: ---"About to write a response" 4913ms (07:24:11.703)
xavier83 commented 2 years ago

During talosctl apply-config, the control plane endpoint was set to the LAN IP of the node. talosctl dmesg -f -n <node ip> was returning lots of connection-refused errors to the control plane <node ip>:<6553>. I edited the machine config to use localhost instead, and then it started scheduling the control plane containers; the logs above are from after this change. Without this change from <node lan ip> to <localhost>, I get the following:

$talosctl containers -k -n <node lan ip>
NODE             NAMESPACE   ID                                                                               IMAGE                                        PID    STATUS
<node lan ip>   k8s.io      kube-system/kube-apiserver-talos-control-1                                       k8s.gcr.io/pause:3.6                         3703   SANDBOX_READY
<node lan ip>   k8s.io      └─ kube-system/kube-apiserver-talos-control-1:kube-apiserver                     k8s.gcr.io/kube-apiserver:v1.24.3            0      CONTAINER_CREATED
<node lan ip>   k8s.io      └─ kube-system/kube-apiserver-talos-control-1:kube-apiserver                     k8s.gcr.io/kube-apiserver:v1.24.3            0      CONTAINER_CREATED
<node lan ip>   k8s.io      kube-system/kube-controller-manager-talos-control-1                              k8s.gcr.io/pause:3.6                         3746   SANDBOX_READY
<node lan ip>   k8s.io      └─ kube-system/kube-controller-manager-talos-control-1:kube-controller-manager   k8s.gcr.io/kube-controller-manager:v1.24.3   0      CONTAINER_CREATED
<node lan ip>   k8s.io      └─ kube-system/kube-controller-manager-talos-control-1:kube-controller-manager   k8s.gcr.io/kube-controller-manager:v1.24.3   0      CONTAINER_CREATED
<node lan ip>   k8s.io      kube-system/kube-scheduler-talos-control-1                                       k8s.gcr.io/pause:3.6                         3755   SANDBOX_READY
<node lan ip>   k8s.io      └─ kube-system/kube-scheduler-talos-control-1:kube-scheduler                     k8s.gcr.io/kube-scheduler:v1.24.3            0      CONTAINER_CREATED
<node lan ip>   k8s.io      └─ kube-system/kube-scheduler-talos-control-1:kube-scheduler                     k8s.gcr.io/kube-scheduler:v1.24.3            3831   CONTAINER_RUNNING
$talosctl logs -k kube-system/kube-apiserver-talos-control-1:kube-apiserver   -n <node lan ip> 
<node lan ip>: 2022-08-07T07:52:47.161872935Z stderr F I0807 07:52:47.156803       1 server.go:558] external host was not specified, using <node lan ip>
<node lan ip>: 2022-08-07T07:52:47.181726901Z stderr F I0807 07:52:47.181442       1 server.go:158] Version: v1.24.3
<node lan ip>: 2022-08-07T07:52:47.181827606Z stderr F I0807 07:52:47.181588       1 server.go:160] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
<node lan ip>: 2022-08-07T07:53:10.479681082Z stderr F I0807 07:53:10.479204       1 shared_informer.go:255] Waiting for caches to sync for node_authorizer
<node lan ip>: 2022-08-07T07:53:10.511495166Z stderr F I0807 07:53:10.510255       1 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
<node lan ip>: 2022-08-07T07:53:10.511655094Z stderr F I0807 07:53:10.510407       1 plugins.go:161] Loaded 11 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,PodSecurity,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
<node lan ip>: 2022-08-07T07:53:10.519219468Z stderr F I0807 07:53:10.517658       1 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
<node lan ip>: 2022-08-07T07:53:10.51931595Z stderr F I0807 07:53:10.517754       1 plugins.go:161] Loaded 11 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,PodSecurity,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
<node lan ip>: 2022-08-07T07:53:20.178428779Z stderr F I0807 07:53:20.177957       1 trace.go:205] Trace[1950691004]: "List(recursive=true) etcd3" key:/apiextensions.k8s.io/customresourcedefinitions,resourceVersion:,resourceVersionMatch:,limit:10000,continue: (07-Aug-2022 07:53:13.722) (total time: 6454ms):
<node lan ip>: 2022-08-07T07:53:20.178591225Z stderr F Trace[1950691004]: [6.454550408s] [6.454550408s] END
<node lan ip>: 2022-08-07T07:53:23.474343989Z stderr F W0807 07:53:23.473948       1 genericapiserver.go:557] Skipping API apiextensions.k8s.io/v1beta1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:23.480096847Z stderr F I0807 07:53:23.479714       1 instance.go:274] Using reconciler: lease
<node lan ip>: 2022-08-07T07:53:25.299999581Z stderr F W0807 07:53:25.299404       1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input
<node lan ip>: 2022-08-07T07:53:25.300137934Z stderr F E0807 07:53:25.299563       1 cacher.go:425] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input; reinitializing...
<node lan ip>: 2022-08-07T07:53:25.767979259Z stderr F I0807 07:53:25.767656       1 instance.go:586] API group "internal.apiserver.k8s.io" is not enabled, skipping.
<node lan ip>: 2022-08-07T07:53:26.308348864Z stderr F W0807 07:53:26.306884       1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input
<node lan ip>: 2022-08-07T07:53:26.308539495Z stderr F E0807 07:53:26.308052       1 cacher.go:425] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input; reinitializing...
<node lan ip>: 2022-08-07T07:53:26.703523002Z stderr F W0807 07:53:26.703231       1 genericapiserver.go:557] Skipping API authentication.k8s.io/v1beta1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.71040835Z stderr F W0807 07:53:26.710144       1 genericapiserver.go:557] Skipping API authorization.k8s.io/v1beta1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.768462564Z stderr F W0807 07:53:26.768154       1 genericapiserver.go:557] Skipping API certificates.k8s.io/v1beta1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.775733711Z stderr F W0807 07:53:26.775450       1 genericapiserver.go:557] Skipping API coordination.k8s.io/v1beta1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.801308478Z stderr F W0807 07:53:26.801060       1 genericapiserver.go:557] Skipping API networking.k8s.io/v1beta1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.814456076Z stderr F W0807 07:53:26.814175       1 genericapiserver.go:557] Skipping API node.k8s.io/v1alpha1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.842395472Z stderr F W0807 07:53:26.842087       1 genericapiserver.go:557] Skipping API rbac.authorization.k8s.io/v1beta1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.842487417Z stderr F W0807 07:53:26.842173       1 genericapiserver.go:557] Skipping API rbac.authorization.k8s.io/v1alpha1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.849121059Z stderr F W0807 07:53:26.848850       1 genericapiserver.go:557] Skipping API scheduling.k8s.io/v1beta1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.849214393Z stderr F W0807 07:53:26.848912       1 genericapiserver.go:557] Skipping API scheduling.k8s.io/v1alpha1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.869198266Z stderr F W0807 07:53:26.868920       1 genericapiserver.go:557] Skipping API storage.k8s.io/v1alpha1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.888548671Z stderr F W0807 07:53:26.888270       1 genericapiserver.go:557] Skipping API flowcontrol.apiserver.k8s.io/v1alpha1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.908957046Z stderr F W0807 07:53:26.908644       1 genericapiserver.go:557] Skipping API apps/v1beta2 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.909069992Z stderr F W0807 07:53:26.908710       1 genericapiserver.go:557] Skipping API apps/v1beta1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.919660182Z stderr F W0807 07:53:26.919285       1 genericapiserver.go:557] Skipping API admissionregistration.k8s.io/v1beta1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:26.940761063Z stderr F I0807 07:53:26.940459       1 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
<node lan ip>: 2022-08-07T07:53:26.940850712Z stderr F I0807 07:53:26.940518       1 plugins.go:161] Loaded 11 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,PodSecurity,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
<node lan ip>: 2022-08-07T07:53:27.064850069Z stderr F W0807 07:53:27.064538       1 genericapiserver.go:557] Skipping API apiregistration.k8s.io/v1beta1 because it has no resources.
<node lan ip>: 2022-08-07T07:53:27.312921947Z stderr F W0807 07:53:27.312536       1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input
<node lan ip>: 2022-08-07T07:53:27.313027725Z stderr F E0807 07:53:27.312715       1 cacher.go:425] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input; reinitializing...
<node lan ip>: 2022-08-07T07:53:29.092961319Z stderr F W0807 07:53:29.092583       1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input
<node lan ip>: 2022-08-07T07:53:29.093077968Z stderr F E0807 07:53:29.092736       1 cacher.go:425] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input; reinitializing...
<node lan ip>: 2022-08-07T07:53:30.096509239Z stderr F W0807 07:53:30.096226       1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input
<node lan ip>: 2022-08-07T07:53:30.096595536Z stderr F E0807 07:53:30.096330       1 cacher.go:425] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input; reinitializing...
<node lan ip>: 2022-08-07T07:53:31.101259535Z stderr F W0807 07:53:31.100887       1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input
<node lan ip>: 2022-08-07T07:53:31.101636779Z stderr F E0807 07:53:31.101023       1 cacher.go:425] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input; reinitializing...
<node lan ip>: 2022-08-07T07:53:32.11009708Z stderr F W0807 07:53:32.109723       1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input
<node lan ip>: 2022-08-07T07:53:32.110251285Z stderr F E0807 07:53:32.110058       1 cacher.go:425] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input; reinitializing...
<node lan ip>: 2022-08-07T07:53:32.28315371Z stderr F I0807 07:53:32.282758       1 dynamic_cafile_content.go:157] "Starting controller" name="request-header::/system/secrets/kubernetes/kube-apiserver/aggregator-ca.crt"
<node lan ip>: 2022-08-07T07:53:32.283384878Z stderr F I0807 07:53:32.282759       1 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/system/secrets/kubernetes/kube-apiserver/ca.crt"
<node lan ip>: 2022-08-07T07:53:32.28367264Z stderr F I0807 07:53:32.283444       1 dynamic_serving_content.go:132] "Starting controller" name="serving-cert::/system/secrets/kubernetes/kube-apiserver/apiserver.crt::/system/secrets/kubernetes/kube-apiserver/apiserver.key"
<node lan ip>: 2022-08-07T07:53:32.284444256Z stderr F I0807 07:53:32.284262       1 secure_serving.go:210] Serving securely on [::]:6443
<node lan ip>: 2022-08-07T07:53:32.284527183Z stderr F I0807 07:53:32.284414       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
<node lan ip>: 2022-08-07T07:53:32.300401096Z stderr F I0807 07:53:32.300118       1 apiservice_controller.go:97] Starting APIServiceRegistrationController
<node lan ip>: 2022-08-07T07:53:32.300527283Z stderr F I0807 07:53:32.300183       1 cache.go:32] Waiting for caches to sync for APIServiceRegistrationController controller
<node lan ip>: 2022-08-07T07:53:32.30057595Z stderr F I0807 07:53:32.300322       1 apf_controller.go:317] Starting API Priority and Fairness config controller
<node lan ip>: 2022-08-07T07:53:35.491976684Z stderr F I0807 07:53:35.491361       1 controller.go:83] Starting OpenAPI AggregationController
<node lan ip>: 2022-08-07T07:53:35.492467114Z stderr F I0807 07:53:35.491778       1 customresource_discovery_controller.go:209] Starting DiscoveryController
<node lan ip>: 2022-08-07T07:53:35.516713236Z stderr F I0807 07:53:35.515508       1 dynamic_serving_content.go:132] "Starting controller" name="aggregator-proxy-cert::/system/secrets/kubernetes/kube-apiserver/front-proxy-client.crt::/system/secrets/kubernetes/kube-apiserver/front-proxy-client.key"
<node lan ip>: 2022-08-07T07:53:35.523877659Z stderr F E0807 07:53:35.522684       1 authentication.go:63] "Unable to authenticate the request" err="invalid bearer token"
<node lan ip>: 2022-08-07T07:53:35.524209883Z stderr F I0807 07:53:35.523292       1 available_controller.go:491] Starting AvailableConditionController
<node lan ip>: 2022-08-07T07:53:35.524258902Z stderr F I0807 07:53:35.523403       1 cache.go:32] Waiting for caches to sync for AvailableConditionController controller
<node lan ip>: 2022-08-07T07:53:35.531061063Z stderr F I0807 07:53:35.530547       1 autoregister_controller.go:141] Starting autoregister controller
<node lan ip>: 2022-08-07T07:53:35.531200083Z stderr F I0807 07:53:35.530782       1 cache.go:32] Waiting for caches to sync for autoregister controller
<node lan ip>: 2022-08-07T07:53:35.531251842Z stderr F I0807 07:53:35.531107       1 controller.go:80] Starting OpenAPI V3 AggregationController
<node lan ip>: 2022-08-07T07:53:35.536671623Z stderr F I0807 07:53:35.536302       1 cluster_authentication_trust_controller.go:440] Starting cluster_authentication_trust_controller controller
<node lan ip>: 2022-08-07T07:53:35.558643376Z stderr F I0807 07:53:35.558207       1 shared_informer.go:255] Waiting for caches to sync for cluster_authentication_trust_controller
<node lan ip>: 2022-08-07T07:53:35.564877903Z stderr F W0807 07:53:35.564465       1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input
<node lan ip>: 2022-08-07T07:53:35.564981941Z stderr F E0807 07:53:35.564553       1 cacher.go:425] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input; reinitializing...
<node lan ip>: 2022-08-07T07:53:35.659126019Z stderr F I0807 07:53:35.658798       1 shared_informer.go:262] Caches are synced for cluster_authentication_trust_controller
<node lan ip>: 2022-08-07T07:53:35.775583557Z stderr F W0807 07:53:35.775282       1 reflector.go:324] vendor/k8s.io/client-go/informers/factory.go:134: failed to list *v1.Secret: Internal error occurred: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input
<node lan ip>: 2022-08-07T07:53:35.775665372Z stderr F E0807 07:53:35.775399       1 reflector.go:138] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Secret: failed to list *v1.Secret: Internal error occurred: unable to transform key "/registry/secrets/kube-system/bootstrap-token-yvhv5k": invalid padding on input
<node lan ip>: 2022-08-07T07:53:35.795214256Z stderr F I0807 07:53:35.795014       1 shared_informer.go:262] Caches are synced for node_authorizer
<node lan ip>: 2022-08-07T07:53:37.153734879Z stderr F I0807 07:53:37.153471       1 cache.go:39] Caches are synced for AvailableConditionController controller
<node lan ip>: 2022-08-07T07:53:37.155517559Z stderr F I0807 07:53:37.155219       1 controller.go:132] OpenAPI AggregationController: action for item k8s_internal_local_delegation_chain_0000000000: Nothing (removed from the queue).
<node lan ip>: 2022-08-07T07:53:37.159149104Z stderr F I0807 07:53:37.157411       1 apf_controller.go:322] Running API Priority and Fairness config worker
<node lan ip>: 2022-08-07T07:53:37.15921866Z stderr F I0807 07:53:37.157713       1 controller.go:85] Starting OpenAPI controller
<node lan ip>: 2022-08-07T07:53:37.159230271Z stderr F I0807 07:53:37.158034       1 controller.go:85] Starting OpenAPI V3 controller
<node lan ip>: 2022-08-07T07:53:37.159240197Z stderr F I0807 07:53:37.158143       1 naming_controller.go:291] Starting NamingConditionController
<node lan ip>: 2022-08-07T07:53:37.206799173Z stderr F I0807 07:53:37.206378       1 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/system/secrets/kubernetes/kube-apiserver/ca.crt"
<node lan ip>: 2022-08-07T07:53:37.226547224Z stderr F I0807 07:53:37.226268       1 dynamic_cafile_content.go:157] "Starting controller" name="request-header::/system/secrets/kubernetes/kube-apiserver/aggregator-ca.crt"
<node lan ip>: 2022-08-07T07:53:37.228822889Z stderr F I0807 07:53:37.228589       1 cache.go:39] Caches are synced for autoregister controller
<node lan ip>: 2022-08-07T07:53:40.310974153Z stderr F I0807 07:53:40.310669       1 trace.go:205] Trace[611352892]: "GuaranteedUpdate etcd3" type:*core.RangeAllocation (07-Aug-2022 07:53:37.235) (total time: 3074ms):
<node lan ip>: 2022-08-07T07:53:40.311083858Z stderr F Trace[611352892]: ---"initial value restored" 3074ms (07:53:40.310)
<node lan ip>: 2022-08-07T07:53:40.31110158Z stderr F Trace[611352892]: [3.074968194s] [3.074968194s] END

During the talosctl apply-config stage itself, if I set the control plane IP to localhost, the node becomes unreachable via talosctl. Maybe this is expected behaviour.
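
For reference, the endpoint being edited here lives under cluster.controlPlane in the machine config (a minimal fragment; the value is a placeholder). Node components reach the API server through this endpoint, which would explain why switching it between the LAN IP and localhost changes both pod scheduling and reachability on a single-node control plane:

cluster:
  controlPlane:
    endpoint: https://<node lan ip>:6443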