Open bbigras opened 2 hours ago
You need to dig into that further to understand what's wrong; it might be unrelated to NixOS.
Fetch the kubeconfig using `talosctl kubeconfig`, and after that run `kubectl get pods` and figure out why CoreDNS is not ready. Nothing Talos-specific here, just regular Kubernetes troubleshooting.
You can access the Kubernetes API as soon as the `waiting for all k8s nodes to report ready: OK` check completes (you can ^C the `cluster create`; at that point it just runs the health check and doesn't do anything else).
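The steps above could be sketched roughly as follows. This is only an illustration: `debug_coredns` is a made-up helper name, it assumes `talosctl` and `kubectl` are on your PATH and that your talosconfig already points at the cluster, and the `k8s-app=kube-dns` label is the usual CoreDNS label rather than something confirmed in this thread.

```shell
# Hypothetical helper; nothing runs until you call debug_coredns.
debug_coredns() {
  # Merge the cluster's kubeconfig into the default kubeconfig.
  talosctl kubeconfig || return 1
  # List system pods; a CoreDNS pod that is not ready shows 0/1 or Pending.
  kubectl get pods -n kube-system -o wide
  # Print details and recent events for the CoreDNS pods
  # (assumes the usual k8s-app=kube-dns label).
  kubectl describe pods -n kube-system -l k8s-app=kube-dns
}
```

The Events section at the end of the `describe` output is where a scheduling failure would normally show up.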
Warning FailedScheduling 2m37s default-scheduler 0/2 nodes are available: 2 node(s) had untolerated taint {node.kubernetes.io/disk-pressure: }. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
This is a mix of the way Kubernetes works and the way Docker works.
In the end, Kubernetes running in Docker sees the free disk space of the host partition that holds the Docker directory. So if that partition is low on disk space overall, Kubernetes stops scheduling pods. (Because the host partition is checked for percentage free, it may demand far more free space than the cluster actually needs.)
Make sure you have enough space, and you should be good!
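A quick way to check this might look like the sketch below. `percent_free` and `check_docker_disk` are hypothetical helper names, the fallback path `/var/lib/docker` is only the common default, and the 15% threshold is an assumption based on the kubelet's documented default eviction thresholds (10% for nodefs, 15% for imagefs); your cluster may be configured differently. The number is approximate, since it is derived from df's capacity column.

```shell
# Print the approximate percentage of free space on the filesystem holding $1.
percent_free() {
  # df -P prints POSIX-format output; column 5 of line 2 is capacity used (e.g. "83%").
  used=$(df -P "$1" 2>/dev/null | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
  [ -n "$used" ] || return 1
  echo $((100 - used))
}

# Report free space for the Docker data directory (or the path given as $1).
check_docker_disk() {
  # Ask Docker for its data directory unless a path was passed in.
  dir=${1:-$(docker info --format '{{ .DockerRootDir }}' 2>/dev/null)}
  dir=${dir:-/var/lib/docker}   # assumed default if docker is unavailable
  threshold=${2:-15}            # assumed threshold, see the note above
  free=$(percent_free "$dir") || { echo "cannot check $dir"; return 1; }
  if [ "$free" -lt "$threshold" ]; then
    echo "$dir: ${free}% free, below ${threshold}% (disk-pressure likely)"
  else
    echo "$dir: ${free}% free"
  fi
}
```

Running `check_docker_disk` with no arguments checks Docker's own directory; `kubectl describe nodes` lists the current taints, so you can confirm whether `node.kubernetes.io/disk-pressure` clears after you free up space.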
Bug Report
Description
I don't seem to be able to create a cluster with `talosctl cluster create` reliably. Sometimes it works after a reboot, but maybe it's just random. It seems to fail before the default 20m0s timeout from `--wait-timeout duration`.
Logs
Environment