Closed HomayoonAlimohammadi closed 2 months ago
Right, I didn't think about that. What if we run the csrsigning controller in the `onBootstrap` hook? @neoaggelos
Perhaps an alternative would be to block the csrsigning controller from starting until we can validate that we have a proper kubeconfig in place.
For example, we could return the rest config from here https://github.com/canonical/k8s-snap/blob/2994b879eab39b782556f122a2c8512bf85e9aca/src/k8s/pkg/k8sd/controllers/csrsigning/controller.go#L46 only if we can successfully call an endpoint (e.g. `/readyz` or a `GetNode()`), otherwise keep looping.
e.g. here is what we do on bootstrap (without the loop, as we don't want to keep using stale kubeconfigs) https://github.com/canonical/k8s-snap/blob/db5015ec88663f107e40accd0d36ed7c082991f1/src/k8s/pkg/client/kubernetes/status.go#L17-L19
Adding a centralized wait didn't seem to be possible: `onStart` has to finish before the node can be marked as ready, so we couldn't wait there. I think adding the k8s check in `markNodeReady` is the best way to go.
Summary
The `onStart` hook happens before `onBootstrap`. Because of this, on non-fresh machines (non-fresh == `/etc/kubernetes/admin.conf` is available) we use an invalid/old `admin.conf` kubeconfig for the csrsigning controller client. This PR makes sure that we prevent running controllers until we have the correct `.conf` files and we can reach the k8s cluster.
How to test
Build and install k8s on a fresh machine, run `bootstrap`, then check the logs and confirm the csrsigning controller is running. Also confirm that there are kubeconfigs available in `/etc/kubernetes/`, specifically `admin.conf`. Now remove the k8s snap and reinstall k8s (same snap). Run `bootstrap` and, as above, confirm that the csrsigning controller is started and running.