From @zilman on September 8, 2016 15:47
So, another concern I didn't mention when we discussed this:
There's a window where nodes are marked 'Ready' but the Weave overlay might not be operational yet, either cluster-wide because consensus has yet to be achieved, or on a given node because the Weave DaemonSet hasn't been fully deployed there yet.
I think at the very least there should be a Service defined with a readiness check that only flips on once the overlay is ready. Otherwise a user might end up deploying things onto a cluster in an incoherent state.
P.S. - The ideal would be of course to hook into the node lifecycle.
From @bboreham on September 8, 2016 15:53
OK; that's really a separate issue. I don't think the state can be temporarily incoherent, but it can be that the network is not ready yet, and k8s will retry periodically for all pods that are supposed to be attached.
A readiness check for "ipam ready" should be straightforward to add.
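For illustration, such a check could be wired into the weave container as a Kubernetes `readinessProbe` against Weave Net's local status endpoint. This is a minimal sketch of the idea; the port, path, and timing values are assumptions, not the shipped weave-kube manifest.

```yaml
# Hypothetical fragment of the weave-kube DaemonSet pod spec:
# mark the weave container Ready only once its local status
# endpoint (assumed here to reflect "ipam ready") responds.
containers:
  - name: weave
    image: weaveworks/weave-kube
    readinessProbe:
      httpGet:
        host: 127.0.0.1
        path: /status          # assumed to report IPAM readiness
        port: 6784
      initialDelaySeconds: 5
      periodSeconds: 10
```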
From @zilman on September 8, 2016 16:12
Cluster of initial size 10.
Nodes come online, become 'Ready', all receive the Weave DaemonSet, quorum is reached, the "ipam ready" check passes, the Weave Service is ready. Conditional on that we can now deploy things into the cluster; great, better than before.
We go up to 11, as it were. Node 11 becomes 'Ready' and receives a pod, Job, or anything else before the Weave DaemonSet lands on it. The pod starts without networking being ready, and unexpected behavior ensues.
Shortly thereafter networking will start working on that node (and hopefully for the pod), but what happens in the meantime? (Yes, most things I can think of would be resilient to that, at worst failing and retrying, but it seems brittle.)
Also, what about things that were not written with our Weave service in mind and don't know to rely on that readiness check?
From @bboreham on September 9, 2016 8:44
> Pod starts without networking being ready
This can't happen. If the Weave DaemonSet hasn't installed the CNI config file yet, Kubelet will refuse to create the pod.
If it has, Kubelet fires the Weave CNI plugin; either it succeeds, or it waits, or it fails. If it fails, Kubelet will destroy the pod and try again later.
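For context, the CNI config file referred to here is a small JSON file that the DaemonSet drops into the kubelet's CNI config directory; until such a file exists, the kubelet treats the network as not ready. The path and field values below are illustrative assumptions, not the exact file weave-kube writes.

```json
{
  "name": "weave",
  "type": "weave-net"
}
```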
From @zilman on September 9, 2016 9:29
Ah, aces! That's exactly how it should be; I didn't know that plumbing was there.
From @bboreham on September 8, 2016 11:10
Currently we have a bit of a risk of cliques forming at startup.
Suppose you fire up 10 nodes A, B, ... J. When A runs `kube-peers` it gets back A, B; when J runs `kube-peers` it gets back all 10. It is possible (though rare) for A to form consensus with B, and for J to form consensus with E, F, G, H and I.
If the user can configure their expectation that there will be 10 nodes, we can pass this through and avoid the A, B clique.
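One way to pass that expectation through would be an environment variable on the weave container that ends up as an initial-peer-count hint to IPAM consensus. The variable name and value below are assumptions sketching the idea, not a confirmed weave-kube interface.

```yaml
# Hypothetical: tell Weave's IPAM to seek consensus among an expected
# 10 peers instead of settling for whichever subset it sees first,
# avoiding the A, B clique described above.
containers:
  - name: weave
    image: weaveworks/weave-kube
    env:
      - name: IPALLOC_INIT
        value: "consensus=10"   # assumed to map to an initial peer count
```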
Copied from original issue: weaveworks/weave-kube#2