travis-ci / kubernetes-config

Travis services running on Kubernetes!

Can we use a single Kubernetes cluster for both MacStadium pods? #24

Closed: mjm closed this issue 5 years ago

mjm commented 5 years ago

Right now, we only have a Kubernetes cluster running in the pod-1 datacenter for MacStadium. Eventually, we will want to bring up workers for pod-2, and when we do, we need to think about where those workers will run.

One option, the one I had been assuming until now, is to run two clusters: one in pod-1 and one in pod-2. That more closely mirrors what we've been doing pre-Kubernetes. Running a cluster adds maintenance burden, though, and we already have two: production and staging.

The other option would be to run a single cluster for production, with nodes spread across both datacenters. This should work: the internal network is shared between both datacenters, so the cluster nodes should be able to communicate.

It would still be important to have separate worker/jupiter-brain deployments for each pod: worker needs to run on a node attached to the right Jobs network, and jupiter-brain needs to talk to the right vSphere instance. We could manage this by putting a label on the nodes in each datacenter and configuring the deployments to only run on nodes with that label (see the sketch below).
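Roughly what that could look like, assuming a hypothetical node label such as `travis.mac/pod` (the label name, deployment name, and image reference here are all placeholders):

```yaml
# Each node gets labeled with the datacenter it lives in, e.g.
#   kubectl label node <node-name> travis.mac/pod=pod-1
# Then the pod-1 worker deployment is pinned to those nodes:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker-pod-1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: worker
      pod: pod-1
  template:
    metadata:
      labels:
        app: worker
        pod: pod-1
    spec:
      nodeSelector:
        travis.mac/pod: pod-1   # only schedule onto pod-1 nodes
      containers:
        - name: worker
          image: travisci/worker   # illustrative image reference
```

A second deployment (`worker-pod-2`) would be identical except for the label value, and the same pattern would apply to jupiter-brain.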

Right now, the cluster has a single master node. Technically, it still could, but the cluster would be more fault-tolerant if it had multiple masters: specifically, at least three. Kubernetes uses etcd for data storage, and the cluster's availability is bounded by etcd's. etcd is only highly available with at least three members, and an odd number is recommended. As long as a majority of the masters (half + 1) are online, etcd can reach consensus and the cluster keeps functioning as normal. So three masters would let us lose a single master without missing a beat.
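For reference, this is the standard etcd quorum arithmetic (quorum = floor(n/2) + 1):

| Masters | Quorum | Failures tolerated |
| --- | --- | --- |
| 1 | 1 | 0 |
| 3 | 2 | 1 |
| 5 | 3 | 2 |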

Unfortunately, with only two pods there's no way to distribute the masters such that losing an entire datacenter isn't disruptive to the cluster. One pod will always hold the majority of the masters, so losing that pod all at once would break consensus. That wouldn't stop workers from running on the remaining nodes, but it would prevent making changes to the cluster until the lost masters were restored. The only way to solve this would be to add a third datacenter, which seems excessive.

Even so, a benefit of having multiple masters is the ability to bring one down for system upgrades without disruption.

All of this may not be super relevant at the moment: kubeadm's support for multiple masters in a cluster is in alpha right now, so we probably don't want to try to use it until it's a little more stable.

mjm commented 5 years ago

With the release of Kubernetes 1.13, I no longer see the alpha labeling in the docs around setting up multi-master clusters with kubeadm. It's still described as experimental, but I don't think it's off the table to do such a setup now.

mjm commented 5 years ago

One thing to consider here is that to have multiple masters, we will need a load balancer to spread traffic between them and to provide a single address to connect to. This could be something as simple as an HAProxy instance, but that instance has to be hosted somewhere within our MacStadium network to be able to route the traffic, so it still means a single point of failure.
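For context, kubeadm lets all masters be initialized and joined behind that one load-balanced address via `controlPlaneEndpoint`. A minimal sketch, assuming a placeholder hostname for the load balancer and the config API version that was current around Kubernetes 1.13:

```yaml
# Sketch of a kubeadm ClusterConfiguration pointing the control plane
# at a single load-balanced endpoint (hostname below is a placeholder).
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.13.0
controlPlaneEndpoint: "k8s-api.macstadium.internal:6443"
```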

mjm commented 5 years ago

Also wanted to add a link to the Kubernetes docs on highly-available clusters with kubeadm.