Interested in supporting provisioning of single-master clusters via kubeadm?

emmanuel commented 6 years ago

Feature Request

First, let me say that I learned quite a bit from Typhoon and I appreciate your effort in creating, documenting, and maintaining it. I consider it to be one of the most refined and production-ready Kubernetes provisioning tools available (of which there are many!). Specifically, its clean separation of provisioning logic from infrastructure state enables comfortably managing a variety of long-lived clusters with safe control of feature-rollout to each.

Second, this is more of an inquiry (and an offer) than a request.

I have recently implemented a Typhoon-inspired set of Terraform modules for kubeadm-based provisioning. I've got GCP working, and AWS on the way. I have used Ubuntu as the OS, but I believe it should be (fairly) straightforward to port to Container Linux (and presumably Fedora Atomic, though I'm not yet familiar with it).

Would you be interested in a PR to Typhoon to implement kubeadm-based provisioning as an alternative strategy to the current bootkube approach?

This would be for single-master clusters, as kubeadm HA support is experimental (I haven't touched it).

Feature

With Typhoon, bootkube can be used to provision single-master (non-HA) clusters, in principle. However, as far as I understand things, in practice reconfiguring a self-hosted control plane on a single master is effectively not possible (at least not via the API, though perhaps I'm mistaken here).

By contrast, kubeadm-based clusters sacrifice some of the flexibility (and elegance) of self-hosting in favor of simplicity. Reconfiguration of the control plane consists of SSHing into the master and updating the control plane manifests in /etc/kubernetes/manifests, which relies on the local kubelet manifest support and avoids several of the (potentially complex) interactions of a self-hosted control plane.
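
To illustrate the kind of manifest edit described above, here is a minimal Python sketch. It assumes the container command from a static pod manifest such as /etc/kubernetes/manifests/kube-apiserver.yaml has already been loaded into a list of strings; `set_flag` is a hypothetical helper, and the flag values are examples only. Once the edited manifest is written back, the kubelet picks up the change in its static pod directory and recreates the pod.

```python
from typing import List


def set_flag(command: List[str], name: str, value: str) -> List[str]:
    """Return a copy of `command` with `--name=value` set.

    An existing `--name=...` argument is replaced in place (duplicates
    are dropped); if the flag is absent it is appended at the end.
    """
    prefix = f"--{name}="
    replacement = prefix + value
    result: List[str] = []
    found = False
    for arg in command:
        if arg.startswith(prefix):
            if not found:
                result.append(replacement)
                found = True
        else:
            result.append(arg)
    if not found:
        result.append(replacement)
    return result


# Hypothetical excerpt of the apiserver container command (values are examples).
command = [
    "kube-apiserver",
    "--secure-port=6443",
    "--audit-log-maxage=7",
]

updated = set_flag(command, "audit-log-maxage", "30")
updated = set_flag(updated, "audit-log-maxbackup", "10")
```

The helper returns a new list rather than mutating its input, so the original command stays available for a diff or a rollback before anything is written to disk.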

Tradeoffs

The obvious downside to this feature is the increased project surface area, which would be significant. This would be especially acute if support is provided across the matrix of cloud providers and OS distros. Maintenance would thus be the driving consideration (far more than initial implementation effort). I would not advocate this option if there is not already demand for single-master clusters.

The upside would be the expansion of the use-cases that Typhoon can address. Given the exclusive use of bootkube for the control plane, I consider Typhoon HA-only. This is a reasonable choice for a variety of circumstances, but there are many valid cases where a single-master control plane is desirable, and I believe kubeadm is a better fit than bootkube for those cases.

Please let me know what you think. I'd be happy to contribute what I have if you are interested.

dghubble commented 6 years ago

To clear up any misconceptions, single-master self-hosted clusters are quite reasonable and well-supported. They're not some theoretical thing. There are likely more single-master clusters deployed than multi-master clusters. And rightly so: for many real use-cases and clusters under 10 nodes, single master can be a better choice. You can power cycle the master or update the control plane if you wish. Of course, the API server is unavailable while powered off, but in single-master cases you're cool with that. Checkpointing restores the control plane from scratch on fresh boots, and multiple replicas of the controller manager and scheduler are run so edits can be done in place.

Typhoon clusters already provide single-master and multi-master. I'm glad folks try various tools and thanks for your work, but this wouldn't provide value that's not already supported.

dghubble commented 6 years ago

One followup. Typhoon recommends blue/green cluster replacement for logistical reasons (it's just easier, cleaner, and safer to make a new cluster, and it takes mere minutes). However, self-hosted clusters were originally designed to support in-place edits of control plane manifests if you're willing to apply those changes. Single-master clusters too. We just choose to advise against it. https://typhoon.psdn.io/topics/maintenance/#in-place-edits

dghubble commented 6 years ago

@emmanuel I do really appreciate you laying out your thought process and tradeoffs. Hope your adventures on Ubuntu go well.

emmanuel commented 6 years ago

@dghubble thanks for the feedback.

As a practical matter, do in-place control plane configuration updates work with self-hosting?

I’m specifically wondering about apiserver config changes because I’ve experienced problems with in-place apiserver updates of HA (3- and 5-node) self-hosted clusters. The controller managers became pinned to a specific apiserver replica (because of the kube-proxy-managed iptables rules when accessing via kubernetes.default.svc.cluster.local), and the control plane would sometimes get wedged during updates.

On the other hand, if in-place updates are not recommended, then what is the benefit of self-hosting? In my mind, self-hosting introduces complexity that pays for itself by enabling/easing in-place updates. Or is it more the case that in-place updates are not recommended for Typhoon users because there are inherent risks and the potential support burden is too high?

emmanuel commented 6 years ago

Also, for the record, my heart is with Container Linux. I am working with Ubuntu because of external constraints.

That said, thanks for the good wishes @dghubble, and thanks again for Typhoon.
