xetys / hetzner-kube

A CLI tool for provisioning kubernetes clusters on Hetzner Cloud
Apache License 2.0

Scheduling priorities & self-hosted kubernetes #69

Open Baughn opened 6 years ago

Baughn commented 6 years ago

Here's a puzzler: What happens if, for whatever reason, kube-dns fails to schedule due to overall lack of CPU?

If your answer is "everything breaks", you'd be right. :)

There's a solution hinted at in https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/, namely enabling priorities and marking all the kube-system pods as critical, especially the ones required for the cluster to keep working. I don't think there's a way to mark an entire namespace as high-priority, but certainly these particular pods should be at maximum priority.
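To make that concrete, here is a rough sketch of what marking a pod as critical could look like. It assumes the cluster ships the built-in `system-cluster-critical` PriorityClass and that the manifest is handled as a plain dict. The `mark_critical` helper and the trimmed-down kube-dns manifest are hypothetical illustrations, not hetzner-kube code.

```python
import copy

# Built-in PriorityClass name; assumes a cluster version that provides it.
CRITICAL_CLASS = "system-cluster-critical"


def mark_critical(manifest, priority_class=CRITICAL_CLASS):
    """Return a copy of a Deployment/DaemonSet manifest whose pod template
    requests the given PriorityClass. The input manifest is not modified."""
    patched = copy.deepcopy(manifest)
    pod_spec = patched["spec"]["template"]["spec"]
    pod_spec["priorityClassName"] = priority_class
    return patched


# Hypothetical, heavily trimmed kube-dns Deployment, for illustration only.
kube_dns = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "kube-dns", "namespace": "kube-system"},
    "spec": {"template": {"spec": {"containers": [{"name": "kubedns"}]}}},
}

patched = mark_critical(kube_dns)
print(patched["spec"]["template"]["spec"]["priorityClassName"])
```

The same field would have to be set on every kube-system workload that the cluster cannot live without, since priority is assigned per pod rather than per namespace.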

pierreozoux commented 6 years ago

Where and how is it scheduled? I don't see it in the code.

Indeed, kube-dns is a tricky part; I've already had some trouble with it in production (mainly latency...). For instance, OpenShift does special tricks: they run kube-dns as a DaemonSet and configure the pods to use the local DNS. We could make sure it has the right priority and is replicated.

But I'd say, let's not over-engineer it. Let's have sane defaults for beginners; specialists will always have their own ways of doing things.

Baughn commented 6 years ago

Then, from my beginner's perspective:

While doing something totally unrelated, and presumably due to rolling machine reboots, I had every instance of kube-dns de-schedule. Subsequently the cluster seemed unrecoverable, or at least would have been difficult enough to recover that I decided I might as well delete and rebuild it, since I didn't have anything important there yet.

I really think these particular services should be hard to kill.

pierreozoux commented 6 years ago

As recommended here:

https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/dns/kube-dns.yaml.sed

We could scale this to 2 replicas and add autoscaling; I think this would be a sane default.
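For a feel of how such a default might behave, here is a minimal sketch of a linear scaling rule with a floor of 2 replicas. It is loosely modelled on the proportional DNS autoscaling idea; the function name, parameter names, and default ratios are assumptions chosen for illustration, not values taken from hetzner-kube or the linked manifest.

```python
import math


def linear_replicas(nodes, cores, nodes_per_replica=16, cores_per_replica=256,
                    min_replicas=2, prevent_single_point_failure=True):
    """Pick a DNS replica count proportional to cluster size.

    All defaults are illustrative assumptions, not project settings.
    """
    # Scale with whichever dimension (nodes or cores) demands more replicas.
    replicas = max(math.ceil(cores / cores_per_replica),
                   math.ceil(nodes / nodes_per_replica))
    # With more than one node, keep at least two replicas so that losing a
    # single node cannot take DNS down.
    if prevent_single_point_failure and nodes > 1:
        replicas = max(replicas, 2)
    # Never go below the configured floor.
    return max(replicas, min_replicas)


print(linear_replicas(nodes=3, cores=6))
print(linear_replicas(nodes=40, cores=80))
```

With these assumed ratios, a small 3-node cluster stays at the floor of 2 replicas, and the count only grows once the cluster has more than 32 nodes.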