gravitational / planet

Installable Kubernetes delivered in containers
Apache License 2.0
51 stars 18 forks source link

Implement Kubernetes HA mode #821

Closed bernardjkim closed 3 years ago

bernardjkim commented 3 years ago

Description

This PR enables Kubernetes HA mode.

A --high-availability boolean flag has been added to the planet start and agent commands. Default value will be read from container-environment variable KUBE_HIGH_AVAILABILITY.

If planet is running in HA mode, Kubernetes control plane components(apiserver, controller-manager, scheduler) will run on all master nodes vs running only on the elected leader.

a-palchikov commented 3 years ago

One thing I did not notice is how the failed api server handled on the leader node (according to agent leader status)? Will it cause failover of the agent leader and updates to coredns.hosts file?

bernardjkim commented 3 years ago

@a-palchikov If I understood your question correctly, there doesn't seem to be any failover in place if the apiserver fails. Would it make sense to watch the apiserver on the leader and run re-election in case the apiserver fails?

a-palchikov commented 3 years ago

@bernardjkim Yes, I see this as a possible weak link. In case the apiserver fails on the leading agent node in HA mode, there will be no DNS update as long as the leader stays the same.

bernardjkim commented 3 years ago

I'll go ahead and merge this PR for now. I'll work on implementing apiserver failover in a separate PR.