siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0

Built-in load balancer #5134

Open · SixFive7 opened 2 years ago

SixFive7 commented 2 years ago

Feature Request

Description

Load balancers are ubiquitous in cloud environments, but in on-premises setups they are neither standardized nor automated. Having Talos handle this requirement internally would therefore relieve a significant burden when setting up on-premises clusters with Talos. See #2711 for a partial rationale.

Current solutions

Currently (v0.14) the available solutions in Talos are:

Suggested additional solution

A better solution would be to add Talos native load balancing on the node side. This would:

Possible implementation

Possible usage

Instead of running

```
talosctl gen config <cluster name> <cluster endpoint>
```

one would specify three cluster endpoints:

```
talosctl gen config <cluster name> <endpoint1>,<endpoint2>,<endpoint3>
```

This would result in the actual `cluster.controlPlane.endpoint` being set to `https://127.0.0.1:<some port>`, with a native local load balancer behind it distributing requests across all three actual endpoints. All three endpoints would of course still be DNS names, so that no worker config would need to change if a control plane node ever changes its IP.
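As a sketch of the idea (this is not an existing Talos feature; the port number and comments are illustrative assumptions), the generated machine config might end up looking like:

```yaml
# Hypothetical sketch: the endpoint points at a node-local balancer.
cluster:
  controlPlane:
    endpoint: https://127.0.0.1:7445   # port is an assumption, not a real default
# The local balancer would forward to the real control plane endpoints:
#   https://kube1.mycluster.mydomain.com:6443
#   https://kube2.mycluster.mydomain.com:6443
#   https://kube3.mycluster.mydomain.com:6443
```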

Ecosystem

Unsurprisingly, more people have run into this speed bump when setting up an on-premises cluster. In fact, there is a large upstream thread on exactly this (don't require a load balancer between cluster and control plane and still be HA) and some partial fixes, such as KEP-3037: client-go alternative services. However, as there are many moving parts, it will probably take a long time for Kubernetes to natively support multiple API endpoints, and even more years before all components of all CNIs support it. Hence it would be better to build it into Talos now and, as a feature of Talos, remove the load balancer requirement.

As an example solution there is Rancher's RKE implementation. Their solution by default:

Some related thoughts

Current workaround

For anybody who wants a similar solution right now you can:

<details>
<summary>Click to expand!</summary>

### Instructions

- Determine your API server FQDN, but set a different port than 6443. For example `https://kube.mycluster.mydomain.com:6444`.
- Add this FQDN to the hosts file:

  ```yaml
  machine:
    network:
      extraHostEntries:
        - ip: 127.0.0.1
          aliases:
            - kube.mycluster.mydomain.com
  ```

  This way it always resolves to localhost on the worker and control plane nodes, but not on any external clients that resolve that FQDN via DNS.
- Add a static HAProxy pod to run a mini load balancer on every node:

  ```yaml
  machine:
    files:
      - path: /etc/kubernetes/manifests/kubernetes-api-haproxy.yaml
        permissions: 0o666
        op: create
        content: |
          apiVersion: v1
          kind: Pod
          metadata:
            name: kubernetes-api-haproxy
            namespace: kube-system
          spec:
            hostNetwork: true
            containers:
              - name: kubernetes-api-haproxy
                image: haproxy
                livenessProbe:
                  httpGet:
                    host: localhost
                    path: /livez
                    port: 6445
                    scheme: HTTP
                volumeMounts:
                  - name: kubernetes-api-haproxy-config
                    mountPath: /usr/local/etc/haproxy/haproxy.cfg
                    readOnly: true
            volumes:
              - name: kubernetes-api-haproxy-config
                hostPath:
                  path: /etc/kubernetes/manifests/haproxy.cfg
                  type: File
      - path: /etc/kubernetes/manifests/haproxy.cfg
        permissions: 0o666
        op: create
        content: |
          global
            log stdout format raw daemon
          defaults
            log global
            option tcplog
            option http-keep-alive
            timeout connect 3s
            timeout client 1h
            timeout server 1h
            timeout tunnel 1h
            timeout client-fin 1m
            timeout server-fin 1m
            retries 1
            email-alert mailers mailservers
            email-alert from kube1-kubernetes-api-haproxy@your.mailserver.com
            email-alert to kube1-kubernetes-api-haproxy@your.mailserver.com
            email-alert level notice
          mailers mailservers
            mailer yourdomaintld your.mailserver.com:25
          frontend kube
            mode tcp
            bind :6444
            default_backend kubes
          backend kubes
            mode tcp
            balance roundrobin
            option httpchk GET /readyz
            http-check expect status 200
            default-server verify none check check-ssl inter 2s fall 2 rise 2
            server kube1 kube1.mycluster.mydomain.com:6443
            server kube2 kube2.mycluster.mydomain.com:6443
            server kube3 kube3.mycluster.mydomain.com:6443
          frontend stats
            mode http
            bind :6445
            monitor-uri /livez
            default_backend stats
          backend stats
            mode http
            stats refresh 5s
            stats show-node
            stats show-legends
            stats show-modules
            stats hide-version
            stats uri /
  ```

Note that:

- As of v0.15/v1.0 there is now a better way to add a static pod to the Talos config.
- The haproxy.cfg file is placed in the wrong directory, resulting in an (otherwise harmless) error message on boot.
- You could define a different email alert sender per node, or not.
- Check every node's port 6445 for its view on API availability.
- Do not forget to set the A records of the FQDN for external visitors to all three control plane nodes (or to a load balancer running on Kubernetes if you need HA from an external client as well).

</details>
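Per the note above about newer releases, recent Talos versions let you declare static pods directly in the machine config via `machine.pods` rather than writing a manifest file by hand. A hedged sketch (pod spec abbreviated; verify the field layout against the Talos machine config reference for your version):

```yaml
# Sketch only: static pod declared via machine.pods (Talos v1.x machine config).
machine:
  pods:
    - apiVersion: v1
      kind: Pod
      metadata:
        name: kubernetes-api-haproxy
        namespace: kube-system
      spec:
        hostNetwork: true
        containers:
          - name: kubernetes-api-haproxy
            image: haproxy
            # volumeMounts / livenessProbe omitted for brevity; see the
            # full manifest in the workaround above.
```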
smira commented 2 years ago

By the way, Talos runs the Kubernetes control plane components pointed at localhost:6443 for the API server endpoint, so they don't require the load balancer to be up.

edude03 commented 2 years ago

I'm interested in this as well. Mostly for HA control plane without an external LB.

SixFive7 commented 1 year ago

I just noticed the https://github.com/siderolabs/talos/releases/tag/v1.5.0-alpha.1 patch notes. Some exciting news in the Kubernetes API Server In-Cluster Load Balancer section. Looking forward to seeing how complete this built-in load balancer is going to be. Especially curious whether it can also function as a load balancer for external clients and whether monitoring can be included. Good to see Talos growing. Exciting times!

smira commented 1 year ago

> I just noticed the https://github.com/siderolabs/talos/releases/tag/v1.5.0-alpha.1 patch notes. Some exciting news in the Kubernetes API Server In-Cluster Load Balancer section. Looking forward to seeing how complete this built-in load balancer is going to be. Especially curious whether it can also function as a load balancer for external clients and whether monitoring can be included. Good to see Talos growing. Exciting times!

This feature is in-cluster exclusively: it makes sure the cluster can keep running even if the external load balancer is down (or it might prefer local traffic if the external load balancer has higher latency).

github-actions[bot] commented 2 months ago

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.

SixFive7 commented 2 months ago

I'd still be very much interested in a built-in load balancer to replace our bespoke HAProxy static pod setup.

smira commented 2 months ago

Talos supports KubePrism for internal load-balancing.
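For reference, KubePrism is enabled through the machine config; a minimal sketch (port 7445 is the value used in the Talos documentation, but check the reference for your release):

```yaml
# Enable KubePrism, the node-local API server load balancer (Talos v1.5+).
machine:
  features:
    kubePrism:
      enabled: true
      port: 7445   # local port the balancer listens on
```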

SixFive7 commented 2 months ago

Yes, and KubePrism is excellent. I thoroughly enjoyed reading about its motivation and design.

However, we are still left with all the tradeoffs around the load balancer for external Kubernetes API access. For us this means maintaining a bespoke HAProxy setup (implemented as static pods that run before the cluster is online) on an otherwise rather feature-complete Talos cluster.

smira commented 2 months ago

It just feels that external access is a much more complicated problem, and it shouldn't be solved this way. Talos Linux, and Kubernetes itself with KubePrism enabled, no longer depend on the load balancer.

External access can be provided by running load balancers outside the cluster, using simple round-robin DNS, using a direct endpoint of a control plane node, etc. There are many options, and all of them work equally well.
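The round-robin DNS option mentioned above could be sketched as a zone file fragment (the FQDN follows the workaround earlier in the thread; the IPs are documentation placeholders, not real addresses):

```
; Round-robin A records for the external API endpoint.
; Clients receive the records in rotating order, spreading connections
; across the three control plane nodes.
kube.mycluster.mydomain.com.  300  IN  A  192.0.2.11
kube.mycluster.mydomain.com.  300  IN  A  192.0.2.12
kube.mycluster.mydomain.com.  300  IN  A  192.0.2.13
```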

I don't quite see how running HAProxy on the machine itself helps here: if the machine goes down, access through it is no longer possible anyway.