metal3-io / cluster-api-provider-metal3

Metal³ integration with https://github.com/kubernetes-sigs/cluster-api
Apache License 2.0

Provide a solution for control-plane load-balancing #38

Closed maelk closed 4 years ago

maelk commented 5 years ago

This issue is related to https://github.com/metal3-io/cluster-api-provider-baremetal/issues/101. When provisioning a cluster, we need to set up a load-balancer for the control-plane, since we may not be able to rely on a cloud-provider load-balancer. In this issue, we will go through the different alternatives for setting up an HA cluster on baremetal nodes without an external load-balancer.

We will start with a design document that is open for discussion.

maelk commented 5 years ago

Here is a first draft document with the scope and goals [WIP] : https://docs.google.com/document/d/1LLywou05ak7df1VYQns27k70s6NgA8I-fILuJPDJZfI

russellb commented 5 years ago

We've done something for this in OpenShift. Roughly the idea is that we bring up a cluster with a self-hosted load balancer and allow configuration of an external load balancer post-install if desired.

The self-hosted load balancer solution for the API uses keepalived and haproxy. First, we require a virtual IP (VIP) to be allocated for this purpose. keepalived manages the VIP across the masters, and haproxy runs on all of the masters. API traffic goes to the host where the VIP currently resides, and haproxy redirects it to one of the active hosts.
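
As a rough illustration of the traffic flow described above, here is a toy Python model. It is not the OpenShift implementation: the host names, priorities, and function names are made up for the example, and it assumes that the VIP goes to the healthy master with the highest priority (as in a VRRP-style election) and that requests are spread round-robin over the healthy masters.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Master:
    name: str
    priority: int          # hypothetical keepalived-style priority
    healthy: bool = True   # whether the host (and its API server) is up


def vip_holder(masters):
    """Return the master that owns the VIP: the healthy one with the highest priority."""
    candidates = [m for m in masters if m.healthy]
    if not candidates:
        raise RuntimeError("no healthy master can hold the VIP")
    return max(candidates, key=lambda m: m.priority)


def route(masters, n_requests):
    """Return (vip_host, backend) pairs for n_requests sent to the VIP.

    Every request enters through the VIP holder; the proxy on that host
    then forwards it round-robin to one of the healthy masters.
    """
    holder = vip_holder(masters)
    backends = cycle([m.name for m in masters if m.healthy])
    return [(holder.name, next(backends)) for _ in range(n_requests)]


if __name__ == "__main__":
    masters = [Master("master-0", 150), Master("master-1", 100), Master("master-2", 50)]
    print(route(masters, 3))    # all requests enter via master-0

    masters[0].healthy = False  # simulate losing the current VIP holder
    print(route(masters, 3))    # the VIP moves to master-1; master-0 is no longer a backend
```

In this model, losing the VIP holder moves the VIP to the next master and removes the failed host from the backend pool, which is the behaviour the keepalived plus haproxy combination is meant to provide.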

russellb commented 5 years ago

@celebdor FYI - in case you'd like to add some more input here

celebdor commented 5 years ago

That is a very good description of what we do, @russellb. @maelk Let me know if you want to discuss that in more detail.

maelk commented 5 years ago

@celebdor Yes, I would be very interested in discussing it, to learn how you did it in detail and see whether it is applicable for us.

stbenjam commented 4 years ago

/label priority/backlog

russellb commented 4 years ago

/label kind/feature

russellb commented 4 years ago

/kind feature

russellb commented 4 years ago

/priority backlog

metal3-io-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues will close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle stale

maelk commented 4 years ago

/remove-lifecycle stale

metal3-io-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues will close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle stale

metal3-io-bot commented 4 years ago

Stale issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle stale.

/close

metal3-io-bot commented 4 years ago

@metal3-io-bot: Closing this issue.

In response to [this](https://github.com/metal3-io/cluster-api-provider-metal3/issues/38#issuecomment-713111653):

> Stale issues close after 30d of inactivity. Reopen the issue with `/reopen`. Mark the issue as fresh with `/remove-lifecycle stale`.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

ileixe commented 1 year ago

This is one of the missing pieces we need in order to migrate our manually managed cluster to cluster-api. I saw there is a way to set up a VIP using keepalived, but we need proper load balancing. (We've used kube-vip for this.)

> Here is a first draft document with the scope and goals [WIP] : https://docs.google.com/document/d/1LLywou05ak7df1VYQns27k70s6NgA8I-fILuJPDJZfI

This doc mentions the BGP-based load balancing that I want. Has there been any progress on that approach?