kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0

Separate client vs provider APIEndpoints fields #1197

Closed detiber closed 4 years ago

detiber commented 5 years ago

/kind feature

Describe the solution you'd like

Currently, both external consumers of Cluster API and internal providers use the existing APIEndpoints field in different ways. I'd like to propose that we separate these uses into at least two separate fields, so that each field has a single purpose and the semantics of how and when to consume each field are well known.

External consumers are looking for an endpoint to use to pass to client tooling or a client SDK, which only accepts a single endpoint. In most cases implementations here just take the head of the API Endpoints list. This is currently the behavior used by clusterctl and https://github.com/vmware/cluster-api-upgrade-tool for example.
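The "take the head of the list" behavior can be sketched as follows. This is an illustrative reconstruction, not the actual clusterctl code; the `APIEndpoint` struct and `firstEndpoint` helper are assumptions for the sketch.

```go
// Sketch of how a consumer like clusterctl derives a single endpoint today:
// it simply takes the head of the APIEndpoints slice, whatever the provider
// meant by it. Names here are illustrative, not the exact v1alpha1 API.
package main

import (
	"errors"
	"fmt"
)

type APIEndpoint struct {
	Host string
	Port int
}

// firstEndpoint mirrors the "take the head of the list" behavior described
// above; it cannot know whether endpoints[0] is a load balancer or just the
// first control plane machine a provider happened to record.
func firstEndpoint(endpoints []APIEndpoint) (string, error) {
	if len(endpoints) == 0 {
		return "", errors.New("no API endpoints recorded on cluster")
	}
	ep := endpoints[0]
	return fmt.Sprintf("https://%s:%d", ep.Host, ep.Port), nil
}

func main() {
	eps := []APIEndpoint{{Host: "10.0.0.5", Port: 6443}, {Host: "10.0.0.6", Port: 6443}}
	url, _ := firstEndpoint(eps)
	fmt.Println(url) // picks the first entry regardless of provider intent
}
```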

Internal provider use cases vary by provider. The AWS provider populates this value with the address of the Load Balancer created as part of Cluster instantiation, while some other providers use it to track endpoints for each Control Plane Machine, for populating a Load Balancer or VIP configuration that does not exist prior to Machines being created.

This dual use of the field causes quite a few potential pain points.

detiber commented 5 years ago

/cc @ncdc @moshloop @akutz

detiber commented 5 years ago

I'd also like to target this change for v1alpha2.

/assign @vincepri

detiber commented 5 years ago

/cc @davidewatson @dlipovetsky

moshloop commented 5 years ago

I think splitting them into Internal and External is a good option.

Client use can and should expect a stable IP/DNS name that will not change over the life of the Cluster.

Yes, they SHOULD, not MUST - We still need to cater for scenarios where it is not possible

Advertised and accepted addresses/names for an API server need to be known in advance of configuration so certificates can be generated properly.

With a client-side load balancer it is always localhost

In the case of multiple endpoints and rolling control plane upgrade scenarios, the cached set of API endpoints the client is using may not be valid part way through the upgrade, or the current endpoint may suddenly drop connection out from underneath the client.

If the APIEndpoints respect "cordoned" masters, then this shouldn't be an issue - When using a single endpoint, if that endpoint fails, or is upgraded then clients are left unable to connect at all

akutz commented 5 years ago

Without HA and a LB you can’t support this. At best you can only provide a deterministic means of selecting a control plane node, which CAPV selects as the oldest member of the control plane.

It doesn’t matter what MUST occur, but rather what the cluster and its operators design. If they design a cluster with no external HA and multiple control plane nodes, then it’s possible those nodes don’t stay static, and one is ejected, and if that’s the oldest then the cluster is unreachable.

moshloop commented 5 years ago

Without HA and a LB you can’t support this. At best you can only provide a deterministic means of selecting a control plane node, which CAPV selects as the oldest member of the control plane.

Which part are you referring to that can't be supported?

It doesn’t matter what MUST occur, but rather what the cluster and its operators design

How do you plan on designing this on-premise (without an ELB on VMC)? Most CAPV users will be on-premise, where a client-side load balancer makes the most sense.

detiber commented 5 years ago

@moshloop you keep mentioning a client-side load balancer. How would that work with external clients and integrations (such as Jenkins, etc)?

moshloop commented 5 years ago

DNS round-robin would be suitable for most external clients. If APIEndpoints are the source of truth, then different out of band controllers (client-side, dns, F5, etc..) can be created that work across CAPI providers.

A client-side load balancer could also be used for bootstrapping, and then pivot to a real load balancer once it is healthy, via removal of an iptables rule or /etc/hosts entry.

vincepri commented 5 years ago

/area api
/priority important-soon
/milestone v1alpha2

vincepri commented 5 years ago

Have we reached some consensus around this? I'd prefer to move to a single endpoint and have additional ones in a different slice.

moshloop commented 5 years ago

How about:

// APIEndpoints refers to all valid/healthy API Endpoints at their InternalIP address
APIEndpoints  []string

// ClientEndpoint refers to a URL or IP that load balances across all APIEndpoints, 
// or a random healthy APIEndpoint if no load balancer is available
ClientEndpoint string
detiber commented 5 years ago

@moshloop I'm good with that proposed suggestion, but I would remove the limitation that APIEndpoints refer to InternalIP, since it might be implementation dependent on how they would be consumed by what would be providing the ClientEndpoint. (For example DNS-based config would require Public IPs)

vincepri commented 5 years ago

We should keep using a struct type for endpoints. I'd propose the following:

// AdditionalAPIEndpoints refers to all valid/healthy API Endpoints to communicate with the control plane.
AdditionalAPIEndpoints  []APIEndpoint `json:"additionalApiEndpoints,omitempty"`

// APIEndpoint refers to a URL or IP that load balances a control plane,
// or a random healthy endpoint if no load balancer is available.
APIEndpoint APIEndpoint `json:"apiEndpoint,omitempty"`
akutz commented 5 years ago

Hi @vincepri,

I truly don't see how the above proposal is any different than having a single slice like there is today. This appears to add an additional field for no real reason other than avoiding accessing the first element of the original slice.

detiber commented 5 years ago

@akutz it provides a single canonical source for generating things like a kubeconfig, rather than just randomly taking the first field from a slice that is used for various different reasons.

This gives us a way to not have to interpret provider intent when creating the client for setting NodeRefs or generating the kubeconfig secret for users.
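The "single canonical source" argument can be sketched like this. The `ClusterStatus` shape and `kubeconfigServer` helper are hypothetical illustrations of the proposal above, not actual Cluster API code.

```go
// Sketch: with a dedicated canonical field, kubeconfig generation reads one
// well-defined value instead of guessing which slice element the provider
// intended. Field and function names here are illustrative.
package main

import (
	"errors"
	"fmt"
)

type ClusterStatus struct {
	// APIEndpoint is the canonical client-facing endpoint.
	APIEndpoint string
	// AdditionalAPIEndpoints carries provider-internal endpoints; consumers
	// generating a kubeconfig never need to interpret them.
	AdditionalAPIEndpoints []string
}

// kubeconfigServer returns the server URL for a generated kubeconfig,
// consulting only the canonical field.
func kubeconfigServer(s ClusterStatus) (string, error) {
	if s.APIEndpoint == "" {
		return "", errors.New("cluster has no client endpoint yet")
	}
	return "https://" + s.APIEndpoint, nil
}

func main() {
	s := ClusterStatus{
		APIEndpoint:            "lb.example.com:6443",
		AdditionalAPIEndpoints: []string{"10.0.0.5:6443", "10.0.0.6:6443"},
	}
	server, _ := kubeconfigServer(s)
	fmt.Println(server)
}
```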

vincepri commented 5 years ago

We've seen lots of folks using a single API endpoint, some others like @moshloop have use cases to use more than one, which is ok. This proposal would simplify the 80% use case and allow for others as well.

akutz commented 5 years ago

just randomly taking the first field from a slice that is used for various different reasons.

Except that is what the doc on the single field now says, short of an LB:

a random healthy endpoint if no load balancer is available.

So to me this just separates things into two fields when all that's required is more explicit documentation on the existing field.

moshloop commented 5 years ago

a random healthy endpoint if no load balancer is available.

So this is intentionally random rather than just random - in theory if you have a controller looping trying to get to a ControlPlaneReady state, and you use the same endpoint (e.g. first) and that endpoint happens to be down, then it will never reconcile, even though it could potentially reconcile with a random endpoint.

  1. Pick a random APIEndpoint
  2. Set it as the ClientEndpoint
  3. Attempt to connect to the cluster
     3a. If successful, set ControlPlaneReady == true
     3b. If not, repeat from step 1.

I am going to document this in the Load Balancer Provider CAEP
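The retry loop above can be sketched as follows. The `connect` callback stands in for a real client health check, and all names are illustrative, not from any provider's code.

```go
// Sketch of the loop above: pick a fresh random endpoint on each attempt so
// a single down endpoint cannot wedge reconciliation forever, which is the
// failure mode of always taking the first entry.
package main

import (
	"fmt"
	"math/rand"
)

// reconcileControlPlaneReady retries with a random endpoint per attempt
// until one answers, or attempts are exhausted.
func reconcileControlPlaneReady(endpoints []string, connect func(string) bool, maxAttempts int) (string, bool) {
	for i := 0; i < maxAttempts; i++ {
		ep := endpoints[rand.Intn(len(endpoints))] // step 1: random pick
		if connect(ep) {                           // steps 2-3: try it
			return ep, true // 3a: ControlPlaneReady == true
		}
		// 3b: loop back and pick another random endpoint.
	}
	return "", false
}

func main() {
	healthy := map[string]bool{"10.0.0.5": false, "10.0.0.6": true}
	ep, ok := reconcileControlPlaneReady(
		[]string{"10.0.0.5", "10.0.0.6"},
		func(e string) bool { return healthy[e] },
		50,
	)
	fmt.Println(ep, ok)
}
```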

dlipovetsky commented 5 years ago

I agree with the premises. Given that we are modifying the API used by all providers, I think good usability should also be a goal: both human (i.e. reading/editing the manifest), and programmatic.

Have we considered the approach taken by corev1.NodeAddress?

For example:

type APIEndpointType string

const (
    ExternalAPIEndpoint APIEndpointType = "ExternalEndpoint"
    InternalAPIEndpoint APIEndpointType = "InternalEndpoint"
)

type APIEndpoint struct {
    Type    APIEndpointType
    Host    string
    Port    int
}

With this approach, we can assign explicit meanings to API Endpoints (ExternalEndpoint, InternalEndpoint, etc). And we can implement helper functions that provide the right semantics (e.g. set semantics) that all providers can use.
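A shared helper in the corev1.NodeAddress style could look like the sketch below. `EndpointsOfType` is a hypothetical name; the types follow dlipovetsky's example above.

```go
// Sketch of the shared-helper idea: typed endpoints plus a small accessor
// that filters by type, mirroring how corev1 node-address helpers filter by
// NodeAddressType. Helper name is illustrative.
package main

import "fmt"

type APIEndpointType string

const (
	ExternalAPIEndpoint APIEndpointType = "ExternalEndpoint"
	InternalAPIEndpoint APIEndpointType = "InternalEndpoint"
)

type APIEndpoint struct {
	Type APIEndpointType
	Host string
	Port int
}

// EndpointsOfType returns every endpoint carrying the given type, so each
// provider gets the same selection semantics for free.
func EndpointsOfType(endpoints []APIEndpoint, t APIEndpointType) []APIEndpoint {
	var out []APIEndpoint
	for _, ep := range endpoints {
		if ep.Type == t {
			out = append(out, ep)
		}
	}
	return out
}

func main() {
	eps := []APIEndpoint{
		{Type: InternalAPIEndpoint, Host: "10.0.0.5", Port: 6443},
		{Type: ExternalAPIEndpoint, Host: "api.example.com", Port: 443},
	}
	fmt.Println(EndpointsOfType(eps, ExternalAPIEndpoint))
}
```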

dlipovetsky commented 5 years ago

Does anyone use an authenticating proxy?

If I did, I might want to list it under APIEndpoints.

(I'm not sure if it merits its own type, e.g. AuthProxyAPIEndpoint APIEndpointType = "AuthProxyEndpoint")

vincepri commented 4 years ago

@moshloop Are you ok closing this in favor of https://github.com/kubernetes-sigs/cluster-api/issues/1687 ?

moshloop commented 4 years ago

/close

Multiple endpoints become possible with Machine Load Balancers, but only the control plane endpoint which other masters/workers connect to is stored on the cluster object.

k8s-ci-robot commented 4 years ago

@moshloop: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/1197#issuecomment-556250590):

> /close
>
> Multiple endpoints become possible with Machine Load Balancers, but only the control plane endpoint which other masters/workers connect to is stored on the cluster object.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.