Use a server group to ensure anti-affinity for control plane nodes

mkjpryor commented 2 years ago

/kind feature

Describe the solution you'd like We should put control plane nodes in a server group with the soft anti-affinity policy to decrease the risk that a hypervisor failure takes out the whole control plane.

This becomes more important once #1252 is merged, but is something that I think we should do even when using explicit AZs.

Anything else you would like to add:

jichenjc commented 2 years ago

I knew OCP seems has such by default, are you suggesting we should default it in our template or in the code directly?

https://github.com/openshift/installer/blob/master/docs/user/openstack/README.md#using-a-server-group

mkjpryor commented 2 years ago

@jichenjc

IMHO, it would be nice if we could manage a server group for the control plane nodes. It would be better for the "Infrastructure-as-Code" approach to have the whole thing defined in Kubernetes resources, rather than having to make the server group ahead of time.

mkjpryor commented 2 years ago

We should probably at least document it in the CAPO book as well.

jichenjc commented 2 years ago

yes, we have such definition in api/v1alpha5/openstackmachine_types.go

// The server group to assign the machine to
        ServerGroupID string `json:"serverGroupID,omitempty"`

I think maybe we can refer to other impl, like sec group we did, a bool var to control whether we create for customer if the bool is true, we create server group and honor the ID to each control plane node if it's false, let customer decide whether they need such .. and during deletion phase, delete the server group all together at last

mdbooth commented 2 years ago

For the control plane we certainly have the opportunity to create and manage this server group: the server group will have the same lifecycle as the cluster, so it can be managed with the cluster.

For machines in a machineset it is not currently possible to do this in CAPI as far as I'm aware. I have previously mooted the possibility of some CAPI object whose lifecycle spans multiple machines. The best suggestion so far is something along the lines of MachinePools, however this would be an abuse of this interface IMHO as OpenStack has no such concept. An object with the same lifecycle as a MachineSet would probably be sufficient: you'd need a new machineset for every server group, but this doesn't sound unreasonable to me.

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

tobiasgiese commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen
Mark this issue as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 1 year ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/cluster-api-provider-openstack/issues/1256#issuecomment-1417429131): >The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. > >This bot triages issues according to the following rules: >- After 90d of inactivity, `lifecycle/stale` is applied >- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied >- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed > >You can: >- Reopen this issue with `/reopen` >- Mark this issue as fresh with `/remove-lifecycle rotten` >- Offer to help out with [Issue Triage][1] > >Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community). > >/close not-planned > >[1]: https://www.kubernetes.dev/docs/guide/issue-triage/ Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

dalees commented 1 year ago

I'd like to re-open this issue and help implement this.

We are a cloud provider working on implementing OpenStack Magnum managed CAPO clusters and these must have anti-affinity applied to control plane nodes. (Use case: Region with a single AZ).

The preferred defaults would be anti-affinity for control plane to ensure separation and soft-anti-affinity for machines to allow scaling beyond available compute hypervisors.

Proposal:

Add ServerGroupAffinity string json:"serverGroupAffinity,omitempty" to OpenStackClusterSpec with valid values of anti-affinity, soft-anti-affinity (affinity, soft-affinity are valid in Nova, but not sensible here)

Note: This would need to apply to control plane nodes only, so perhaps it should be named with a CP prefix?

I note the restrictions of CAPI in creating the servergroup for MachineSets. Perhaps this is not in the first pass, but if we had an available ServerGroupAffinity for the MachineSet accessible we could create on the first Machine if it doesn't exist, and clean up if we remove the last machine from it. It would leave orphaned servergroups in some cases.

dalees commented 1 year ago

/reopen

k8s-ci-robot commented 1 year ago

@dalees: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to [this](https://github.com/kubernetes-sigs/cluster-api-provider-openstack/issues/1256#issuecomment-1804882009): >/reopen Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

jichenjc commented 1 year ago

/reopen

k8s-ci-robot commented 1 year ago

@jichenjc: Reopened this issue.

In response to [this](https://github.com/kubernetes-sigs/cluster-api-provider-openstack/issues/1256#issuecomment-1804952254): >/reopen Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

mkjpryor commented 1 year ago

@dalees Using pre-existing server groups is possible in the Helm charts already, so we could actually implement this fully in the Magnum driver if required.

I still think it would be nice to do in CAPO though.

For the CP nodes it is fairly obvious that we can just add a control plane affinity policy to OpenStackCluster and manage a server group for the CP nodes.

For the workers it is less obvious, as what you probably want is a server group per MD, but we only get to deal in machines in CAPO, and we aren’t supposed to care which MD they are in.

@mdbooth Didn’t you have ideas for reworking all the AZ logic? Maybe server groups could be worked into that?

mdbooth commented 1 year ago

Server groups won't live in the AZ logic when that (hopefully) happens. The reason is that AZs are basically a templating mechanism: they define a set of things which vary between machines. Server groups don't (must not) vary between members.

I would like to see us support server groups better than we do currently, which as Matt says is to just create one externally and reference it in the machine spec. I believe we will need a solution for this soon for OpenShift integration, so it will likely be on my TODO list... some time in the next 6 months, maybe? If somebody else wanted to implement it sooner I'd be delighted.

I don't want to add any more 'cluster-magic' to the machine controller. i.e. My preference is that the machine spec entirely specifies the machine, without reference to the cluster spec. So I'd be happy for the cluster controller to create the server group, I'd still want the control plane machine spec to reference it explicitly.

mkjpryor commented 1 year ago

So OpenStackCluster would have a list of server groups to create, which are then referenced from OpenStackMachine?

Could this be a good time to have an extra CRD for an OpenStackServerGroup maybe?

EmilienM commented 1 year ago

/remove-lifecycle rotten

EmilienM commented 1 year ago

@mkjpryor @dalees

Hey, I work with @mdbooth and we will need this one to be addressed soon so I'm very likely going to work on that.

A few questions:

Can we settle on having a new CRD for OpenStackServerGroup ? I think it's a great idea and indeed an opportunity to do it well in the first place.
Can you confirm that we'll want a new Controller as well? like it was suggested by @nikParasyr via https://github.com/kubernetes-sigs/cluster-api-provider-openstack/issues/1378#issuecomment-1535955155
I propose that we don't touch ServerGroupID, however when used CAPO wouldn't manage OpenStackServerGroup.
Are we focusing only on the control plane for now (fine by me)? or workers too?

End goal: CAPO needs to be able to create and manage the Server Groups (at a minimum for the Control Plane).

nikParasyr commented 1 year ago

I would like to see support for worker nodes as well if possible ( or a design which allows support for it to be added easily on a later stage ).

currently, we have clusters that:

have a single serverGroup used by all nodes ( control-plane + workers )
have a serverGroup for control-plane, and a second SG for all the worker nodes ( even with multiple node groups )
have a SG for control-plane, different SG per node group

most clusters follow the 2 pattern but there are some use cases where 1 or 3 is preferable.

dalees commented 1 year ago

@EmilienM - Great, if you'd like to take this forward and progress the CRD and controller for this, that'd be welcomed.

I am prepared to write the control plane part of this, but not ready to dedicate the time to write the CRD/Controller proposal and implementation. I may do the control plane part for my own learning and add a PR in the next few weeks. I don't expect this to be merged if a CRD based replacement is in the pipeline :)

Happy to follow, feedback and test. Having this functional for node groups would be great.

I can see a good use case for a different server group per node group, matching @nikParasyr 's case 3.

mkjpryor commented 12 months ago

@EmilienM

Personally, I really like the idea of an OpenStackServerGroup CRD, with the corresponding OpenStackMachine.spec.serverGroupRef field. This would make managing a server group per MD a breeze using our Helm charts.

mkjpryor commented 12 months ago

Just to clarify, I think users of CAPO should create instances of the OpenStackServerGroup CRD and assign them to machines using a spec.serverGroupRef, rather than CAPO managing the creation of OpenStackServerGroup CRD instances itself.

So the order of precedence for a machine would be:

If spec.serverGroupID is set, use that server group
If spec.serverGroupRef is set, use the server group created for the CRD instance
If neither are set, don't set a server group

CAPO will manage the actual OpenStack security group backing an OpenStackServerGroup CRD instance but will not create them automatically for clusters that do not request them (not every cluster will need or want server groups).

Does that make sense? @mdbooth?

dalees commented 9 months ago

This issue is now important to my current work, I'm planning to work on this next week and will propose a PR implementing the new CRD for OpenStackServerGroup and controller as @mkjpryor has outlined - I believe it covers the cases discussed in this issue so far.

@EmilienM please let me know if you have progressed this, otherwise I'm happy to pick this up now.

dalees commented 9 months ago

Quick update: I plan to upload a PR for feedback next week.

I have a working version of the OpenStackServerGroup CRD and controller. The OpenStackMachineSpec adds serverGroupRef into v1alpha8 which resolves once the server group is created. I'm now working on ensuring tests and edge cases are reasonable.

EmilienM commented 9 months ago

I'm honestly not sure we want a new CRD & controller for this as it might be too complex for our needs but I'm happy to be proven wrong. I'll wait to see what @dalees proposed.

dalees commented 9 months ago

I've uploaded a first pass at this, there are a few TODO's listed but it is functional.

On the topic of it being a CRD & controller, I can see this is fairly heavyweight for a simple feature, but it does provide a solution to all use cases in https://github.com/kubernetes-sigs/cluster-api-provider-openstack/issues/1256#issuecomment-1823199045 and deals with the lifecycle cleanly. That's the main benefit of this.

After we discussed earlier in this thread, I noticed ServerGroupID has been changed into ServerGroup (a filter that allows name or uuid). One option is to extend this further to include the expected policy, and create if it the named group doesn't already exist. The deletion criteria become less obvious if used for worker nodes, however.

Can you think of other options? How do we decide the way forward? I'm happy to join the next CAPO meeting to discuss.

k8s-triage-robot commented 6 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

dalees commented 6 months ago

/remove-lifecycle stale

k8s-triage-robot commented 3 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen
Mark this issue as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 1 month ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/cluster-api-provider-openstack/issues/1256#issuecomment-2437047857): >The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. > >This bot triages issues according to the following rules: >- After 90d of inactivity, `lifecycle/stale` is applied >- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied >- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed > >You can: >- Reopen this issue with `/reopen` >- Mark this issue as fresh with `/remove-lifecycle rotten` >- Offer to help out with [Issue Triage][1] > >Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community). > >/close not-planned > >[1]: https://www.kubernetes.dev/docs/guide/issue-triage/ Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

kubernetes-sigs / cluster-api-provider-openstack

Use a server group to ensure anti-affinity for control plane nodes #1256