kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0
3.49k stars 1.29k forks source link

Graduate MachinePools from experimental to stable API #9005

Open mboersma opened 1 year ago

mboersma commented 1 year ago

What would you like to be added (User Story)?

As a production user of CAPI, I'm not comfortable working with "experimental" parts of the product, but I would like to take advantage of the native VM-grouping construct at my cloud provider.

Detailed Description

MachinePools have been part of the experimental API for years, but their API has been overall stable (recent backward-compatible changes for MachinePool Machines notwithstanding). Moving them out of exp/ will declare that MachinePools are stable, supported, and safe to use.

Anything else you would like to add?

The graduation criteria in the MachinePools proposal remains TBD. Informally, when the question "when can MPs graduate?" has been asked, the answer has generally been "after MachinePool Machines land."

MachinePool Machines have begun to land and should soon be largely complete. This seems like the time to consider graduating MachinePools, although there may be more context or specific // TODO items I'm not aware of. I'm hoping to identify those in this issue.

Label(s) to be applied

/kind feature

/cc @dtzar @Jont828

killianmuldoon commented 1 year ago

/triage accepted

jayesh-srivastava commented 1 year ago

@mboersma I would like to take this one up. Can we have discussions over subtaks/TODOs which can help me moving forward.

fabriziopandini commented 1 year ago

@mboersma AFAIK as far as I remember we have still some to do left from @Jont828 PR in order to complete the implementation of the proposal and get feature parity with MD (this is at least what I remember from some thread on the PR and the discussion we had around it and the - divide and conquer from the original PR- ) e.g. label propagation, CC support, but also machines that can't be operated, probably more

It will be great to use this issue to track all this work

vincepri commented 1 year ago

It'd be also good to have a featureset comparison in the book to guide users when they should be using MachineDeployment or MachinePool

Jont828 commented 1 year ago

Sounds good. #8842 is the last PR I currently have open related to MachinePool Machines. I think it makes sense to put a list of follow up tasks here once that gets wrapped up.

@vincepri We have a section in the CAPI book about differences in features but I suppose we can expand it.

CecileRobertMichon commented 10 months ago

https://github.com/kubernetes-sigs/cluster-api/issues/8858 should probably get resolved as well before we do this

AndiDog commented 7 months ago

Apart from some notes in https://github.com/kubernetes-sigs/cluster-api/issues/8858, here are some comments about the API.

Other than that, I think stabilizing the API is a good idea – even a stable API can be evolved, for instance by adding new fields if we really run into major missing features.

sbueringer commented 7 months ago

And since I wanted to understand this, I quickly searched in source code and didn't find this field used for machine pools – am I mistaken?

This should be fixed via: https://github.com/kubernetes-sigs/cluster-api/pull/9837

Sounds like there's potential to improve the godoc for MD & MachinePool, but apart from that it seems to have a clear meaning

dtzar commented 7 months ago

Per the CAPI community call today as I understand it, the next steps are:

  1. Immediately have someone PR to have Machinepool get deployed by default (without requiring feature flag being enabled). Push it also from alpha to beta status at this time.
  2. Gather additional feedback after this change and consider setting a timeline with updated note(s) in the spec
mboersma commented 6 months ago

/assign

fabriziopandini commented 5 months ago

/priority important-long term

@mboersma is there some work left for this issue or can we close it

k8s-ci-robot commented 5 months ago

@fabriziopandini: The label(s) priority/important-long, priority/term cannot be applied, because the repository doesn't have them.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/9005#issuecomment-2050286304): >/priority important-long term > >@mboersma is there some work left for this issue or can we close it Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
fabriziopandini commented 5 months ago

/priority important-longterm

mboersma commented 4 months ago

@fabriziopandini MachinePools are "beta" now, but I think we need to move the code out of exp somehow. I haven't thought about how to do that yet, but it's on my plate to do.

enxebre commented 2 months ago

Thanks @mboersma is there any report that can be shared of the unit and e2e test coverage for MachinePools?

sbueringer commented 2 months ago

I'll try to find some time next week to write a summary of the current state of MPs regarding open issues and features we were planning to implement at some point but didn't until today.

Then we can discuss how they relate to the remaining steps to get MPs out of experimental. If I understand correctly, the remaining steps are moving the implementation out of "exp" and eventually dropping the feature gate.

In any case, from my side no objection against moving the code out of "exp". I see no reason why it should stay there. In my opinion for our users it should not be relevant in which package we implement features. (I mean which end user even knows that MachinePools are below "exp" in our repo, and why should they have to care?)

sbueringer commented 1 month ago

Sorry for the delay. Here's my summary and my first try at a categorization.

Fundamental issues / questions about the MachinePool API / contract:

MachinePool Machines: (not clear to me, should be double checked by folks more familiar with MP Machines)

(Potential) Bugs:

Misc / TBD:

Minor improvements:

As far as I can tell regarding graduation - apart from the issues listed here - the following is missing:

We can/should discuss which parts of the "graduation" we want to block until which issues are resolved. I"m personally fine with moving the code out of exp, but we should then probably treat removing the feature gate as the "graduation".

In general, it would be great to see some folks signing up to be maintainers of the MachinePool feature/code. To be honest, at the moment, it's mostly up to folks which are neither very familiar with MachinePools nor are they using the feature. This is not a great state if we're planning to maintain this feature going forward, which should be the minimum bar for graduation.

cc @fabriziopandini @vincepri @enxebre @killianmuldoon @chrischdi @mboersma

fabriziopandini commented 1 month ago

Great work @sbueringer! 🥳 Having a list of all the open issues + all those valuable notes helps me to understand better where we are, and I guess it helps others too.

Up to @mboersma if we want to move this to an Umbrella issue for graduation.

If I can give my two cents, I consider ownership + open issues the key points to be addressed before graduation; I also agree that moving the code out of exp is sort of orthogonal to graduation given that we are already shipping MachinePool APIs together with everything else.