kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0

Support clusterctl alpha rollout kthreescontrolplane/... #10621

Closed wikoion closed 1 week ago

wikoion commented 6 months ago

What would you like to be added (User Story)?

As a developer I would like to add support to clusterctl to rollout kthreescontrolplanes, similar to the support for kubeadm controlplanes.

Detailed Description

I'd like to discuss the approach. I'm happy to do the work, but wanted some advice on how we could implement this. The easy option is something like this, but I understand that it might not be desirable to import these external packages. If that is the case, could we refactor rollout to support any kind of resource that implements rolloutAfter in its spec? Currently cluster-api-k3s uses upgradeAfter, but if we chose this option we could add another API version, similar to what was done in this project.

Anything else you would like to add?

No response

Label(s) to be applied

/kind feature

One or more /area labels. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

k8s-ci-robot commented 6 months ago

This issue is currently awaiting triage.

If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.
neolit123 commented 6 months ago

refactor rollout to support any kind of resource

makes sense to me. maybe the rollout command should be CP provider agnostic.

sbueringer commented 6 months ago

I think we should not start hard-coding non-"CAPI core" providers

fabriziopandini commented 6 months ago

Agreed, and considering https://github.com/kubernetes-sigs/cluster-api/issues/10479, we should not add more on top of the clusterctl rollout commands.

sbueringer commented 6 months ago

I think triggering a rollout should probably not be part of #10479 as we were not planning to deprecate the rolloutAfter fields

fabriziopandini commented 6 months ago

I think triggering a rollout should probably not be part of https://github.com/kubernetes-sigs/cluster-api/issues/10479 as we were not planning to deprecate the rolloutAfter fields

You are correct, sorry I wasn't specific in my reference. What I was thinking of is the fact that most of the considerations in the linked issue apply here as well:

* This feature (afaik) has been mostly copied from k/k Deployments & ReplicaSets

* As far as we are aware it is barely used by CAPI users (we barely get any questions / requests about the entire feature)

* It's not usable with GitOps

* It's not usable with ClusterClass

Let me also add on top of that that, AFAIK, there is no contract defining the rolloutAfter field, and as far as I'm aware it is not used consistently; e.g. even in CAPI core, MachinePools do not use it.

In other words, the idea of extending what we copied from k/k Deployments & ReplicaSets to other objects seems to rest on brittle foundations, and I wonder whether, in this case too, we should really consider if it's worth the complexity, or otherwise consider deprecation.

wikoion commented 6 months ago

I think triggering a rollout should probably not be part of #10479 as we were not planning to deprecate the rolloutAfter fields

You are correct, sorry I wasn't specific in my reference. What I was thinking of is the fact that most of the considerations in the linked issue apply here as well:

* This feature (afaik) has been mostly copied from k/k Deployments & ReplicaSets

* As far as we are aware it is barely used by CAPI users  (we barely get any questions / requests about the entire feature)

* It's not usable with GitOps

* It's not usable with ClusterClass

Let me also add on top of that that, AFAIK, there is no contract defining the rolloutAfter field, and as far as I'm aware it is not used consistently; e.g. even in CAPI core, MachinePools do not use it.

In other words, the idea of extending what we copied from k/k Deployments & ReplicaSets to other objects seems to rest on brittle foundations, and I wonder whether, in this case too, we should really consider if it's worth the complexity, or otherwise consider deprecation.

For what it's worth, we use this feature at Influx. It is very useful to be able to perform a rolling restart of MachineDeployments, in the same way that it's useful to be able to perform a rolling restart of a Kubernetes Deployment or StatefulSet. When we replace an AWSMachineTemplate, for example, we have to roll the MachineDeployments.

sbueringer commented 6 months ago

I would really like to keep the existing rolloutAfter fields on MD & KCP. I think we don't get many requests because they simply work.

For MachineDeployments it is the only way to trigger a Machine rollout without making substantial changes (labels & annotations don't trigger rollouts anymore, which is why we introduced rolloutAfter for MachineDeployments). Not sure about KCP, but it might be the same there.

I would assume rolloutAfter also works with GitOps.
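For reference, the mechanism being discussed is the rolloutAfter field on the MachineDeployment spec; a minimal sketch is below (the metadata and timestamp values are illustrative, field names per the CAPI v1beta1 API):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: my-md          # illustrative name
  namespace: default
spec:
  clusterName: my-cluster
  # Machines created before this timestamp are considered outdated
  # and rolled out; clusterctl updates this field to trigger a restart.
  rolloutAfter: "2024-06-01T12:00:00Z"
```

Because the trigger is a declarative spec field rather than an imperative command, it can in principle be set by any client, including a GitOps controller reconciling the manifest.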

fabriziopandini commented 5 months ago

I would really like to keep the existing rolloutAfter fields on MD & KCP. I think we don't get many requests because they simply work.

I never said we have to remove those fields from MD & KCP. I'm just thinking about the clusterctl rollout command, and more specifically about the fact that we should not extend it beyond its current scope (and btw, the KCP implementation, which is in the current scope, is not based on any contract).

sbueringer commented 5 months ago

Ah got it. I interpreted the comment here as meaning we should drop rolloutAfter as well: https://github.com/kubernetes-sigs/cluster-api/issues/10621#issuecomment-2124622245

I think the reason we merged the KCP support at the time was because KCP is also part of core CAPI (which I think is also the reason why we found it okay to handle KubeadmConfig / KCP specifically in some of our core controllers).

fabriziopandini commented 5 months ago

Given that the consensus is to not expand this other than KCP (special case), I'm leaning toward closing this issue and the corresponding PR

wikoion commented 5 months ago

I would like to argue again that this is a useful feature of clusterctl that is frequently used for MachineDeployments; it would be very useful to also be able to do this with our control planes (k3s).

k8s-triage-robot commented 2 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

vaibhav2107 commented 2 months ago

/remove-lifecycle stale

fabriziopandini commented 2 months ago

@vaibhav2107 what is the reason to remove lifecycle stale? It is expected, and healthy for the project, that issues that are not active get closed by the automation. This helps contributors and maintainers focus on the ongoing work.

fabriziopandini commented 2 months ago

talked with @vaibhav2107, label was removed by mistake

/lifecycle stale

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 1 week ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 1 week ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/10621#issuecomment-2466235740):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
>
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
>
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close not-planned
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/