kubernetes / enhancements

Enhancements tracking repo for Kubernetes
Apache License 2.0

Declarative Node Maintenance #4212

Open atiratree opened 1 year ago

atiratree commented 1 year ago

Enhancement Description

Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.

atiratree commented 1 year ago

/sig apps

npolshakova commented 1 year ago

Hello @atiratree, 1.29 Enhancements team here! Is this enhancement targeting 1.29? If it is, can you follow the instructions here to opt in the enhancements and make sure the lead-opted-in label is set so it can get added to the tracking board? Thanks!

sftim commented 1 year ago

Title suggestion: “Declarative node maintenance”

(all KEPs are implicitly improvements, or at least they aim to be)

atiratree commented 1 year ago

Thanks for the suggestion. I have updated the title and the KEP.

atiratree commented 1 year ago

@npolshakova this still needs to be discussed with all the interested sigs before we can target this.

atiratree commented 1 year ago

This has been discussed. The KEP still needs some work and additional discussion, so we will try to target this for the next release.

alculquicondor commented 11 months ago

/sig node

sftim commented 10 months ago

@atiratree, how can people best help move this work forward?

atiratree commented 10 months ago

I will revisit this one soon. I need to process the reviews and my personal notes.

atiratree commented 10 months ago

The KEP should be up to date now.

sftim commented 9 months ago

Some relevance to: https://github.com/kubernetes/website/issues/44998 (about the existing docs for node maintenance)

beekhof commented 7 months ago

In case it helps, Red Hat has been shipping https://github.com/medik8s/node-maintenance-operator for ~5 years. It almost certainly contains some OpenShift-isms, but I'm sure they can be addressed if there is interest.

atiratree commented 7 months ago

@beekhof I am aware of this operator and it was considered when designing this feature. Please also see https://github.com/kubernetes/enhancements/blob/cd1ea31e1c09c2f4e9f6a7f35821ff14f41a2f78/keps/sig-apps/4212-declarative-node-maintenance/README.md#out-of-tree-implementation

atiratree commented 7 months ago

The Declarative Node Maintenance KEP is becoming too complex, and it is hard to capture all aspects and review everything in a single place. I have opened a second KEP just for the Evacuation API: https://github.com/kubernetes/enhancements/issues/4563. It will be a prerequisite for Node Maintenance.

sftim commented 7 months ago

@beekhof if we did prototype node maintenance out of tree, with an alpha API group, we'd want to teach Kubernetes tooling about using it (for example, kubectl drain - but also many other in-project pieces).

To me, the value comes from that integration work more than from writing the actual controller. If you agree, we could look at rallying effort around an out-of-tree prototype. The integrations would remain valuable even after a move to in-tree implementation.

soltysh commented 5 months ago

/assign @atiratree

/label lead-opted-in
/stage alpha
/milestone v1.31

atiratree commented 5 months ago

@soltysh I would like to consider this for alpha in 1.32 rather than in this release, so that we can get more feedback on the feature and lead the way with the Evacuation API (a dependency) first.

soltysh commented 5 months ago

/milestone clear

dipesh-rawat commented 5 months ago

Hello @soltysh @atiratree 👋, 1.31 Enhancements team here.

Now that this KEP is no longer targeting release 1.31 (reference https://github.com/kubernetes/enhancements/issues/4212#issuecomment-2152097655), should we also consider removing the lead-opted-in label?

soltysh commented 5 months ago

Yeah, I guess you're right.

/remove-label lead-opted-in

sftim commented 5 months ago

Is there a baseline of detail that we can define and merge as provisional?

soltysh commented 5 months ago

This and https://github.com/kubernetes/enhancements/issues/4563 were discussed heavily again during Monday's SIG-Apps call. Given the general hesitation from SIG leads, we're going to hold off on these efforts in favor of looking into extending PDB and the eviction API.

sreeram-venkitesh commented 4 months ago

/label tracked/no

sftim commented 2 months ago

I'm tempted to propose an out-of-tree implementation (CRD, controller) and maybe kubectl plugin. kubectl experimental-drain perhaps?

kannon92 commented 2 months ago

Reading this, we are not considering this for 1.32?

atiratree commented 2 months ago

Not considering for 1.32. We have to get a broader consensus before we can accept this into the core. Please see https://groups.google.com/g/kubernetes-sig-architecture/c/Tb_3oDMAHrg for more details. I would like to build the consensus with the community, but I have not had time to pick up the topic yet. Hopefully we can do that soon.

We are also having discussions about https://github.com/kubernetes/enhancements/pull/4565 and will follow up on that in the next sig-apps meeting, which would help the NodeMaintenance feature.

atiratree commented 2 months ago

> I'm tempted to propose an out-of-tree implementation (CRD, controller) and maybe kubectl plugin. kubectl experimental-drain perhaps?

What would be the benefit when compared to the other drain solutions? Or extending the other drain solutions? Nevertheless, I think it is better to start with the discussions first.

sftim commented 2 months ago

> What would be the benefit when compared to the other drain solutions? Or extending the other drain solutions?

Personally, I want to be able to initiate a declarative drain via kubectl. The only in-tree element there would be a change to kubectl drain.

atiratree commented 2 months ago

We cannot make it declarative unless we have the API and I am not positive we can accept 3rd party APIs into kubectl. But we could consider starting with the NodeMaintenance logic + Evacuation API in kubectl.

sftim commented 2 months ago

> I am not positive we can accept 3rd party APIs into kubectl

Did you mean “out of tree”? The out-of-tree implementation I had in mind would still be part of Kubernetes, but not part of https://github.com/kubernetes/kubernetes (other than the kubectl support, which needs to be in-tree if not done using a plugin).

atiratree commented 2 months ago

Ah, yeah, that could work if sig-cli would accept it. However, if there is an API, there is less need for kubectl integration and just creating the object could be enough.