kubernetes-sigs / karpenter

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.

enhanced pod eviction capabilities #690

Open bdols opened 1 year ago

bdols commented 1 year ago

Description

What problem are you trying to solve?

To achieve HA with Strimzi's Kafka Operator, it's important to let the operator drain its pods based on their internal readiness. To do this, Strimzi recommends using their Drain Cleaner (see the Strimzi blog post announcement), which essentially creates a ValidatingWebhook for pod evictions and adds an annotation to the pods (Kafka and ZooKeeper) that it manages. Making use of the Drain Cleaner therefore requires a PDB with a maxUnavailable of 0, which prevents Karpenter from consolidating any node with Kafka or ZooKeeper pods running on it. Karpenter creates a DeprovisioningBlocked Event for this: "Cannot deprovision node due to pdb --zookeeper prevents pod evictions"
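For concreteness, a minimal sketch of the kind of PDB this requires; the name, namespace, and label selector below are assumptions for illustration, not values taken from Strimzi:

```yaml
# Sketch of a PDB that blocks all voluntary evictions, so that the
# Strimzi Drain Cleaner webhook (not the standard eviction flow)
# decides when pods can be restarted. Names/labels are assumptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: zookeeper-pdb       # assumed name
  namespace: kafka          # assumed namespace
spec:
  maxUnavailable: 0
  selector:
    matchLabels:
      strimzi.io/name: my-cluster-zookeeper   # assumed label
```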

Some potential features that could address this: provide a capability for the Provisioner to set a pod annotation marking a pod for eviction for specific workloads, which might then remove the need for the Strimzi Drain Cleaner. I imagine this would need an accompanying timeout for successful eviction on cordoned nodes. Alternatively, a Provisioner setting could ignore/override PDBs and drain the node anyway, but that seems like it would break the PDB contract.
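To illustrate the first idea, a sketch of what such a Provisioner setting might look like; the `evictionAnnotation` and `evictionTimeout` fields are entirely hypothetical and do not exist in Karpenter:

```yaml
# Hypothetical fields only; nothing below exists in Karpenter today.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  # Instead of calling the Eviction API for matching pods, annotate them
  # and let an external controller (e.g. Strimzi) perform the drain.
  evictionAnnotation: example.com/requested-eviction   # hypothetical
  # Bound how long a cordoned node waits for the external controller.
  evictionTimeout: 15m                                 # hypothetical
```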

How important is this feature to you?

The workaround for this is to have dedicated hardware sized for these workloads, along with a dedicated provisioner that sets node taints; but ZooKeeper, for example, may not require much CPU/memory, so this can be considered idle waste. Ideally, there would be a choice about whether these workloads can run on nodes capable of handling other workloads.
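The workaround looks roughly like the following; the provisioner name and taint key are assumptions:

```yaml
# Dedicated provisioner so that only Kafka/ZooKeeper pods carrying a
# matching toleration land on these nodes; leftover capacity sits idle.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: kafka-dedicated     # assumed name
spec:
  taints:
    - key: dedicated        # assumed taint key
      value: kafka
      effect: NoSchedule
```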

jonathan-innis commented 1 year ago

Just to make sure that I understand correctly: this is effectively an ask to ignore certain PDBs when evaluating nodes for eviction, because the PDB is in place so that Strimzi, and not the standard eviction manager, can handle the eviction process when a node eviction is requested?

jonathan-innis commented 1 year ago

This definitely seems like an interesting use-case, but I'm curious how common it is in the ecosystem.

> does not necessarily correspond to the replicas being in sync

It seems odd to me that Kafka readiness wouldn't be based on the pod actually being fully replicated and in sync. Is this something being considered upstream in Kafka, so that this Drain Cleaner component doesn't have to be built on top of the existing eviction manager to handle this case?

bdols commented 1 year ago

> Just to make sure that I understand correctly: this is effectively an ask to ignore certain PDBs when evaluating nodes for eviction, because the PDB is in place so that Strimzi, and not the standard eviction manager, can handle the eviction process when a node eviction is requested?

yes. I am not speaking for their project or anything.

> This definitely seems like an interesting use-case, but I'm curious how common it is in the ecosystem.

I can think of another possible use. Elasticsearch has an option in its node shutdown API to remove or replace a node versus restarting it in place, which is more of a consideration for local HostPath volumes than for EBS: https://www.elastic.co/guide/en/elasticsearch/reference/current/put-shutdown.html

If Karpenter deprovisioning were configured with `ttlSecondsUntilExpired`, a 'replace' would be appropriate instead of the 'restart' default, but currently it appears that ECK only allows one setting, via an env var on the running container: https://www.elastic.co/guide/en/cloud-on-k8s/2.8/k8s-prestop.html
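For reference, a sketch of the node shutdown request documented at the first link above; the node ID and reason are placeholder values, and the JSON body is shown inside a YAML block for consistency:

```yaml
# PUT _nodes/<node-id>/shutdown
# Per the Elastic docs: "remove" and "replace" migrate shards off the
# node first, while "restart" keeps shard allocation in place for an
# in-place restart.
{
  "type": "remove",
  "reason": "node being deprovisioned by karpenter"
}
```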

There may be other HA data-storage services that handle their own orchestration. I had a thought that PodSpec could be enhanced with something like a "disruptability" signal, indicating that an application cannot currently handle a disruption even though it is still ready to serve requests. Or, to also cover the Elastic case, "drainability". For Kafka, I could see a pod marking itself as "drainable" once all of the replicas are rebalanced off of it.
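For illustration, an entirely hypothetical shape for that signal; no such annotation or field exists in Kubernetes or Karpenter today:

```yaml
# Hypothetical annotation: a controller (Strimzi, ECK, ...) would flip
# it to "true" once the application can tolerate losing this pod, and
# an autoscaler could wait for it before draining the node.
apiVersion: v1
kind: Pod
metadata:
  name: my-cluster-kafka-0            # assumed name
  annotations:
    example.com/drainable: "false"    # hypothetical annotation key
spec:
  containers:
    - name: kafka
      image: quay.io/strimzi/kafka    # assumed image
```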

k8s-triage-robot commented 9 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

jonathan-innis commented 9 months ago

/remove-lifecycle stale

k8s-triage-robot commented 6 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 5 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 4 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 4 months ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/karpenter/issues/690#issuecomment-2198492894):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
>
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
>
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close not-planned
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.
jonathan-innis commented 4 months ago

/reopen

k8s-ci-robot commented 4 months ago

@jonathan-innis: Reopened this issue.

In response to [this](https://github.com/kubernetes-sigs/karpenter/issues/690#issuecomment-2198676172):

> /reopen

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.
k8s-ci-robot commented 4 months ago

This issue is currently awaiting triage.

If Karpenter contributors determine this is a relevant issue, they will accept it by applying the `triage/accepted` label and provide further guidance.

The `triage/accepted` label can be added by org members by writing `/triage accepted` in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.
jonathan-innis commented 4 months ago

/remove-lifecycle rotten

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

tinder-robolague commented 1 month ago

/remove-lifecycle stale