kubevirt / community

Community content
https://kubevirt.io
46 stars 101 forks source link

Design proposal: Descheduler support #258

Closed orelmisan closed 3 months ago

orelmisan commented 4 months ago

What this PR does / why we need it: Add a design proposal to support running KubeVirt in a cluster that also has a descheduler.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #

Special notes for your reviewer: PR https://github.com/kubevirt/kubevirt/pull/11286 describes the current way KubeVirt handles eviction requests.

Descheduler proposal: https://github.com/kubernetes-sigs/descheduler/pull/1354

Checklist

This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR. Approvers are expected to review this list.

Release note:

NONE
kubevirt-bot commented 4 months ago

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

orelmisan commented 4 months ago

/cc @acardace @stu-gott

orelmisan commented 4 months ago

@RobertKrawitz Could you please have a look?

orelmisan commented 4 months ago

@ingvagabund Could you please have a look?

orelmisan commented 4 months ago

If I understand it correctly, this alone will not solve the descheduler problem; this is just one piece of the puzzle. Correct?

Thank you for the feedback @RobertKrawitz. I was hoping to understand from you whether this design would solve the descheduler problem. What are the gaps that you see?

AFAIU from my discussions with @stu-gott, the agreement was:

  1. Add the descheduler.alpha.kubernetes.io/request-evict-only annotation to all virt-launcher pods.
  2. Add the descheduler.alpha.kubernetes.io/eviction-in-progress annotation to virt-launcher pods which received the eviction request and will be migrated from the node because of it.
RobertKrawitz commented 4 months ago

If I understand it correctly, this alone will not solve the descheduler problem; this is just one piece of the puzzle. Correct?

Thank you for the feedback @RobertKrawitz. I was hoping to understand from you whether this design would solve the descheduler problem. What are the gaps that you see?

AFAIU from my discussions with @stu-gott, the agreement was:

  1. Add the descheduler.alpha.kubernetes.io/request-evict-only annotation to all virt-launcher pods.
  2. Add the descheduler.alpha.kubernetes.io/eviction-in-progress annotation to virt-launcher pods which received the eviction request and will be migrated from the node because of it.

The descheduler itself needs to respect these new annotations. It isn't clear to me from the description whether it does.

orelmisan commented 4 months ago

If I understand it correctly, this alone will not solve the descheduler problem; this is just one piece of the puzzle. Correct?

Thank you for the feedback @RobertKrawitz. I was hoping to understand from you whether this design would solve the descheduler problem. What are the gaps that you see? AFAIU from my discussions with @stu-gott, the agreement was:

  1. Add the descheduler.alpha.kubernetes.io/request-evict-only annotation to all virt-launcher pods.
  2. Add the descheduler.alpha.kubernetes.io/eviction-in-progress annotation to virt-launcher pods which received the eviction request and will be migrated from the node because of it.

The descheduler itself needs to respect these new annotations. It isn't clear to me from the description whether it does.

Yes, the descheduler needs to respect the new annotations. It is stated under "Non Goals".

RobertKrawitz commented 4 months ago

If I understand it correctly, this alone will not solve the descheduler problem; this is just one piece of the puzzle. Correct?

Thank you for the feedback @RobertKrawitz. I was hoping to understand from you whether this design would solve the descheduler problem. What are the gaps that you see? AFAIU from my discussions with @stu-gott, the agreement was:

  1. Add the descheduler.alpha.kubernetes.io/request-evict-only annotation to all virt-launcher pods.
  2. Add the descheduler.alpha.kubernetes.io/eviction-in-progress annotation to virt-launcher pods which received the eviction request and will be migrated from the node because of it.

The descheduler itself needs to respect these new annotations. It isn't clear to me from the description whether it does.

Yes, the descheduler needs to respect the new annotations. It is stated under "Non Goals".

Exactly -- without that, the solution is not complete.

orelmisan commented 4 months ago

Change: Following @EdDev's advice - moved the way KubeVirt currently handles eviction requests to a separate document in order to simplify this document.

orelmisan commented 4 months ago

Change: Fixed comments from @fabiand's review.

orelmisan commented 4 months ago

Change1, Change2: Answered @dankenigsberg's review comments.

ingvagabund commented 4 months ago

Descheduler proposal: https://github.com/kubernetes-sigs/descheduler/pull/1354

acardace commented 3 months ago

@EdDev @dankenigsberg @fabiand @ingvagabund @RobertKrawitz, are any other changes required for this design proposal? If not would you lgtm so we can get it merged?

fabiand commented 3 months ago

@acardace what would you as an approver say? Is this proposal aligned with what we expected on the descheduelr side? And does @ingvagabund bless this design here?

acardace commented 3 months ago

/lgtm

@fabiand looks thorough to me.

fabiand commented 3 months ago

@acardace great - and are you in sync with @ingvagabund ?

acardace commented 3 months ago

@acardace great - and are you in sync with @ingvagabund ?

Can you elaborate? I'm just doing a gentle ping to all parties involved in the review to move this on, it's been a couple of weeks without progress.

fabiand commented 3 months ago

I'm just trying to say that the approving approver should ensure that the design is working for us, but like in this case to also ensure that everything we depend on is also good.

ingvagabund commented 3 months ago

Combination of 429 - VM will be migrated (code and text/message) and the annotations is the crucial part for the descheduler. LGTM

RobertKrawitz commented 3 months ago

/lgtm

kubevirt-bot commented 3 months ago

@RobertKrawitz: changing LGTM is restricted to collaborators

In response to [this](https://github.com/kubevirt/community/pull/258#issuecomment-1988420294): >/lgtm Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
stu-gott commented 3 months ago

/approve

This proposal matches expectations of what KubeVirt should do. The descheduler team has signed off on our approach.

stu-gott commented 3 months ago

@fabiand it appears you're the only approver for this repo participating in this conversation.

fabiand commented 3 months ago

/approve

According to Stu's comment

kubevirt-bot commented 3 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fabiand, stu-gott

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/kubevirt/community/blob/main/OWNERS)~~ [fabiand] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment