karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0
4.12k stars 807 forks source link

Introduce a mechanism to actively trigger rescheduling #4698

Open chaosi-zju opened 1 month ago

chaosi-zju commented 1 month ago

What type of PR is this?

/kind design /kind documentation

What this PR does / why we need it:

This proposal aims to introduce a mechanism of active triggering rescheduling, which benefits a lot in application failover scenarios. This can be realized by introducing a new API, and a new field would be marked when this new API called, so that scheduler can perceive the need for rescheduling.

Which issue(s) this PR fixes:

Fixes part of #4840

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

codecov-commenter commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 52.97%. Comparing base (aded7c0) to head (7fc9c12).

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #4698 +/- ## ========================================== - Coverage 52.99% 52.97% -0.02% ========================================== Files 250 250 Lines 20420 20420 ========================================== - Hits 10821 10817 -4 - Misses 8880 8883 +3 - Partials 719 720 +1 ``` | [Flag](https://app.codecov.io/gh/karmada-io/karmada/pull/4698/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/karmada-io/karmada/pull/4698/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io) | `52.97% <ø> (-0.02%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

wu0407 commented 1 month ago

This Pr mixes fault self-healing and rescheduling. I think fault self-healing includes rescheduling, similar to when a node crashes, the workload corresponding to the pod on the node will regenerate the pod. This is completed by multiple controllers working together, including a scheduler. If the goal is self-healing, then multiple components need to be considered for coordination. If it is only rescheduling, then only the target of eviction and the conditions for stopping eviction need to be considered. Can we consider the design concept of the Descheduler project in the community

karmada-bot commented 2 weeks ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: To complete the pull request process, please ask for approval from rainbowmango after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files: - **[OWNERS](https://github.com/karmada-io/karmada/blob/master/OWNERS)** Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment