karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0
4.12k stars 806 forks source link

custom enable or disable of scheduler plugins #2135

Closed chaunceyjiang closed 1 year ago

chaunceyjiang commented 1 year ago

Signed-off-by: chaunceyjiang chaunceyjiang@gmail.com

What type of PR is this? /kind feature

What this PR does / why we need it: custom enable or disable of scheduler plugins

Which issue(s) this PR fixes: Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

`karmada-scheduler`: Introduced `--plugins` flag to enable or disable scheduler plugins 
chaunceyjiang commented 1 year ago
XiShanYongYe-Chang commented 1 year ago

Thanks~

/assign

chaunceyjiang commented 1 year ago

Hi,how's the progress? @XiShanYongYe-Chang

XiShanYongYe-Chang commented 1 year ago

Hi,how's the progress? @XiShanYongYe-Chang

Sorry. I'm going to start reviewing now.

chaunceyjiang commented 1 year ago

/cc @XiShanYongYe-Chang @Poor12

Poor12 commented 1 year ago

/lgtm

XiShanYongYe-Chang commented 1 year ago

/lgtm

/cc @RainbowMango @likakuli

karmada-bot commented 1 year ago

@XiShanYongYe-Chang: GitHub didn't allow me to request PR reviews from the following users: likakuli.

Note that only karmada-io members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to [this](https://github.com/karmada-io/karmada/pull/2135#issuecomment-1178484336): >/lgtm > >/cc @RainbowMango @likakuli Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
RainbowMango commented 1 year ago

By the way, given the plugin name will be exposed to users, I'd like to request a final review of the plugin names. https://github.com/karmada-io/karmada/blob/821a18542e57ae5ecf5fdc44bd3fa14c586f4a66/pkg/scheduler/framework/plugins/registry.go#L15-L19

Any idea would be welcome!

chaunceyjiang commented 1 year ago

which plugin you want to disable?

Not yet. In the karmada meeting a few days ago, I seemed to hear you say that you need to make the scheduling plugin configurable. ٩( 'ω' )و

chaunceyjiang commented 1 year ago

I'd like to request a final review of the plugin names.

At present, I don't have any good ideas, we can open a new issue to track it

RainbowMango commented 1 year ago

Yes, thanks. I heard from @likakuli and @prodanlabs :)

likakuli commented 1 year ago

cc @likakuli Who might need it?

yeah, this is one of what we need, another one is to use taint filter instead of directly filtering unready clusters, thx

There is another pr to allow out-of-tree plugins but can't disable default plugins. https://github.com/karmada-io/karmada/pull/1663

prodanlabs commented 1 year ago

I personally prefer the --disable-plugins flag, FYI.

prodanlabs commented 1 year ago

I am thinking that we can provide a karmada-scheduler with only basic plugins, and users can build their own scheduler based on this karmada-scheduler according to their own needs.

We allow karmada to have multiple schedulers. When distributing workloads, a field is used to declare which scheduler to use, such as schedulerName: demo-scheduler

what do you think @RainbowMango @Garrybest @kerthcet

RainbowMango commented 1 year ago

yeah, this is one of what we need, another one is to use taint filter instead of directly filtering unready clusters, thx

I remember that(#1854 ), it still on my TODO list :)

RainbowMango commented 1 year ago

I am thinking that we can provide a karmada-scheduler with only basic plugins, and users can build their own scheduler based on this karmada-scheduler according to their own needs.

Given not all users need to extend the scheduler on their own, I prefer to enhance scalability(like this PR) and make it easy for users to choose.

Garrybest commented 1 year ago

Where the field schedulerName exist? In PropagationPolicy?

RainbowMango commented 1 year ago

Where the field schedulerName exist? In PropagationPolicy?

https://github.com/karmada-io/karmada/blob/821a18542e57ae5ecf5fdc44bd3fa14c586f4a66/pkg/apis/policy/v1alpha1/propagation_types.go#L85

But the scheduler now does not use it yet. I guess @prodanlabs did it by himself.

Garrybest commented 1 year ago

Cool, I like this feature.

kerthcet commented 1 year ago

From a high level perspective, managing configurations via flags poses certain problems like they are unversioned, the maintenance burden grows as the number of flags grows, flags exist in a flat namespace which is hard to organize structuring configurations.

Considering we have quite a number of configations in karmada scheduler, and we like to have more in the future, I think we should migrate to Conponent Config gradually. Scheduler plugins can also be part of this.

I think this is where we want to forward, but now, back to the reality, we have rarely plugins in karmada, what we settled in this PR can be a transition plan, since many users are eager for this IIRC.

Regarding to @prodanlabs proposal, different scheduler for different scenario makes sense to me, and we also have a entrance already. If we're planning to move forward, I think Component Config would be a better choice.

prodanlabs commented 1 year ago

Regarding the issue of multiple schedulers, we can open another issues discussion.

chaunceyjiang commented 1 year ago

So, for this PR, is there anything else I need to update?

@RainbowMango

kerthcet commented 1 year ago

I have several questions:

  1. If we don't set --plugins, what's the result, it seems like no plugins is registered?
  2. If not all plugins defined in karmada enabled by default, like we have plugin sets A, B, C, D1, D2, we enable A, B, C, D1 by default, but now, we want A, B, C, D2 enabled, D1 disabled, how to cover this, --plugisn="A,B,C,D2,-D1"?
chaunceyjiang commented 1 year ago

If we don't set --plugins, what's the result, it seems like no plugins is registered?

All plugins are enabled by default image

RainbowMango commented 1 year ago

@chaunceyjiang Just opened #2170 for the naming thing.

So, for this PR, is there anything else I need to update?

I'll get back to this PR after #2170. I'll look at if we have another way to implement it just like I mentioned at https://github.com/karmada-io/karmada/pull/2135#discussion_r916664527.

RainbowMango commented 1 year ago

/assign

karmada-bot commented 1 year ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/karmada-io/karmada/blob/master/OWNERS)~~ [RainbowMango] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
chaunceyjiang commented 1 year ago

There are resource constraints on GitHub actions in each repo. Now there are too many E2E kind clusters, and contributors are relatively active during the day. The probability of E2E error is too high. It is suggested to reduce the number of E2E kind clusters. Can we cancel the E2E test (v1.21.10)

RainbowMango commented 1 year ago

There are resource constraints on GitHub actions in each repo.

Really? Where did you see that?

The probability of E2E error is too high. It is suggested to reduce the number of E2E kind clusters. Can we cancel the E2E test (v1.21.10)

I know @XiShanYongYe-Chang and @mrlihanbo are trying to figure out the reason, but don't really have a clue yet. Yeah, we can do that if it is the root cause.

chaunceyjiang commented 1 year ago

https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources

https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration#usage-limits

@RainbowMango

RainbowMango commented 1 year ago

Yeah, I know the resource limit, but we haven't reached the limit, right?

The e2e tests(v1.21,v1.22,v1.23,v1.24) are run in different virtual machines, they don't affect each other in theory.

XiShanYongYe-Chang commented 1 year ago

Hi @chaunceyjiang, our CI is far from reaching the limit of github action. Currently, the failed CIs are mainly in the v1.24 k8s cluster, the cause of the failure is that karmada-apiserver restarts multiple times. Removing the v1.21 k8s cluster may not solve the problem.

We have not figured out the root cause of the CI failure of v1.24. k8s cluster. Can you help figure it out together?