kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0
110.1k stars 39.41k forks source link

Move scheduler Score plugin weight defaulting to API #106426

Open damemi opened 2 years ago

damemi commented 2 years ago

The scheduler framework instantiation currently handles defaulting the weights of Score plugins, as well as checking that the total weight is not more than the maximum:

https://github.com/kubernetes/kubernetes/blob/9dc0489/pkg/scheduler/framework/runtime/framework.go#L292-L306

this should be moved to the API defaulting itself to catch this earlier and be more secure /sig scheduling /kind feature

k8s-ci-robot commented 2 years ago

@damemi: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
kerthcet commented 2 years ago

I can give some help. /assign

NikhilSharmaWe commented 2 years ago

@damemi I would like to collaborate. Could you please elaborate and tell in what files the changes should be made for handling the weight of Score Plugins in API Defaulting.

damemi commented 2 years ago

@NikhilSharmaWe it looks like @kerthcet has already offered to work on this, but if you would like to collaborate on it feel free. The code in question is located in https://github.com/kubernetes/kubernetes/blob/9dc0489/pkg/scheduler/framework/runtime/framework.go#L292-L306, and we are interested in moving it to the actual API defaulting sections, probably https://github.com/kubernetes/kubernetes/blob/9dc0489c1ae3540453d94348efcc536585c31805/pkg/scheduler/apis/config/v1beta2/defaults.go

kerthcet commented 2 years ago

Yes, I'm working on this issue also, but I haven't push commits yet for I still have some questions, let's discuss about it here. Correct me if any mis-understandings.

Since we'll merge out-of-tree plugins here, I think move the codes to default.go will miss some plugins.

And I prefer to move the code here, as we have already validate the QueueSort Plugins here. WDYT @damemi

damemi commented 2 years ago

@kerthcet I don't think out-of-tree plugins will miss the validation if we move it to API. This is going to be agnostic of the specific plugin types, validating that there is just a weight set for each score plugin and that those weights aren't more than the max score. Another possible location could be here: https://github.com/kubernetes/kubernetes/blob/9dc0489c1ae3540453d94348efcc536585c31805/pkg/scheduler/apis/config/validation/validation.go

I also don't think moving the code to profile.go is what we want either. We're validating QueueSort plugins in that spot because we need to check across each individual profile("framework") to make sure there are no conflicts. The reason that check is in that spot is because that's after each framework/profile has been initialized. But, we could check the score weights on individual profiles as they're created. Does that make sense?

Ultimately, the goal isn't just to move this code somewhere else. We specifically want it in API validation/defaulting to pick up that early, stronger validation. If we can't do that, we don't have much reason to move it elsewhere

kerthcet commented 2 years ago

@kerthcet I don't think out-of-tree plugins will miss the validation if we move it to API. This is going to be agnostic of the specific plugin types, validating that there is just a weight set for each score plugin and that those weights aren't more than the max score. Another possible location could be here: https://github.com/kubernetes/kubernetes/blob/9dc0489c1ae3540453d94348efcc536585c31805/pkg/scheduler/apis/config/validation/validation.go

I also don't think moving the code to profile.go is what we want either. We're validating QueueSort plugins in that spot because we need to check across each individual profile("framework") to make sure there are no conflicts. The reason that check is in that spot is because that's after each framework/profile has been initialized. But, we could check the score weights on individual profiles as they're created. Does that make sense?

yes, I double-checked the code and I found I misunderstood the intention of function merge(). Thanks for your explanations, I'll push a commit tomorrow and some tests are needed too.

kerthcet commented 2 years ago

hi @damemi , I found it's easy to handle plugins in KubeSchedulerProfile.Plugins.Score.Enabled since all the plugins are implements of ScoreExtensionPoint , however , we can not make sure what extension point these plugins implemented in KubeSchedulerProfile.Plugins.MultiPoint.Enabled, for only score plugins using these weights. do you have any idea?

damemi commented 2 years ago

@kerthcet that is the challenge with this, and I'm open to suggestions on how we could implement it. You might have to look into doing a similar loop to try to cast all of the MultiPoint plugins to a ScorePlugin, and if that succeeds then check for the weights.

kerthcet commented 2 years ago

thanks, I'll have a try then.

damemi commented 2 years ago

@kerthcet any updates or progress on this?

kerthcet commented 2 years ago

Sorry, I just return from holidays and I talked with @NikhilSharmaWe , he is free to continue his work and I can provide some help if needed.

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

kerthcet commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

kerthcet commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

kerthcet commented 1 year ago

/remove-lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

kerthcet commented 1 year ago

/remove-lifecycle rotten Still helpful

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

kerthcet commented 1 year ago

/remove-lifecycle stale /kind help

k8s-ci-robot commented 1 year ago

@kerthcet: The label(s) kind/help cannot be applied, because the repository doesn't have them.

In response to [this](https://github.com/kubernetes/kubernetes/issues/106426#issuecomment-1538065516): >/remove-lifecycle stale >/kind help Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
k8s-triage-robot commented 8 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 7 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

sanposhiho commented 7 months ago

/remove-lifecycle rotten

I guess @kerthcet is working on it.

k8s-triage-robot commented 4 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

kerthcet commented 4 months ago

/remove-lifecycle stale

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 week ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten