volcano-sh / volcano

A Cloud Native Batch System (Project under CNCF)
https://volcano.sh
Apache License 2.0
4.09k stars 948 forks source link

supoort preemptionPolicy #3642

Open hwdef opened 1 month ago

hwdef commented 1 month ago

What would you like to be added:

priorityclass has a field named preemptionPolicy. https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#non-preempting-priority-class

volcano didn't implement its logic.

solution

In volcano's cache, priorityclas is just a number, so we need to modify the priorityclass in the cache and then implement preemptionpolicy-related logic.

https://github.com/volcano-sh/volcano/blob/c91e42e07b55fbc0686095aedb0f9436b1991163/pkg/scheduler/api/job_info.go#L121

googs1025 commented 1 month ago

/cc

Monokaix commented 1 month ago

/good-first-issue

volcano-sh-bot commented 1 month ago

@Monokaix: This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-good-first-issue command.

In response to [this](https://github.com/volcano-sh/volcano/issues/3642): >/good-first-issue Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
googs1025 commented 1 month ago

/assign

googs1025 commented 1 month ago

I will find time to research and try to implement

Sianao commented 1 month ago

/assign

googs1025 commented 1 month ago

@Sianao Hello, I am currently working on this issue. Can you find other issues? Or do you have any special requirements for this issue?

Sianao commented 1 month ago

@Sianao Hello, I am currently working on this issue. Can you find other issues? Or do you have any special requirements for this issue?

I unassign it already

googs1025 commented 2 weeks ago

@Sianao Do you want to try this issue? I'm currently busy with other projects and don't have time to do this. :) If you want, feel free to take it. I will review and help in the PR /unassign

Monokaix commented 1 week ago

/assign @Sianao

Sianao commented 1 week ago

/unassign

Sianao commented 1 week ago

Sir , I m busy , too .

JesseStutler commented 6 days ago

I'm glad to work on this.

JesseStutler commented 6 days ago

/assign

JesseStutler commented 5 days ago

@hwdef Hi, Do we only need to support preemptionPolicy at the pod level? Not to add more fields in job_info. So that we only need to add some logic in preempting, like this: image If the preemptionPolicy is 'Never', not allowing the pod to preempt others, directly skip this pod.

Monokaix commented 5 days ago

We should also consider this case: https://github.com/volcano-sh/volcano/issues/3512 @hwdef @JesseStutler

lowang-bh commented 4 days ago

I think we can keep the property of priority as a number, and add extra info about preemptionPolicy in taskInfo or JobInfor when parse it.

JesseStutler commented 4 days ago

@lowang-bh I think get the preeemptionPolicy from pod spec in taskInfo is enough? K8s default scheduler only use the preemptionPolicy https://github.com/kubernetes/kubernetes/blob/master/pkg/scheduler/framework/plugins/defaultpreemption/default_preemption.go#L240-L242 here to judge whether the preemptionPolicy is Never. So there is no need to add extra field I think if we only consider implementing preemptionPolicy at the pod level.

There is the scenario we need to make sure that is whether we need to support preemptionPolicy at the vcjob level(Put field into JobInfo), because vcjob also has priorityClass field. But when vc-controller creates the pod, the PriorityClass in the pod spec was not set to the same value as vcjob, just use the priorityClass specified in the template, we need to discuss how to implement this, or we don't need to implement preemptionPolicy at the vcjob level. @hwdef

hwdef commented 4 days ago

You are right. If you only implement preemptionpolicy for pods, you do not need to add other fields. But I think it's better implement preemptionPolicy for vcjob.

Monokaix commented 4 days ago

You are right. If you only implement preemptionpolicy for pods, you do not need to add other fields. But I think it's better implement preemptionPolicy for vcjob.

What I'm concern is that do we really have a use case for supporting preemptionPolicy for vcjob, becuase the field is just derived from kube-scheduler and is just for pod before: )

JesseStutler commented 4 days ago

If we support preemptionPolicy for vcjob, becasuse preemptionPolicy field comes from priorityclass, I think all pods in vcjob should set the same PriorityClassName with vcjob. In kube-apiserver, there is a plugin called Priority will set the pod's preemptionPolicy field: https://github.com/kubernetes/kubernetes/blob/master/plugin/pkg/admission/priority/admission.go#L193-L200 So we only need to modify vc-controller to set the pod template's PriorityClassName when creating pod, scheduler only need to add logic in preempt and reclaim to skip if the PreemtionPolicy is Never, like this:

if task.Pod.Spec.PreemptionPolicy != nil && *task.Pod.Spec.PreemptionPolicy == v1.PreemptNever {
            klog.V(3).Infof("Task %s/%s is not eligible to preempt other tasks due to preemptionPolicy is Never", task.Namespace, task.Name)
            ...
        }
hwdef commented 3 days ago

You are right. If you only implement preemptionpolicy for pods, you do not need to add other fields. But I think it's better implement preemptionPolicy for vcjob.

What I'm concern is that do we really have a use case for supporting preemptionPolicy for vcjob, becuase the field is just derived from kube-scheduler and is just for pod before: )

I think the answer is yes, we currently have such a scenario. I implemented it through annotation.