Open ad2001 opened 2 months ago
First of all, `gang` means "all or nothing".
To achieve preemption when `enablePreemptable` is true, the low-priority job's `minAvailable` should be less than its total number of tasks. This ensures that there are tasks available to be preempted. The remaining tasks beyond `minAvailable` can be considered "elastic" tasks, which can be preempted when needed.
The "elastic" tasks can be preempted by a single job at once or preempted by multiple jobs as required. This flexibility allows for efficient resource management and allocation based on demand.
Preempting an entire low-priority job is not supported right now, but that has nothing to do with your situation. If you want all of the low-priority job's pods to be preemptable when `enablePreemptable` is true, setting its `minAvailable` to zero is OK.
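As a rough illustration of the "elastic" tasks described above, the sketch below counts how many tasks of a job sit beyond `minAvailable`. The function name `elastic_task_count` is hypothetical and is not part of Volcano; it only restates the rule from this comment.

```python
def elastic_task_count(total_tasks: int, min_available: int) -> int:
    """Number of tasks beyond min_available, i.e. the ones gang would allow to be preempted.

    Hypothetical helper for illustration only; not Volcano code.
    """
    return max(0, total_tasks - min_available)


if __name__ == "__main__":
    # The MPIJob from the question has 3 tasks: 1 launcher + 2 workers.
    for min_available in (3, 1, 0):
        count = elastic_task_count(3, min_available)
        print(f"minAvailable={min_available}: {count} elastic task(s)")
```

With `minAvailable` equal to the task count there are no elastic tasks at all, which is why nothing can be preempted in the reported setup; with `minAvailable` set to zero every task becomes a candidate.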
Please provide an in-depth description of the question you have: When resources are constrained, I would like high-priority jobs to preempt all pods from one or more low-priority jobs when `gang`, `priority`, and `preempt` are enabled (see config below).

For example, I have 5 CPUs and one running low-priority MPIJob that requires 1 launcher (1 CPU) and 2 workers (1 CPU each), with `minAvailable` set to 3. At this point, only 2 CPUs are left. When I submit the same MPIJob with high priority, I would expect all 3 pods created for the low-priority job to be evicted. However, Volcano seems to evict only enough pods to fulfill the `minAvailable` of the high-priority job. That means that 1 of the pods from the low-priority job is evicted while the other pods keep running. This behavior happens when `enablePreemptable` is set to `false` for the `gang` plugin.

If `enablePreemptable` is set to `true` for `gang`, then none of the pods from the low-priority job are evicted, because of the `minAvailable` checks in the `preemptableFn` in the `gang` plugin here. Is this the expected behavior?
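The arithmetic of this scenario can be sketched as follows. This is a simplified model of the behavior reported above, not Volcano's scheduler code: the function `pods_evicted` and its parameters are hypothetical, every pod is assumed to need the same amount of CPU, and the gang check is reduced to "never evict below `minAvailable`".

```python
def pods_evicted(total_cpus: int, low_pods: int, low_min_available: int,
                 high_min_available: int, gang_check: bool,
                 cpu_per_pod: int = 1) -> int:
    """Estimate how many low-priority pods get evicted (hypothetical model)."""
    # CPUs still free while the low-priority job is running.
    free = total_cpus - low_pods * cpu_per_pod
    # CPUs the high-priority job still lacks to reach its minAvailable.
    shortfall = max(0, high_min_available * cpu_per_pod - free)
    # Pods that must go to cover the shortfall (ceiling division).
    needed = -(-shortfall // cpu_per_pod)
    # With the gang check on, only pods beyond minAvailable may be evicted.
    allowed = max(0, low_pods - low_min_available) if gang_check else low_pods
    return min(needed, allowed)


if __name__ == "__main__":
    # 5 CPUs, low-priority job with 3 pods and minAvailable=3,
    # high-priority job that needs 3 pods.
    print(pods_evicted(5, 3, 3, 3, gang_check=False))
    print(pods_evicted(5, 3, 3, 3, gang_check=True))
```

In this model, turning the gang check off evicts only the single pod needed to cover the 1-CPU shortfall, and turning it on evicts nothing because the low-priority job has no pods beyond its `minAvailable`. Neither case evicts the whole job, which matches the two observations in the question.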
with job like
What do you think about this question?: I would like to be able to preempt the entire low-priority job when a high-priority job is in the queue in a resource-constrained cluster. Without such preemption behavior, jobs that need more resources can be starved by smaller, lower-priority jobs that need only a few resources.
Environment:
Kubernetes version (`kubectl version`): 1.24.17
Kernel (`uname -a`):
`gang` enabled to use volcano.