Open zhifanggao opened 3 months ago
Preemption works in this case when the gang plugin is not enabled.
You have disabled preemption in the gang plugin in your config.
Hi, please adjust log level to 5 and paste the logs.
The log level is already 5, and all the logs that refer to vcjob2 are included here. I think the gang plugin returns "all node are unavailable" and then the scheduler session is closed.
Can you paste your vcjobs and queue yaml?
@lowang-bh It does not matter whether enablePreemptable is true or false. When I put gang on the last line of the plugins list, preemption works well.
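For readers following the discussion, the sketch below shows roughly what a scheduler configuration with the gang plugin's preemption disabled looks like. The actual volcano-scheduler.conf was not posted in this thread, so the action list, the plugin set, and their order are assumptions modeled on the commonly shipped Volcano defaults, with the preempt action added.

```yaml
# Hypothetical volcano-scheduler.conf, NOT the reporter's actual file.
# Assumption: the "preempt" action has been added to the default action list.
actions: "enqueue, allocate, preempt, backfill"
tiers:
- plugins:
  - name: priority
  - name: gang
    # The option discussed above; false disables the gang plugin's
    # preemptable callback.
    enablePreemptable: false
  - name: conformance
- plugins:
  - name: overcommit
  - name: drf
    enablePreemptable: false
  - name: predicates
  - name: proportion
  - name: nodeorder
  - name: binpack
```

The workaround described above corresponds to moving the `- name: gang` entry to the end of the plugins list; whether that is the intended fix or only hides the underlying behavior is not settled in this thread.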
@Monokaix
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  annotations:
    meta.helm.sh/release-name: low
    meta.helm.sh/release-namespace: preempt
    volcano.sh/preemptable: "true"
  creationTimestamp: "2024-07-09T06:50:30Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Helm
  name: vc-job1
  namespace: preempt
  resourceVersion: "87091459"
  uid: 8ec099ae-470c-48b4-afdd-af4c4371049b
spec:
  maxRetry: 3
  minAvailable: 1
  policies:
  - action: RestartJob
    event: PodEvicted
  queue: test-kyuubi
  schedulerName: volcano
  tasks:
  - maxRetry: 3
    minAvailable: 1
    name: job1
    policies:
    - action: CompleteJob
      event: TaskCompleted
    replicas: 1
    template:
      metadata:
        annotations:
          volcano.sh/preemptable: "true"
      spec:
        containers:
        - command:
          - sleep
          - 10m
          image: nginx:latest
          imagePullPolicy: IfNotPresent
          name: nginx
          resources:
            limits:
              cpu: "32"
            requests:
              cpu: "32"
        restartPolicy: OnFailure
vcjob3
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  annotations:
    meta.helm.sh/release-name: high
    meta.helm.sh/release-namespace: preempt
    volcano.sh/Preemptable: "true"
  creationTimestamp: "2024-07-09T06:50:41Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Helm
  name: vc-job3
  namespace: preempt
  resourceVersion: "87088972"
  uid: cc16806a-14b5-429d-97f3-136c82c0ba5e
spec:
  maxRetry: 3
  minAvailable: 1
  policies:
  - action: RestartJob
    event: PodEvicted
  priorityClassName: system-cluster-critical
  queue: test-kyuubi
  schedulerName: volcano
  tasks:
  - maxRetry: 3
    minAvailable: 1
    name: job3
    policies:
    - action: CompleteJob
      event: TaskCompleted
    replicas: 1
    template:
      metadata:
        annotations:
          volcano.sh/preemptable: "true"
      spec:
        containers:
        - command:
          - sleep
          - 10m
          image: nginx:latest
          name: nginx
          resources:
            limits:
              cpu: "32"
            requests:
              cpu: "32"
        priorityClassName: system-cluster-critical
        restartPolicy: OnFailure
queue
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  creationTimestamp: "2024-06-05T03:28:14Z"
  generation: 2
  name: test-kyuubi
  resourceVersion: "87091460"
  uid: a764d42f-f8fb-49f5-b281-9b9076bb6973
spec:
  capability:
    cpu: "32"
    memory: 40960000Mi
  reclaimable: true
  weight: 1
What happened: The expected preemption did not happen.
What you expected to happen: Higher-priority jobs can preempt lower-priority jobs when the gang plugin is enabled.
How to reproduce it (as minimally and precisely as possible):
volcano-scheduler.conf
Create a queue test-kyuubi with a capacity of 4 cpu and 512 memory.
Install the lower-priority vcjob1 with a Helm chart, with 2 cpu requests.
Install the higher-priority vcjob2 with a Helm chart, with 3 cpu requests.
Anything else we need to know?: logs:
Environment:
Kubernetes version (kubectl version): 1.23.6
Kernel (uname -a):