kubernetes-sigs / kueue

Kubernetes-native Job Queueing
https://kueue.sigs.k8s.io
Apache License 2.0
1.38k stars 249 forks source link

Partial Admission Preemption Panic #2799

Closed gabesaba closed 1 month ago

gabesaba commented 2 months ago

What happened: While refactoring Preemptor.GetTargets, I encountered a panic. Furthermore, I discovered that we are not exercising the PartialAdmission preemption logic in any of our tests, and it may be broken and panicking generally.

This test exits here, as we find a preemption target above. The PartialAdmission preemption path is not tested.

In this test, we exit here. We panic when calling assignment.TotalRequestsFor, as this map is empty

How to reproduce it (as minimally and precisely as possible):

Move this line to the top of Preemptor.GetTargets and observe a panic during this test https://github.com/kubernetes-sigs/kueue/blob/00111d9aada59e1a6d8d90b9847cf7451bc5be76/pkg/scheduler/preemption/preemption.go#L116

Delete these lines and observe that unit and integration tests still pass https://github.com/kubernetes-sigs/kueue/blob/00111d9aada59e1a6d8d90b9847cf7451bc5be76/pkg/scheduler/scheduler.go#L477-L481

gabesaba commented 2 months ago

cc @mimowo @mbobrovskyi

mbobrovskyi commented 2 months ago

/assign @trasc