Closed: alculquicondor closed this 2 months ago
/assign @vladikkuzn
To clarify, this counter should increment for every workload that is preempted.
In this case we can just extend
and add an additional label for the preemption scope.
Yes, indeed, that would be useful.
But this counter is from the point-of-view of the preemptee CQ.
The request is from the point-of-view of the preemptor CQ.
... that is a bit different, so count the preemptees but group by the preemptor's CQ name. We could add yet another metric label, "preemptor_cluster_queue", but we could end up creating too many metric data points.
Preemption is one of the few actions that involves two entities. We could also have one metric that has both ClusterQueues as labels, but that could cause a cardinality explosion. Having one metric for each side sounds like a reasonable compromise.
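To put rough numbers on the cardinality concern, here is a back-of-the-envelope sketch. It computes worst-case upper bounds only (real series appear only for label combinations that actually occur), and the function name and the example figures of 50 ClusterQueues and 4 reasons are illustrative assumptions, not values from this issue.

```python
def series_count(num_cluster_queues: int, num_reasons: int) -> dict:
    """Worst-case number of time series for the two labelling options."""
    # Option 1: a single metric labelled with BOTH the preemptor and the
    # preemptee ClusterQueue, plus a reason label.
    combined = num_cluster_queues * num_cluster_queues * num_reasons
    # Option 2: two metrics, each labelled with ONE ClusterQueue
    # (one from the preemptor's side, one from the preemptee's side).
    split = 2 * num_cluster_queues * num_reasons
    return {"combined": combined, "split": split}


# Illustrative numbers only: 50 ClusterQueues, 4 preemption reasons.
print(series_count(50, 4))  # {'combined': 10000, 'split': 400}
```

The combined option grows quadratically with the number of ClusterQueues, while one metric per side grows linearly, which is the compromise described above.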
What would you like to be added:
A metric that counts how many preemptions a ClusterQueue has issued, broken down by reason: preemption within the ClusterQueue, reclamation, fair sharing, or priority threshold.
This is somewhat the opposite direction of evicted_workloads_total, but focused on preemption.
Why is this needed:
Improve observability.
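As a rough illustration of the requested metric, the sketch below models a counter keyed by the preemptor's ClusterQueue and a reason label, incremented once per preempted workload as discussed in the comments. It is a plain-Python stand-in, not Kueue's implementation: the class name, method names, and reason strings are all hypothetical, chosen only to mirror the four cases listed in the request.

```python
from collections import Counter

# Hypothetical reason label values mirroring the four cases in the request.
REASONS = ("in_cluster_queue", "reclamation", "fair_sharing", "priority_threshold")


class PreemptionCounter:
    """Counts preempted workloads from the preemptor ClusterQueue's point of view."""

    def __init__(self) -> None:
        # Key: (preempting ClusterQueue name, reason) -> preempted workloads.
        self._counts: Counter = Counter()

    def report(self, preempting_cluster_queue: str, reason: str,
               preempted_workloads: int = 1) -> None:
        if reason not in REASONS:
            raise ValueError(f"unknown preemption reason: {reason!r}")
        # Increment once for every workload that was preempted.
        self._counts[(preempting_cluster_queue, reason)] += preempted_workloads

    def value(self, preempting_cluster_queue: str, reason: str) -> int:
        # Counter returns 0 for label combinations never reported.
        return self._counts[(preempting_cluster_queue, reason)]


counter = PreemptionCounter()
counter.report("team-a", "reclamation", preempted_workloads=2)
counter.report("team-a", "in_cluster_queue")
```

Here `counter.value("team-a", "reclamation")` reflects the two workloads preempted by that one reclamation, while the preemptee-side view stays with the existing eviction counter.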
Completion requirements:
This enhancement requires the following artifacts:
The artifacts should be linked in subsequent comments.