kubernetes-sigs / kueue

Kubernetes-native Job Queueing
https://kueue.sigs.k8s.io
Apache License 2.0
1.27k stars, 223 forks

Support Volcano queue in Kueue #1985

Open AllenXu93 opened 3 months ago

AllenXu93 commented 3 months ago

We use Volcano queues for quota scheduling (https://volcano.sh/en/docs/queue/), but Volcano schedules at the pod level. We want to use Kueue to limit pod creation and reduce pressure on the API server. Is there any way to use Kueue together with Volcano queues?

googs1025 commented 2 months ago

Hi! Can you provide more information? Do you mean that you want Kueue's queues to support the Volcano scheduler?

AllenXu93 commented 2 months ago

> Hi! Can you provide more information? Do you mean that you want Kueue's queues to support the Volcano scheduler?

Hi. We already use Volcano queues in our cluster, and we plan to use Kueue for job queueing. The simple way is to install Kueue directly, but then the cluster has many queue-related CRs: Kueue's ClusterQueue and LocalQueue, and Volcano's Queue. Any quota change has to be applied to both systems. So I want to know whether there is any plan for Kueue to support other queue or quota configuration sources, such as Volcano queues or Kubernetes ResourceQuota. For example, if Kueue's scheduler and queue manager worked against a queue interface, we could add plugins that provide other queue configuration sources.

KunWuLuan commented 2 months ago

We have a similar requirement: we use ElasticQuota in our environment, and we need a controller to translate between different quota systems. Maybe we could create a controller and a framework to help create and control the quota systems. @alculquicondor what do you think? I can help build the controller.
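To make the translation idea concrete, here is a minimal sketch of the mapping such a controller might compute. It is an illustration only, not code from Kueue or Volcano. It assumes that a Volcano Queue's `spec.capability` can be treated as a ClusterQueue's `nominalQuota`, which glosses over real semantic differences between the two systems, and that a ResourceFlavor named `default-flavor` (a hypothetical name) already exists.

```python
from typing import Any, Dict


def volcano_queue_to_cluster_queue(
    queue: Dict[str, Any], flavor: str = "default-flavor"
) -> Dict[str, Any]:
    """Build a Kueue ClusterQueue manifest from a Volcano Queue manifest.

    Assumption: only spec.capability is translated. Weight, reclaimable,
    and every other Volcano field are ignored in this sketch.
    """
    capability = queue.get("spec", {}).get("capability", {})
    if not capability:
        raise ValueError("Volcano Queue has no spec.capability to translate")

    # Sort resource names so the generated manifest is deterministic.
    names = sorted(capability)
    return {
        "apiVersion": "kueue.x-k8s.io/v1beta1",
        "kind": "ClusterQueue",
        "metadata": {"name": queue["metadata"]["name"]},
        "spec": {
            "namespaceSelector": {},  # empty selector: match all namespaces
            "resourceGroups": [
                {
                    "coveredResources": names,
                    "flavors": [
                        {
                            "name": flavor,  # hypothetical ResourceFlavor name
                            "resources": [
                                {"name": n, "nominalQuota": capability[n]}
                                for n in names
                            ],
                        }
                    ],
                }
            ],
        },
    }
```

A real controller would also have to watch the source objects, apply the generated manifest, and decide which side wins on conflict; none of that is shown here.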

tenzen-y commented 2 months ago

We had a similar discussion about ResourceQuota here: https://github.com/kubernetes-sigs/kueue/issues/696. The conclusion was that cluster admins should implement admission check controllers for ResourceQuota themselves. @AllenXu93 @KunWuLuan So I think you can take the same approach for Volcano queues and ElasticQuota.
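As a rough sketch of the decision such an admission check controller might make, the function below compares a workload's request against the free capacity reported by an external quota system. The function name, the integer units, and the way capacity and usage are passed in are all assumptions for illustration; the returned strings are assumed to mirror Kueue's `Ready` and `Retry` admission check states, and this is not the Kueue controller API.

```python
from typing import Dict

# Assumed to mirror two of Kueue's AdmissionCheck states.
READY = "Ready"
RETRY = "Retry"


def check_external_quota(
    requested: Dict[str, int],
    capacity: Dict[str, int],
    in_use: Dict[str, int],
) -> str:
    """Return READY if every requested resource fits in the external quota.

    All quantities are plain integers in the same unit per resource
    (for example millicores for cpu); unit parsing is out of scope here.
    """
    for resource, amount in requested.items():
        # A resource missing from the external quota counts as zero capacity.
        free = capacity.get(resource, 0) - in_use.get(resource, 0)
        if amount > free:
            return RETRY
    return READY
```

For example, a request of 2000 millicores against a queue with 8000 capacity and 7000 in use would come back as `Retry`, since only 1000 are free.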

AllenXu93 commented 2 months ago

> We had a similar discussion about ResourceQuota here: #696. The conclusion was that cluster admins should implement admission check controllers for ResourceQuota themselves. @AllenXu93 @KunWuLuan So I think you can take the same approach for Volcano queues and ElasticQuota.

Thanks for your advice! An admission check can help with Volcano queues or ElasticQuota, but we would still need to maintain multiple CRs. For example, to change the quota of queue cq-a, I need to modify both the ClusterQueue and the Volcano Queue (or ElasticQuota) CR. Is there any plan to make the queue source extensible, so that we can write queue plugins?

KunWuLuan commented 2 months ago

@tenzen-y

> The conclusion was that cluster admins should implement admission check controllers for ResourceQuota themselves.

Thanks. I think maintaining two types of quota with the same semantics is challenging, so maybe an admission check is not what I need.

tenzen-y commented 2 months ago

>> We had a similar discussion about ResourceQuota here: #696. The conclusion was that cluster admins should implement admission check controllers for ResourceQuota themselves. @AllenXu93 @KunWuLuan So I think you can take the same approach for Volcano queues and ElasticQuota.

> Thanks for your advice! An admission check can help with Volcano queues or ElasticQuota, but we would still need to maintain multiple CRs. For example, to change the quota of queue cq-a, I need to modify both the ClusterQueue and the Volcano Queue (or ElasticQuota) CR. Is there any plan to make the queue source extensible, so that we can write queue plugins?

Yes, your understanding is correct. In that situation, you need to maintain multiple custom resources to manage quotas. As we mentioned in the ResourceQuota discussion, ideally all quotas would be managed by Kueue, and admission check controllers for other quota management systems are interim approaches. So you can avoid double management once you remove the dependency on Volcano.

tenzen-y commented 2 months ago

> @tenzen-y

>> The conclusion was that cluster admins should implement admission check controllers for ResourceQuota themselves.

> Thanks. I think maintaining two types of quota with the same semantics is challenging, so maybe an admission check is not what I need.

As I mentioned above, an admission check controller for another quota management system is an interim approach. So we would recommend fully migrating to Kueue.

alculquicondor commented 2 months ago

> Maybe we could create a controller and a framework to help create and control the quota systems. @alculquicondor what do you think? I can help build the controller.

It's something you can explore, but I wouldn't support having volcano APIs as dependencies of Kueue.

I have heard of people using kueue+volcano, but I think the approach they take is to leave quota management fully to Kueue, and use volcano only for the "gang-scheduling" capability.

AllenXu93 commented 2 months ago

>> Maybe we could create a controller and a framework to help create and control the quota systems. @alculquicondor what do you think? I can help build the controller.

> It's something you can explore, but I wouldn't support having volcano APIs as dependencies of Kueue.

> I have heard of people using kueue+volcano, but I think the approach they take is to leave quota management fully to Kueue, and use volcano only for the "gang-scheduling" capability.

Yes, I prefer a single, all-in-one queue quota such as Kueue's; pod scheduling should not be aware of quota. But the problem is that Kueue can't confirm that an admitted job's pods actually get scheduled (by either Volcano or the default scheduler).

tenzen-y commented 2 months ago

>>> Maybe we could create a controller and a framework to help create and control the quota systems. @alculquicondor what do you think? I can help build the controller.

>> It's something you can explore, but I wouldn't support having volcano APIs as dependencies of Kueue. I have heard of people using kueue+volcano, but I think the approach they take is to leave quota management fully to Kueue, and use volcano only for the "gang-scheduling" capability.

> Yes, I prefer a single, all-in-one queue quota such as Kueue's; pod scheduling should not be aware of quota. But the problem is that Kueue can't confirm that an admitted job's pods actually get scheduled (by either Volcano or the default scheduler).

If you enable waitForPodsReady, Kueue re-queues admitted jobs whose pods do not become ready within the configured timeout. Please see the docs for more details: https://kueue.sigs.k8s.io/docs/tasks/manage/setup_sequential_admission/
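The behavior described here can be pictured with a toy model. This is a simplified sketch, not Kueue's implementation: the class, field, and function names are hypothetical, and the timeout is passed in explicitly rather than read from Kueue's configuration.

```python
from dataclasses import dataclass


@dataclass
class AdmittedWorkload:
    name: str
    admitted_at: float        # admission time, in seconds
    pods_ready: bool = False  # True once all of the job's pods are ready


def next_action(workload: AdmittedWorkload, now: float, timeout_seconds: float) -> str:
    """Decide what to do with an admitted workload at time `now`.

    Returns "keep" when the pods are ready, "requeue" when they are still
    not ready after the timeout, and "wait" otherwise.
    """
    if workload.pods_ready:
        return "keep"
    if now - workload.admitted_at >= timeout_seconds:
        return "requeue"
    return "wait"
```

With a 300-second timeout, a workload admitted at t=0 whose pods are still pending at t=120 gets `wait`, and the same workload at t=300 gets `requeue`, which returns its quota to the queue so other jobs can be admitted.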