volcano-sh / volcano

A Cloud Native Batch System (Project under CNCF)
https://volcano.sh
Apache License 2.0
4.12k stars 953 forks

Can Volcano in a physical k8s cluster also support scheduling for vcluster k8s? #3316

Open haoziwu opened 8 months ago

haoziwu commented 8 months ago

What would you like to be added:

I would like to suggest that Volcano support pod scheduling not only in the physical k8s cluster but also in vcluster k8s, with only one volcano-scheduler deployed in the physical k8s cluster.

Why is this needed:

Many platforms, such as AI or big data platforms, are now built on k8s vcluster technology. If Volcano does not support co-scheduling across the physical cluster and the vclusters, Volcano has to be deployed in every vcluster as well as in the physical k8s cluster. This seems very redundant, and resources are likely to be managed more than once. So I suggest adding features to support this scenario, in which a single volcano-scheduler in the physical k8s cluster schedules all pods, both in the physical k8s cluster and in the vclusters.

Monokaix commented 8 months ago

Hi, can you describe vcluster in more detail? If vcluster's nodes and pods can be discovered by Volcano, then one scheduler can schedule both clusters' resources.

haoziwu commented 7 months ago

@Monokaix Hello, vcluster is described at https://www.vcluster.com/. A vcluster Kubernetes is a guest Kubernetes running on the host physical Kubernetes. The guest Kubernetes actually has only an api-server; its nodes are fake nodes that make use of the host physical Kubernetes nodes. So if you do not deploy Volcano in the guest Kubernetes, it will not recognize the PodGroup or Queue CRDs and so on; if you deploy Volcano in every guest Kubernetes, it seems very redundant.

haoziwu commented 7 months ago

@Monokaix That is to say, unless Volcano is changed for the vcluster scenario, we need to deploy all Volcano components in every guest Kubernetes. I think there is no need to deploy some components repeatedly, especially volcano-scheduler: the scheduler should be a global concept and should exist only in the host physical Kubernetes.

Monokaix commented 7 months ago

I'm not very familiar with vcluster, but according to the doc you mentioned, it seems that vcluster hides the underlying cluster and exists in the form of a namespace. We need to clarify whether vcluster uses one-level or two-level scheduling.

haoziwu commented 7 months ago

vcluster can schedule pods in a vcluster Kubernetes in two ways: 1. pods are scheduled by the vcluster scheduler, then synced to the host physical cluster; 2. pods are synced to the host physical cluster, then scheduled by the host physical scheduler. Method 1 requires deploying all Volcano components in every vcluster Kubernetes, which I think is very redundant.
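
As a rough illustration of the redundancy argument, here is a toy Python sketch that counts how many volcano-scheduler instances each method implies. The function name and the counting model are hypothetical; in particular, it assumes that under method 1 the host cluster also runs its own Volcano, as described in the original request.

```python
def volcano_scheduler_instances(num_vclusters: int, method: int) -> int:
    """Toy model: number of volcano-scheduler deployments needed.

    method 1: pods are scheduled inside each vcluster, then synced to the host.
    method 2: pods are synced to the host, then scheduled by the host scheduler.
    """
    if num_vclusters < 0:
        raise ValueError("num_vclusters must be non-negative")
    if method == 1:
        # Assumption: one scheduler per vcluster plus one on the host cluster.
        return num_vclusters + 1
    if method == 2:
        # A single scheduler on the host cluster handles every synced pod.
        return 1
    raise ValueError("method must be 1 or 2")


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n, volcano_scheduler_instances(n, 1), volcano_scheduler_instances(n, 2))
```

Under this model the count grows linearly with the number of vclusters for method 1 and stays constant for method 2.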

Monokaix commented 7 months ago

How do other components like Istio create CRDs in each cluster? Maybe Volcano can learn from them.

kmadel commented 3 months ago

While method 1 seems redundant for some use cases, it is a valuable way to quickly and efficiently test new versions of schedulers like Volcano without having to spin up another physical cluster. I have tested method 1 and it works great. I have not tested method 2, but there is no reason it should not work. If a K8s Job specifies schedulerName: volcano, then that is what gets synced to the vCluster host cluster and scheduled accordingly. If you are using labels with something like KubeRay RayCluster or RayJob, then you need to make sure they are not renamed by the vCluster syncer, via the syncLabels config; see https://www.vcluster.com/docs/vcluster/configure/vcluster-yaml/experimental/sync-settings#syncSettings-syncLabels
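
For reference, a minimal sketch of what a Job that requests the Volcano scheduler might look like, written as a Python dict and printed as JSON (which kubectl also accepts). The helper name, the job name, the image, and the command are placeholders chosen for illustration; only the schedulerName: volcano setting comes from the comment above, and this has not been tested against a vCluster setup.

```python
import json


def volcano_job_manifest(name, image, command):
    """Build a batch/v1 Job manifest whose pods request the 'volcano' scheduler."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "spec": {
                    # The pod-level field that selects the scheduler.
                    "schedulerName": "volcano",
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": name, "image": image, "command": command}
                    ],
                }
            }
        },
    }


# Placeholder values, for illustration only.
manifest = volcano_job_manifest("demo", "busybox", ["sh", "-c", "echo hello"])

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied inside the vCluster; whether the pod is then picked up by a Volcano scheduler on the host depends on the sync behaviour discussed above.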