kubeflow / common

Common APIs and libraries shared by other Kubeflow operator repositories.
Apache License 2.0
51 stars 73 forks source link

Supplement more information about tfjob into podgroup when `enable-gang-scheduler` is set `true` in tf-operator #112

Open jiangkaihua opened 3 years ago

jiangkaihua commented 3 years ago

When enable-gang-scheduler=true, tf-operator will create CRD podgroup to permit gang scheduler volcano to allocate the pods. but when createing pod in func SyncPodGroup: https://github.com/kubeflow/common/blob/3fbe0ce982691279357e33e573a93d8ce4254584/pkg/controller.v1/common/job_controller.go#L211

    createPodGroup := &v1beta1.PodGroup{
        ObjectMeta: metav1.ObjectMeta{
            Name: job.GetName(),
            OwnerReferences: []metav1.OwnerReference{
                *jc.GenOwnerReference(job),
            },
        },
        Spec: v1beta1.PodGroupSpec{
            MinMember: minAvailable.IntVal,
        },
    }
    createdPodGroup, err := volcanoClientSet.SchedulingV1beta1().PodGroups(job.GetNamespace()).Create(createPodGroup)

Only pod infos and minMember is set in podgroup, which resulting to function missing, as well as unpredicatable bugs during allocation.

For example, since no minResources field is filled in podgroup, gang scheduler volcano cannot diff tfjobs from bestEffort jobs as both of the two jobs owns nil minResources, causing all tfjobs can be inqueue and action enqueue , reserve lose effort.

So in my opinion, we need to supplement more infos about tfjob into podgroup, such as minMember, queue as well as other fields, so as to make sure gang scheduler workers correctly.

gaocegege commented 3 years ago

SGTM.