Closed tenzen-y closed 1 year ago
/assign @alculquicondor
Do we have the same behavior in training-operator?
Question about the coscheduling plugin: I suppose that if minMember increases after pods have already been scheduled, those pods won't be disrupted, but new pods will be blocked from scheduling until the new minMember threshold is reached. Correct?
> do we have the same behavior in training-operator?

Yes, the training operator updates PodGroups when schedulingPolicy is updated.
> But new pods will be blocked from scheduling until the new minMember threshold is reached. Correct?

Do new pods mean the case of increased worker replicas of MPIJob (scale out)?
> Do new pods mean the case of increased worker replicas of MPIJob (scale out)?

Yes.
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: alculquicondor
> Do new pods mean the case of increased worker replicas of MPIJob (scale out)? yes

Correct.
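To make the behavior discussed above concrete, here is a minimal sketch of the quorum check as described in this thread. It is a simplified model, not the coscheduling plugin's code: `podGroupState` and `PendingAdmitted` are hypothetical names, and the real plugin tracks more state (timeouts, waiting pods, and so on).

```go
package main

import "fmt"

// podGroupState is a hypothetical, simplified view of a gang.
// It is NOT a type from scheduler-plugins.
type podGroupState struct {
	MinMember int // quorum required before pending pods may be bound
	Scheduled int // pods already bound to nodes; this check never evicts them
	Pending   int // pods that exist but are not yet bound
}

// PendingAdmitted reports whether pending pods may proceed to binding.
// Assumption: the gang is admitted once scheduled+pending reaches MinMember.
func (s podGroupState) PendingAdmitted() bool {
	return s.Scheduled+s.Pending >= s.MinMember
}

func main() {
	// Two workers are running, then minMember is raised from 2 to 4.
	s := podGroupState{MinMember: 4, Scheduled: 2, Pending: 1}
	fmt.Println(s.PendingAdmitted()) // false: only 3 of 4 members exist

	// A second new worker appears after scale out; the quorum is now met.
	s.Pending = 2
	fmt.Println(s.PendingAdmitted()) // true
}
```

Note that `Scheduled` is only read, never decreased, which mirrors the point that already-running pods are not disrupted when the threshold goes up.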
did you forget to push?
@alculquicondor Sorry. I forgot to push.
Previously, we implemented the logic to create and delete PodGroups for the scheduler plugins in #538.
However, I forgot to implement the logic to update PodGroups.
This PR adds that update logic. It also cleans up the PodGroupCtrl interface so that we can control PodGroups without passing an unneeded namespace.
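For readers unfamiliar with the pattern, the create-or-update step can be sketched roughly as follows. This is an illustration only, not the code in this PR: `podGroupSpec`, `podGroupStore`, and `syncPodGroup` are hypothetical names, and an in-memory map stands in for the API server.

```go
package main

import "fmt"

// podGroupSpec is a hypothetical stand-in for the fields the controller
// derives from schedulingPolicy. It is not the scheduler-plugins type.
type podGroupSpec struct {
	MinMember int32
}

// podGroupStore is a hypothetical in-memory replacement for the API server,
// keyed by PodGroup name.
type podGroupStore map[string]podGroupSpec

// syncPodGroup creates the PodGroup if it is missing, updates it if the
// stored spec differs from the desired one, and otherwise leaves it alone.
// It returns the action taken.
func syncPodGroup(store podGroupStore, name string, desired podGroupSpec) string {
	existing, ok := store[name]
	switch {
	case !ok:
		store[name] = desired
		return "created"
	case existing != desired:
		store[name] = desired
		return "updated"
	default:
		return "unchanged"
	}
}

func main() {
	store := podGroupStore{}
	fmt.Println(syncPodGroup(store, "mpijob-a", podGroupSpec{MinMember: 3})) // created
	fmt.Println(syncPodGroup(store, "mpijob-a", podGroupSpec{MinMember: 5})) // updated
	fmt.Println(syncPodGroup(store, "mpijob-a", podGroupSpec{MinMember: 5})) // unchanged
}
```

Comparing the desired spec against the stored one before writing keeps the reconcile loop idempotent: repeated syncs with the same schedulingPolicy issue no further updates.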