Hi,
In the function "deletePodsAndServices", we clean up services by calling DeleteService for every pod, including the worker pods. Worker pods have no service, so it would be better to skip the delete call for these nonexistent worker services.
func (pc *PyTorchController) deletePodsAndServices(job *pyv1.PyTorchJob, pods []*v1.Pod) error {
	if len(pods) == 0 {
		return nil
	}
	// Delete nothing when the cleanPodPolicy is None or Running.
	if *job.Spec.CleanPodPolicy == common.CleanPodPolicyNone ||
		*job.Spec.CleanPodPolicy == common.CleanPodPolicyRunning {
		return nil
	}
	for _, pod := range pods {
		if err := pc.PodControl.DeletePod(pod.Namespace, pod.Name, job); err != nil {
			return err
		}
		// Pod and service have the same name, thus the service could be deleted using pod's name.
		if err := pc.ServiceControl.DeleteService(pod.Namespace, pod.Name, job); err != nil {
			return err
		}
	}
	return nil
}
reference:
https://github.com/kubeflow/pytorch-operator/issues/228