```go
func (c *controller) PatrollerDo() {
	// step1: get pList
	pList, err := c.podLister.List(util.GetSuperClusterListerLabelsSelector())
	if err != nil {
		// ...
	}
	for _, cluster := range clusterNames {
		// step2: get vList
		vList := &corev1.PodList{}
		if err := c.MultiClusterController.List(cluster, vList); err != nil {
			// ...
		}
	}
}
```
Here, pList and vList are read from two caches in two separate steps, at different points in time. As the number of clusters grows, the gap between the two snapshots widens. A pPod created while the Checker is running may therefore be missing from pList even though its vPod already exists and is bound. This triggers the pod force-delete logic.
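The effect can be simulated with plain maps and slices (a toy model, not the real syncer types): any vPod that appears in the per-cluster vList but was created after the pList snapshot looks like an orphan to the checker.

```go
package main

import "fmt"

// missingFromSnapshot returns the vPods the checker would treat as orphans:
// present in vList but absent from the earlier pList snapshot. The data
// structures are illustrative stand-ins for the syncer's caches.
func missingFromSnapshot(pList map[string]bool, vList []string) []string {
	var orphans []string
	for _, v := range vList {
		if !pList[v] {
			orphans = append(orphans, v)
		}
	}
	return orphans
}

func main() {
	// Step 1: pPods snapshotted from the super-cluster lister at t0.
	pList := map[string]bool{"vc-1/pod-a": true}

	// Step 2: vPods listed per cluster at t1 > t0; vc-2/pod-b was created
	// and bound in between, so its pPod is not in the snapshot.
	vList := []string{"vc-1/pod-a", "vc-2/pod-b"}

	fmt.Println(missingFromSnapshot(pList, vList)) // the false "orphan"
}
```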
```go
// pPod not found and vPod still exists, the pPod may be deleted manually or by controller pod eviction.
// If the vPod has not been bound yet, we can create pPod again.
// If the vPod has been bound, we'd better delete the vPod since the new pPod may have a different nodename.
if isPodScheduled(vPod) {
	c.forceDeleteVPod(vObj.GetOwnerCluster(), vPod, false)
	return
}
```
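One way to close the window, sketched here with an illustrative decision function rather than the real syncer API, is to confirm against the live API server before force-deleting: a miss in the cache snapshot alone is not enough.

```go
package main

import "fmt"

// shouldForceDeleteVPod sketches the proposed double check: a bound vPod
// whose pPod is missing from the checker's cache snapshot is only deleted
// after the live API server also reports the pPod absent. The booleans are
// stand-ins for the lister lookup and a direct, uncached Get.
func shouldForceDeleteVPod(vPodBound, pPodInCache, pPodInAPIServer bool) bool {
	if !vPodBound || pPodInCache {
		// Unbound vPods are recreated instead; a cache hit means no orphan.
		return false
	}
	// Cache says the pPod is gone; re-check the live store before deleting.
	return !pPodInAPIServer
}

func main() {
	// Race case: the pPod was created after the cache snapshot was taken.
	fmt.Println(shouldForceDeleteVPod(true, false, true)) // skip the delete

	// The pPod is genuinely gone (manual deletion or eviction).
	fmt.Println(shouldForceDeleteVPod(true, false, false)) // safe to force delete
}
```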
Meanwhile, the Pod DWS syncer will also trigger deletion of the pPod.
What steps did you take and what happened:
When there are a large number of VCs in the cluster, vPods are unexpectedly deleted by the Pod Checker. The logic that triggers the deletion is in https://github.com/kubernetes-sigs/cluster-api-provider-nested/blob/main/virtualcluster/pkg/syncer/resources/pod/checker.go#L313, quoted above.

What did you expect to happen:
When a vPod is found to be bound but its pPod is missing, the Checker should double-check whether the pPod really exists before force-deleting the vPod.
/kind bug