h-w-chen opened this issue 2 years ago
Assigned to Hongwei as the owner of DaemonSet.
Can this be resolved by marking the TP master as not schedulable?
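At least in upstream Kubernetes, that likely needs a taint rather than a cordon: the DaemonSet controller automatically adds a toleration for `node.kubernetes.io/unschedulable:NoSchedule`, so DaemonSet pods still land on cordoned nodes. Below is a minimal client-go sketch that taints the TP master instead, assuming upstream v0.18+ call signatures; the node name, kubeconfig path, and taint key are all hypothetical:

```go
package main

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client against the TP cluster (kubeconfig path is hypothetical).
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/tp.kubeconfig")
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Fetch the TP master node (node name is hypothetical).
	node, err := client.CoreV1().Nodes().Get(context.TODO(), "tp-master", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}

	// Add a NoSchedule taint that DaemonSet pods do not tolerate by default.
	node.Spec.Taints = append(node.Spec.Taints, corev1.Taint{
		Key:    "scale-out.example.com/tp-master", // hypothetical taint key
		Effect: corev1.TaintEffectNoSchedule,
	})
	if _, err := client.CoreV1().Nodes().Update(context.TODO(), node, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}
}
```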
This is caused by the way the TP KCM is started twice during kube-up.
Discussed in the team meeting: this bug is not a blocker, so the fix will be deferred.
This is the code by which the TP master ended up in the scheduler's node cache, in cmd/kube-scheduler/app/options/options.go:
```go
// if the resource provider kubeconfig is not set, default to the local cluster
if c.ComponentConfig.ResourceProviderKubeConfig == "" {
    klog.V(2).Infof("ResourceProvider kubeConfig is not set. default to local cluster client")
    c.NodeInformers = make(map[string]coreinformers.NodeInformer, 1)
    c.NodeInformers["tp"] = c.InformerFactory.Core().V1().Nodes()
} else {
    kubeConfigFiles, existed := genutils.ParseKubeConfigFiles(c.ComponentConfig.ResourceProviderKubeConfig)
    // TODO: once the perf test env setup is improved so the order of TP, RP cluster is not required,
    // rewrite the IF block
    if !existed {
        klog.Warningf("ResourceProvider kubeConfig is not valid, default to local cluster kubeconfig file")
        c.NodeInformers = make(map[string]coreinformers.NodeInformer, 1)
        c.NodeInformers["rp0"] = c.InformerFactory.Core().V1().Nodes()
    } else {
        c.ResourceProviderClients = make(map[string]clientset.Interface, len(kubeConfigFiles))
        c.NodeInformers = make(map[string]coreinformers.NodeInformer, len(kubeConfigFiles))
        for i, kubeConfigFile := range kubeConfigFiles {
            rpId := "rp" + strconv.Itoa(i)
            c.ResourceProviderClients[rpId], err = clientutil.CreateClientFromKubeconfigFile(kubeConfigFile, "kube-scheduler")
            if err != nil {
                klog.Errorf("failed to create resource provider rest client, error: %v", err)
                return nil, err
            }
            resourceInformerFactory := informers.NewSharedInformerFactory(c.ResourceProviderClients[rpId], 0)
            c.NodeInformers[rpId] = resourceInformerFactory.Core().V1().Nodes()
            klog.V(2).Infof("Created the node informer %p from resourceProvider kubeConfig %d %s",
                c.NodeInformers[rpId].Informer(), i, kubeConfigFile)
        }
    }
}
```
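Both fallback branches (unset kubeconfig and invalid kubeconfig) register the local cluster's node informer, which is how the TP master's node object reaches the scheduler cache. A minimal sketch of one possible guard, assuming a hypothetical `RequireResourceProvider` field that is not part of the current ComponentConfig:

```go
// Hypothetical guard: refuse the silent fallback to the local (TP) cluster
// when the scheduler must only place pods onto RP nodes.
if c.ComponentConfig.ResourceProviderKubeConfig == "" {
    if c.ComponentConfig.RequireResourceProvider { // assumed field, not in current code
        return nil, fmt.Errorf("scale-out mode requires a resource provider kubeconfig")
    }
    klog.V(2).Infof("ResourceProvider kubeConfig is not set. default to local cluster client")
    c.NodeInformers = make(map[string]coreinformers.NodeInformer, 1)
    c.NodeInformers["tp"] = c.InformerFactory.Core().V1().Nodes()
}
```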
What happened: In a kube-up scale-out 1 TP x 1 RP x 1 worker environment, when a daemonset is created, the cluster creates a pod for the TP master in addition to the expected pods for the RP master and the worker.
What you expected to happen: Only pods for the RP master and the RP worker are created.
How to reproduce it (as minimally and precisely as possible):
1. Run the kube-up script (of poc-2022-01-30) to start a 1x1x1 scale-out cluster (besides the regular env vars needed for a successful kube-up run).
2. Create a daemonset, then run a kubectl command to display the named pods (see the sketch below for an equivalent check).
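As a client-go sketch of that verification step (v0.18+ signatures assumed; the namespace, label selector, and kubeconfig path are hypothetical), this prints each DaemonSet pod with the node it landed on, which is where the extra TP-master pod shows up:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/tp.kubeconfig") // hypothetical path
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// List the daemonset's pods by a hypothetical label selector and print
	// the node each pod was scheduled to.
	pods, err := client.CoreV1().Pods("default").List(context.TODO(),
		metav1.ListOptions{LabelSelector: "app=test-ds"})
	if err != nil {
		panic(err)
	}
	for _, p := range pods.Items {
		fmt.Printf("%s -> %s\n", p.Name, p.Spec.NodeName)
	}
}
```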
Anything else we need to know?: At the TP master, kube-controller-manager.log recorded the creation of these pods, including the extraneous one for the TP master.
Environment:
- Kubernetes version (use `kubectl version`):
- OS (e.g: `cat /etc/os-release`): Container-Optimized OS from Google (TP master)
- Kernel (e.g. `uname -a`): 5.10.68+