pingcap / tidb-operator

TiDB operator creates and manages TiDB clusters running in Kubernetes.
https://docs.pingcap.com/tidb-in-kubernetes/
Apache License 2.0

Tidb-scheduler scheduling error when based on zone #2638

Closed (PengJi closed 4 years ago)

PengJi commented 4 years ago

Bug Report

What version of Kubernetes are you using?

Client Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.6-beta.0", GitCommit:"e7f962ba86f4ce7033828210ca3556393c377bcc", GitTreeState:"clean", BuildDate:"2020-01-15T08:26:26Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.9-49+bb32c46d83322c", GitCommit:"bb32c46d83322cb1c90306f7170444575d43efe1", GitTreeState:"clean", BuildDate:"2020-05-08T03:44:59Z", GoVersion:"go1.13.7", Compiler:"gc", Platform:"linux/amd64"}

What version of TiDB Operator are you using? 1.1

What did you do?

I created a TidbCluster (tc) whose pods are scheduled by zone (topologyKey=zone). There are 3 zones and 3 replicas, so the pods should be spread across all 3 zones.

What did you expect to see? Pods deployed across three zones, as follows:

inf-blade-zebra-tikv-0  xr-hulk-k8s-ep-test206
inf-blade-zebra-tikv-1  gh-hulk-k8s-ep-test465
inf-blade-zebra-tikv-2  zf-hulk-k8s-ep-test400

xr-hulk-k8s-ep-test206 is in zone xr, gh-hulk-k8s-ep-test465 is in zone gh, and zf-hulk-k8s-ep-test400 is in zone zf.

What did you see instead? Pods deployed across only two zones, as follows:

inf-blade-zebra-tikv-0  xr-hulk-k8s-ep-test206
inf-blade-zebra-tikv-1  gh-hulk-k8s-ep-test465
inf-blade-zebra-tikv-2  gh-hulk-k8s-ep-test400

gh-hulk-k8s-ep-test465 and gh-hulk-k8s-ep-test400 are both in zone gh, and xr-hulk-k8s-ep-test206 is in zone xr.

For more information, see the documentation: https://shimo.im/docs/T36g3WghYTTtKDkX

cofyc commented 4 years ago

Maybe there is a race in the code. Can you investigate it?

PengJi commented 4 years ago

@cofyc This error occurs mainly in the following case. Suppose zone1 has two hosts, host1 and host2, and one pod has already been scheduled to host1. When another pod is scheduled, the nodes parameter passed to the scheduler extender's func (h *ha) Filter(instanceName string, pod *apiv1.Pod, nodes []apiv1.Node) ([]apiv1.Node, error) may not contain host1. Under the current algorithm, the scheduler then concludes that zone1 has no scheduled pods, so it schedules the new pod to zone1 as well, which results in the scheduling error.