Closed: kirinse closed this issue 5 years ago.
@kirinse Thank you for your attention.
We added an extended scheduler to K8s; I need to check its logs.
Could you please provide these outputs:
1. The PV list:
kubectl get pv
2. The tidb-scheduler logs:
kubectl get po -n tidb-admin
NAME READY STATUS RESTARTS AGE
tidb-controller-manager-bcc66f746-t4tsq 1/1 Running 0 1h
tidb-scheduler-5b85b688c6-wrvbg 2/2 Running 0 1h
kubectl logs -f -n tidb-admin tidb-scheduler-5b85b688c6-wrvbg -c tidb-scheduler
(Replace tidb-scheduler-5b85b688c6-wrvbg with your real pod name.)
@weekface thank you for your quick response.
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
local-pv-1767facf 238476Gi RWO Delete Available local-storage 36m
local-pv-1dbd65bc 238476Gi RWO Delete Available local-storage 36m
local-pv-37c274de 238476Gi RWO Retain Bound tidb/pd-demo-pd-1 local-storage 36m
local-pv-45324aa3 238476Gi RWO Retain Bound tidb/tikv-demo-tikv-1 local-storage 36m
local-pv-62446ab1 238476Gi RWO Delete Available local-storage 36m
local-pv-67e0e52d 238476Gi RWO Delete Available local-storage 36m
local-pv-7e1a02ed 238476Gi RWO Delete Available local-storage 36m
local-pv-820ea0a0 238476Gi RWO Retain Bound tidb/pd-demo-pd-2 local-storage 36m
local-pv-8a0a2eb0 238476Gi RWO Delete Available local-storage 36m
local-pv-9371219a 238476Gi RWO Retain Bound tidb/pd-demo-pd-0 local-storage 36m
local-pv-bf3146fc 238476Gi RWO Delete Available local-storage 36m
local-pv-c4277489 238476Gi RWO Delete Available local-storage 36m
local-pv-cfa833c6 238476Gi RWO Retain Bound tidb/tikv-demo-tikv-2 local-storage 36m
local-pv-f1f39fe7 238476Gi RWO Retain Bound tidb/tikv-demo-tikv-0 local-storage 36m
local-pv-f2cc9d77 238476Gi RWO Delete Available local-storage 36m
kubectl logs -f -n tidb-admin tidb-scheduler-5b85b688c6-wrvbg -c tidb-scheduler
I1107 09:29:33.750286 1 version.go:37] Welcome to TiDB Operator.
I1107 09:29:33.750406 1 version.go:38] Git Commit Hash: b779ae6f111f341802a85b4be2d524b7ed605331
I1107 09:29:33.750411 1 version.go:39] UTC Build Time: 2018-11-06 04:04:11
I1107 09:29:33.752248 1 mux.go:60] start scheduler extender server, listening on 0.0.0.0:10262
I1107 09:29:50.999869 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
E1107 09:29:51.082799 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:29:51.086261 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
E1107 09:29:51.091025 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:29:52.089737 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
E1107 09:29:52.177854 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:29:52.181548 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
E1107 09:29:52.198180 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:29:54.186310 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
E1107 09:29:54.276168 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:29:54.279738 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
E1107 09:29:54.284608 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:29:58.281417 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
E1107 09:29:58.376853 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:29:58.382925 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
E1107 09:29:58.386798 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:30:06.384973 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
E1107 09:30:06.479468 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:30:06.484521 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
E1107 09:30:06.489510 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:30:22.486357 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
E1107 09:30:22.490099 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:30:22.576954 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
E1107 09:30:22.580687 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:30:54.497027 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
E1107 09:30:54.500387 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:30:54.585628 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
E1107 09:30:54.590169 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:31:54.679235 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
E1107 09:31:54.686643 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:31:54.691308 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
E1107 09:31:54.695033 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:31:55.693478 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
E1107 09:31:55.779798 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:31:55.785466 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
E1107 09:31:55.791430 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:31:57.786711 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
E1107 09:31:57.878974 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 09:31:57.884592 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
E1107 09:31:57.900293 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
Sorry, could you also provide these outputs:
kubectl get no
kubectl get po -n tidb -o wide
kubectl get no
NAME STATUS ROLES AGE VERSION
kube-master Ready master 1h v1.10.5
kube-node-1 Ready <none> 1h v1.10.5
kube-node-2 Ready <none> 1h v1.10.5
kube-node-3 Ready <none> 1h v1.10.5
kubectl get po -n tidb -o wide
NAME READY STATUS RESTARTS AGE IP NODE
demo-monitor-5bc85fdb7f-n4vj7 2/2 Running 4 1h 10.244.2.11 kube-node-2
demo-monitor-configurator-sn5hb 0/1 Completed 1 1h 10.244.1.6 kube-node-3
demo-pd-0 1/1 Running 10 1h 10.244.3.20 kube-node-1
demo-pd-1 1/1 Running 10 1h 10.244.1.14 kube-node-3
demo-pd-2 0/1 Pending 0 1h <none> <none>
demo-tidb-0 0/1 Running 16 1h 10.244.1.16 kube-node-3
demo-tidb-1 0/1 CrashLoopBackOff 16 1h 10.244.3.17 kube-node-1
demo-tikv-0 1/2 CrashLoopBackOff 20 1h 10.244.2.14 kube-node-2
demo-tikv-1 1/2 CrashLoopBackOff 20 1h 10.244.3.18 kube-node-1
demo-tikv-2 0/2 Pending 0 1h <none> <none>
Oh, the default log level is 2. Please raise the log level to 4 and capture the tidb-scheduler logs again:
kubectl edit deploy -n pingcap tidb-scheduler
...
containers:
- command:
- /usr/local/bin/tidb-scheduler
- -v=2
- -port=10262
image: localhost:5000/pingcap/tidb-operator:latest
imagePullPolicy: Always
name: tidb-scheduler
...
Then change - -v=2 to - -v=4.
Could you also provide the output of kubectl get pv?
@weekface
kubectl logs -f -n tidb-admin tidb-scheduler-68c6b47498-cqd8w -c tidb-scheduler
I1107 10:35:01.588200 1 version.go:37] Welcome to TiDB Operator.
I1107 10:35:01.588390 1 version.go:38] Git Commit Hash: b779ae6f111f341802a85b4be2d524b7ed605331
I1107 10:35:01.588397 1 version.go:39] UTC Build Time: 2018-11-06 04:04:11
I1107 10:35:01.589756 1 mux.go:60] start scheduler extender server, listening on 0.0.0.0:10262
I1107 10:36:02.735488 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:36:02.735542 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:36:02.755197 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:36:02.757923 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:36:02.757959 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:36:02.761698 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:36:03.833986 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:36:03.834059 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:36:03.839983 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:36:03.845109 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:36:03.845149 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:36:03.848358 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:36:05.848968 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:36:05.849026 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:36:05.853956 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:36:05.857275 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:36:05.857314 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:36:05.934393 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:36:09.859208 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:36:09.859252 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:36:09.864846 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:36:09.941271 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:36:09.941322 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:36:09.948397 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:36:17.872355 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:36:17.872395 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:36:17.877219 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:36:17.956285 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:36:17.956336 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:36:17.960351 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:36:33.883130 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:36:33.883193 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:36:33.934535 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:36:33.967279 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:36:33.967338 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:36:33.971626 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:37:05.940401 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:37:05.940443 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:37:05.943939 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:37:06.035090 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:37:06.035164 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:37:06.042428 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:38:05.951983 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:38:05.952028 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:38:05.956685 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:38:06.051733 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:38:06.051785 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:38:06.056060 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:38:06.961878 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:38:06.961921 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:38:06.968359 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:38:07.062084 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:38:07.062135 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:38:07.066808 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:38:08.974982 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:38:08.975033 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:38:08.979332 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:38:09.071876 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:38:09.071921 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:38:09.077528 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:38:12.985025 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:38:12.985071 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:38:12.989279 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:38:13.083948 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:38:13.084008 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:38:13.091092 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:38:20.994732 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:38:20.994775 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:38:21.037100 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:38:21.099360 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:38:21.099411 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:38:21.103540 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:38:37.043035 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:38:37.043079 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:38:37.047936 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:38:37.138102 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:38:37.138148 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:38:37.145989 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:39:09.055704 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:39:09.055747 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:39:09.060804 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:39:09.153606 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:39:09.153660 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:39:09.156774 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:40:09.068262 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:40:09.068314 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:40:09.141687 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:40:09.163422 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:40:09.163473 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:40:09.167276 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:40:10.147831 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:40:10.147875 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:40:10.153196 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:40:10.245557 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:40:10.245598 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:40:10.248539 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:40:12.247602 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:40:12.247655 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:40:12.253328 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:40:12.257506 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:40:12.257553 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:40:12.345764 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:40:16.259222 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:40:16.259313 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:40:16.341097 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:40:16.351384 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:40:16.351430 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:40:16.354883 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:40:24.347358 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:40:24.347404 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:40:24.351971 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:40:24.441833 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:40:24.441899 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:40:24.449016 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:40:40.356755 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:40:40.356800 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:40:40.360186 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:40:40.456533 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:40:40.456654 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:40:40.462478 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:41:12.365479 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:41:12.365543 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:41:12.441869 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:41:12.467554 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:41:12.467644 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:41:12.471118 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:42:12.449723 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:42:12.449803 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:42:12.544695 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:42:12.555344 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:42:12.555392 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:42:12.561499 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:42:13.549852 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:42:13.549895 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:42:13.643538 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:42:13.650638 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:42:13.650798 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:42:13.656557 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:42:15.647575 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:42:15.647619 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:42:15.651273 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:42:15.743731 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:42:15.743788 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:42:15.748409 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:42:19.748460 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:42:19.748548 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:42:19.754403 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:42:19.844387 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:42:19.844459 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:42:19.849864 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:42:27.844806 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:42:27.844854 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:42:27.849480 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:42:27.856960 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:42:27.856975 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:42:27.860572 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:42:43.855968 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:42:43.856112 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:42:43.861569 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:42:43.868215 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:42:43.868313 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:42:43.947077 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:43:15.869805 1 scheduler.go:76] scheduling pod: tidb/demo-tikv-2
I1107 10:43:15.869913 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-2]
E1107 10:43:15.946481 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
I1107 10:43:15.954218 1 scheduler.go:76] scheduling pod: tidb/demo-pd-2
I1107 10:43:15.954262 1 scheduler.go:79] entering predicate: HighAvailability, nodes: [kube-node-1]
E1107 10:43:15.959137 1 mux.go:106] unable to filter nodes: the first 3 pods can't be scheduled to the same node
@tennix
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
local-pv-1767facf 238476Gi RWO Delete Available local-storage 2h
local-pv-1dbd65bc 238476Gi RWO Delete Available local-storage 2h
local-pv-37c274de 238476Gi RWO Retain Bound tidb/pd-demo-pd-1 local-storage 2h
local-pv-45324aa3 238476Gi RWO Retain Bound tidb/tikv-demo-tikv-1 local-storage 2h
local-pv-62446ab1 238476Gi RWO Delete Available local-storage 2h
local-pv-67e0e52d 238476Gi RWO Delete Available local-storage 2h
local-pv-7e1a02ed 238476Gi RWO Delete Available local-storage 2h
local-pv-820ea0a0 238476Gi RWO Retain Bound tidb/pd-demo-pd-2 local-storage 2h
local-pv-8a0a2eb0 238476Gi RWO Delete Available local-storage 2h
local-pv-9371219a 238476Gi RWO Retain Bound tidb/pd-demo-pd-0 local-storage 2h
local-pv-bf3146fc 238476Gi RWO Delete Available local-storage 2h
local-pv-c4277489 238476Gi RWO Delete Available local-storage 2h
local-pv-cfa833c6 238476Gi RWO Retain Bound tidb/tikv-demo-tikv-2 local-storage 2h
local-pv-f1f39fe7 238476Gi RWO Retain Bound tidb/tikv-demo-tikv-0 local-storage 2h
local-pv-f2cc9d77 238476Gi RWO Delete Available local-storage 2h
The logs indicate that K8s is trying to schedule the tidb/demo-pd-2 Pod to kube-node-1, but the tidb/demo-pd-0 Pod has already been scheduled to kube-node-1.
TiDB Operator doesn't allow the first 3 pods to be scheduled to the same node. For example, pd-0 and pd-2 can't both be scheduled to the kube-node-1 node.
Yes, this is for data safety. If two PD/TiKV pods are scheduled on the same node and that node goes down, TiDB is out of service because two replicas are lost.
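To make the rule concrete, here is a minimal Python sketch of what the HighAvailability predicate enforces. This is illustrative only, not the operator's actual Go code; the function name and signature are made up:

```python
def ha_filter_nodes(replicas, candidate_nodes, nodes_in_use):
    """Sketch of the HighAvailability rule for the first `replicas` pods
    of a PD/TiKV StatefulSet: reject any candidate node that already
    hosts one of those pods, so losing one node can never take out two
    replicas at once."""
    allowed = [n for n in candidate_nodes if n not in nodes_in_use]
    if not allowed:
        # This is the situation in the logs above: every candidate node
        # already hosts a replica, so scheduling fails.
        raise RuntimeError(
            "unable to filter nodes: the first %d pods can't be "
            "scheduled to the same node" % replicas)
    return allowed
```

For instance, with pd-0 on kube-node-1 and pd-1 on kube-node-3, pd-2 would pass the predicate on kube-node-2 but be rejected on kube-node-1.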
> The logs indicate that K8s is trying to schedule the tidb/demo-pd-2 Pod to kube-node-1, but the tidb/demo-pd-0 Pod has been scheduled to kube-node-1. TiDB Operator doesn't allow the first 3 pods to be scheduled to the same node.
OK, I just cloned this repo and didn't change anything. How do I fix it?
This may be an issue with our recently added scheduler. If the scheduler worked correctly, the third PD pod's PV would not be bound on the same node as a previous PD pod's. We'll diagnose it further.
@tennix thanks. Also, I hit another problem: at manifests/local-dind/dind-cluster-v1.10.sh line 1847, helm returned an error:
Error: unknown flag: --template
My helm client version is
Client: &version.Version{SemVer:"v2.8.2", GitCommit:"a80231648a1473929271764b920a8e346f6de844", GitTreeState:"clean"}
You need to upgrade your helm client. v2.8.2 doesn't support the --template flag.
@tennix OK, then you guys need to update local-dind-tutorial.md :)
Oh, yeah. The document only requires v2.8.2 or later. Could you help us update the document? Thanks!
Someone reported a bug to K8s: https://github.com/kubernetes/kubernetes/issues/65131, much like this issue.
And this is the fix PR: https://github.com/kubernetes/kubernetes/pull/67556, fixed in v1.12.
This causes issues when trying to evaluate future pods with pod affinity/anti-affinity because the pod has not been assumed while the volumes have been decided.
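That failure mode can be simulated in a few lines of Python (hypothetical names, not real scheduler code): the local PV is bound before the pod is placed, volume filtering pins the pod to the PV's node, and the HA rule then has nothing left to accept.

```python
def volume_filter(nodes, pv_node):
    # A local-volume PV is node-affine: only the node that physically
    # holds the disk survives volume filtering.
    return [n for n in nodes if n == pv_node]

def ha_filter(nodes, nodes_in_use):
    # The operator's HA rule: drop nodes already hosting a replica.
    return [n for n in nodes if n not in nodes_in_use]

all_nodes = ["kube-node-1", "kube-node-2", "kube-node-3"]
# pd-demo-pd-2's PV was (wrongly) bound on kube-node-1, where pd-0 runs:
candidates = volume_filter(all_nodes, pv_node="kube-node-1")
candidates = ha_filter(candidates, nodes_in_use={"kube-node-1", "kube-node-3"})
print(candidates)  # [] -> "unable to filter nodes"; the pod stays Pending
```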
@kirinse @tennix
> Oh, yeah. The document only requires v2.8.2 or later. Could you help us update the document? Thanks!
Sent PR #175.
Great, thank you for your contribution!
After fetching the updates, I now can't see any demo-tikv/demo-tidb pods.
kubectl get no
NAME STATUS ROLES AGE VERSION
kube-master Ready master 3h v1.10.5
kube-node-1 Ready <none> 3h v1.10.5
kube-node-2 Ready <none> 3h v1.10.5
kube-node-3 Ready <none> 3h v1.10.5
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
local-pv-1767facf 238476Gi RWO Retain Bound tidb/pd-demo-pd-0 local-storage 3h
local-pv-1dbd65bc 238476Gi RWO Delete Available local-storage 3h
local-pv-37c274de 238476Gi RWO Delete Available local-storage 3h
local-pv-45324aa3 238476Gi RWO Delete Available local-storage 3h
local-pv-62446ab1 238476Gi RWO Delete Available local-storage 3h
local-pv-67e0e52d 238476Gi RWO Delete Available local-storage 3h
local-pv-7e1a02ed 238476Gi RWO Delete Available local-storage 3h
local-pv-820ea0a0 238476Gi RWO Delete Available local-storage 3h
local-pv-8a0a2eb0 238476Gi RWO Delete Available local-storage 3h
local-pv-9371219a 238476Gi RWO Delete Available local-storage 3h
local-pv-bf3146fc 238476Gi RWO Delete Available local-storage 3h
local-pv-c4277489 238476Gi RWO Delete Available local-storage 3h
local-pv-cfa833c6 238476Gi RWO Delete Available local-storage 3h
local-pv-f1f39fe7 238476Gi RWO Delete Available local-storage 3h
local-pv-f2cc9d77 238476Gi RWO Retain Bound tidb/pd-demo-pd-1 local-storage 3h
kubectl logs -f -n tidb-admin tidb-scheduler-5b85b688c6-dwpc8 -c tidb-scheduler
I1109 04:24:10.884319 1 version.go:37] Welcome to TiDB Operator.
I1109 04:24:10.885626 1 version.go:38] Git Commit Hash: c5a835d545856ebfa853a291d95b5d1cd99ab8b9
I1109 04:24:10.885662 1 version.go:39] UTC Build Time: 2018-11-09 03:43:33
I1109 04:24:10.937456 1 mux.go:60] start scheduler extender server, listening on 0.0.0.0:10262
I1109 04:39:26.632577 1 scheduler.go:76] scheduling pod: tidb/demo-pd-0
I1109 04:39:27.692360 1 scheduler.go:76] scheduling pod: tidb/demo-pd-0
I1109 04:40:12.290369 1 scheduler.go:76] scheduling pod: tidb/demo-pd-1
I1109 04:40:13.346263 1 scheduler.go:76] scheduling pod: tidb/demo-pd-1
kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-kube-master 1/1 Running 0 3h
kube-system kube-apiserver-kube-master 1/1 Running 0 3h
kube-system kube-controller-manager-kube-master 1/1 Running 0 3h
kube-system kube-dns-64d6979467-6sv55 2/3 CrashLoopBackOff 7 3h
kube-system kube-flannel-ds-amd64-7827d 1/1 Running 0 3h
kube-system kube-flannel-ds-amd64-8nggc 1/1 Running 0 3h
kube-system kube-flannel-ds-amd64-8qtpr 1/1 Running 0 3h
kube-system kube-flannel-ds-amd64-mqwxg 1/1 Running 0 3h
kube-system kube-proxy-2vz85 1/1 Running 0 3h
kube-system kube-proxy-hd4cx 1/1 Running 0 3h
kube-system kube-proxy-mhq8t 1/1 Running 0 3h
kube-system kube-proxy-v5jzp 1/1 Running 0 3h
kube-system kube-scheduler-kube-master 1/1 Running 0 3h
kube-system kubernetes-dashboard-68ddc89549-hstqk 1/1 Running 0 3h
kube-system local-volume-provisioner-n88lh 1/1 Running 0 3h
kube-system local-volume-provisioner-p7xdd 1/1 Running 0 3h
kube-system local-volume-provisioner-pjb2l 1/1 Running 0 3h
kube-system registry-proxy-966lw 1/1 Running 0 3h
kube-system registry-proxy-cll4r 1/1 Running 0 3h
kube-system registry-proxy-dh524 1/1 Running 0 3h
kube-system registry-proxy-tn8j7 1/1 Running 0 3h
kube-system tiller-deploy-6fd8d857bc-g2k4h 1/1 Running 0 3h
tidb-admin tidb-controller-manager-bcc66f746-p5sth 1/1 Running 0 30m
tidb-admin tidb-scheduler-5b85b688c6-dwpc8 2/2 Running 0 30m
tidb demo-monitor-644d69f7db-z7pv4 2/2 Running 0 14m
tidb demo-monitor-configurator-xbcjv 0/1 Completed 0 14m
tidb demo-pd-0 1/1 Running 0 14m
tidb demo-pd-1 1/1 Running 4 13m
tidb demo-tidb-initializer-npjw4 1/1 Running 0 14m
We're investigating this issue, which may be a bug in Kubernetes itself. From what you provided, demo-pd-1 is already scheduled; it just failed. Could you provide the log of demo-pd-1 with kubectl logs -n tidb demo-pd-1?
@tennix
kubectl logs -n tidb demo-pd-1
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc failed
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc failed
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc failed
Name: demo-pd-1.demo-pd-peer.tidb.svc
Address 1: 10.244.1.6 demo-pd-1.demo-pd-peer.tidb.svc.cluster.local
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc success
pd cluster is not ready now: demo-pd.tidb.svc
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc failed
Name: demo-pd-1.demo-pd-peer.tidb.svc
Address 1: 10.244.1.6 demo-pd-1.demo-pd-peer.tidb.svc.cluster.local
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc success
pd cluster is not ready now: demo-pd.tidb.svc
Name: demo-pd-1.demo-pd-peer.tidb.svc
Address 1: 10.244.1.6 demo-pd-1.demo-pd-peer.tidb.svc.cluster.local
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc success
pd cluster is not ready now: demo-pd.tidb.svc
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc failed
Name: demo-pd-1.demo-pd-peer.tidb.svc
Address 1: 10.244.1.6 demo-pd-1.demo-pd-peer.tidb.svc.cluster.local
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc success
pd cluster is not ready now: demo-pd.tidb.svc
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc failed
Name: demo-pd-1.demo-pd-peer.tidb.svc
Address 1: 10.244.1.6 demo-pd-1.demo-pd-peer.tidb.svc.cluster.local
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc success
pd cluster is not ready now: demo-pd.tidb.svc
Name: demo-pd-1.demo-pd-peer.tidb.svc
Address 1: 10.244.1.6 demo-pd-1.demo-pd-peer.tidb.svc.cluster.local
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc success
pd cluster is not ready now: demo-pd.tidb.svc
Name: demo-pd-1.demo-pd-peer.tidb.svc
Address 1: 10.244.1.6 demo-pd-1.demo-pd-peer.tidb.svc.cluster.local
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc success
pd cluster is not ready now: demo-pd.tidb.svc
Name: demo-pd-1.demo-pd-peer.tidb.svc
Address 1: 10.244.1.6 demo-pd-1.demo-pd-peer.tidb.svc.cluster.local
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc success
pd cluster is not ready now: demo-pd.tidb.svc
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc failed
Name: demo-pd-1.demo-pd-peer.tidb.svc
Address 1: 10.244.1.6 demo-pd-1.demo-pd-peer.tidb.svc.cluster.local
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc success
pd cluster is not ready now: demo-pd.tidb.svc
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc failed
Name: demo-pd-1.demo-pd-peer.tidb.svc
Address 1: 10.244.1.6 demo-pd-1.demo-pd-peer.tidb.svc.cluster.local
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc success
pd cluster is not ready now: demo-pd.tidb.svc
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc failed
Name: demo-pd-1.demo-pd-peer.tidb.svc
Address 1: 10.244.1.6 demo-pd-1.demo-pd-peer.tidb.svc.cluster.local
nslookup domain demo-pd-1.demo-pd-peer.tidb.svc success
pd cluster is not ready now: demo-pd.tidb.svc
Did you just delete the previous cluster with helm delete tidb-cluster --purge
without cleaning the PVCs? You can remove the previous PVCs following the guide here, and then create a new cluster again.
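As a rough sketch of that cleanup (the namespace `tidb` and the `app.kubernetes.io/instance=demo` label selector are assumptions inferred from the pod names above; check `kubectl get pvc -n tidb --show-labels` for the real labels in your chart version):

```shell
# List the PVCs left behind by the deleted release
kubectl get pvc -n tidb

# Delete them so new pods don't bind to volumes holding stale PD/TiKV data
# (label selector is an assumption; adjust to what --show-labels reports)
kubectl delete pvc -n tidb -l app.kubernetes.io/instance=demo

# The local PVs use the Retain reclaim policy, so they stay in the
# Released state after the PVC is gone and must be cleaned up too
kubectl get pv | grep Released
```

A Released PV keeps its old data until its claim reference is cleared (or the PV is deleted and re-provisioned), which is exactly why a recreated cluster can fail to bootstrap against leftover volumes.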
Actually, I ran:
manifests/local-dind/dind-cluster-v1.10.sh stop
manifests/local-dind/dind-cluster-v1.10.sh clean
sudo rm -rf data/kube-node-*
manifests/local-dind/dind-cluster-v1.10.sh up
...
Ah, OK. What about demo-pd-0's log? The demo-pd-1 log above shows that the PD cluster is not ready, and pd-1 must wait until pd-0 is ready.
kubectl logs -n tidb demo-pd-0
nslookup domain demo-pd-0.demo-pd-peer.tidb.svc failed
nslookup domain demo-pd-0.demo-pd-peer.tidb.svc failed
nslookup domain demo-pd-0.demo-pd-peer.tidb.svc failed
Name: demo-pd-0.demo-pd-peer.tidb.svc
Address 1: 10.244.2.9 demo-pd-0.demo-pd-peer.tidb.svc.cluster.local
nslookup domain demo-pd-0.demo-pd-peer.tidb.svc success
starting pd-server ...
/pd-server --data-dir=/var/lib/pd --name=demo-pd-0 --peer-urls=http://0.0.0.0:2380 --advertise-peer-urls=http://demo-pd-0.demo-pd-peer.tidb.svc:2380 --client-urls=http://0.0.0.0:2379 --advertise-client-urls=http://demo-pd-0.demo-pd-peer.tidb.svc:2379 --config=/etc/pd/pd.toml --initial-cluster=demo-pd-0=http://demo-pd-0.demo-pd-peer.tidb.svc:2380
2018/11/09 05:03:05.821 util.go:62: [info] Welcome to Placement Driver (PD).
2018/11/09 05:03:05.821 util.go:63: [info] Release Version: v2.0.5
2018/11/09 05:03:05.821 util.go:64: [info] Git Commit Hash: b64716707b7279a4ae822be767085ff17b5f3fea
2018/11/09 05:03:05.821 util.go:65: [info] Git Branch: release-2.0
2018/11/09 05:03:05.821 util.go:66: [info] UTC Build Time: 2018-09-07 12:34:46
2018/11/09 05:03:05.821 metricutil.go:83: [info] disable Prometheus push client
2018/11/09 05:03:05.822 server.go:96: [info] PD config - Config({FlagSet:0xc00019cd20 Version:false ClientUrls:http://0.0.0.0:2379 PeerUrls:http://0.0.0.0:2380 AdvertiseClientUrls:http://demo-pd-0.demo-pd-peer.tidb.svc:2379 AdvertisePeerUrls:http://demo-pd-0.demo-pd-peer.tidb.svc:2380 Name:demo-pd-0 DataDir:/var/lib/pd InitialCluster:demo-pd-0=http://demo-pd-0.demo-pd-peer.tidb.svc:2380 InitialClusterState:new Join: LeaderLease:3 Log:{Level:info Format:text DisableTimestamp:false File:{Filename: LogRotate:true MaxSize:0 MaxDays:0 MaxBackups:0}} LogFileDeprecated: LogLevelDeprecated: TsoSaveInterval:3s Metric:{PushJob:demo-pd-0 PushAddress: PushInterval:15s} Schedule:{MaxSnapshotCount:3 MaxPendingPeerCount:16 MaxMergeRegionSize:0 SplitMergeInterval:1h0m0s PatrolRegionInterval:100ms MaxStoreDownTime:1h0m0s LeaderScheduleLimit:4 RegionScheduleLimit:4 ReplicaScheduleLimit:8 MergeScheduleLimit:8 TolerantSizeRatio:5 LowSpaceRatio:0.8 HighSpaceRatio:0.6 EnableRaftLearner:false Schedulers:[{Type:balance-region Args:[] Disable:false} {Type:balance-leader Args:[] Disable:false} {Type:hot-region Args:[] Disable:false} {Type:label Args:[] Disable:false}]} Replication:{MaxReplicas:3 LocationLabels:[zone rack host]} Namespace:map[] QuotaBackendBytes:0 AutoCompactionRetention:1 TickInterval:500ms ElectionInterval:3s Security:{CAPath: CertPath: KeyPath:} LabelProperty:map[] configFile:/etc/pd/pd.toml WarningMsgs:[] NamespaceClassifier:default nextRetryDelay:1000000000 disableStrictReconfigCheck:false heartbeatStreamBindInterval:{Duration:60000000000} leaderPriorityCheckInterval:{Duration:60000000000}})
2018/11/09 05:03:05.826 server.go:122: [info] start embed etcd
2018/11/09 05:03:05.826 log.go:86: [info] embed: [listening for peers on http://0.0.0.0:2380]
2018/11/09 05:03:05.826 log.go:86: [info] embed: [pprof is enabled under /debug/pprof]
2018/11/09 05:03:05.826 log.go:86: [info] embed: [listening for client requests on 0.0.0.0:2379]
2018/11/09 05:03:05.826 systime_mon.go:24: [info] start system time monitor
2018/11/09 05:03:05.878 log.go:86: [info] etcdserver: [name = demo-pd-0]
2018/11/09 05:03:05.878 log.go:86: [info] etcdserver: [data dir = /var/lib/pd]
2018/11/09 05:03:05.878 log.go:86: [info] etcdserver: [member dir = /var/lib/pd/member]
2018/11/09 05:03:05.878 log.go:86: [info] etcdserver: [heartbeat = 500ms]
2018/11/09 05:03:05.878 log.go:86: [info] etcdserver: [election = 3000ms]
2018/11/09 05:03:05.878 log.go:86: [info] etcdserver: [snapshot count = 100000]
2018/11/09 05:03:05.878 log.go:86: [info] etcdserver: [advertise client URLs = http://demo-pd-0.demo-pd-peer.tidb.svc:2379]
2018/11/09 05:03:05.886 log.go:86: [info] etcdserver: [restarting member b798afebf07ff3aa in cluster 2bcb211fc6dc76bd at commit index 43]
2018/11/09 05:03:05.886 log.go:86: [info] raft: [b798afebf07ff3aa became follower at term 282]
2018/11/09 05:03:05.886 log.go:86: [info] raft: [newRaft b798afebf07ff3aa [peers: [], term: 282, commit: 43, applied: 0, lastindex: 44, lastterm: 2]]
2018/11/09 05:03:05.889 log.go:82: [warning] auth: [simple token is not cryptographically signed]
2018/11/09 05:03:05.891 log.go:86: [info] etcdserver: [starting server... [version: 3.2.18, cluster version: to_be_decided]]
2018/11/09 05:03:05.892 log.go:86: [info] etcdserver/membership: [added member b798afebf07ff3aa [http://demo-pd-0.demo-pd-peer.tidb.svc:2380] to cluster 2bcb211fc6dc76bd]
2018/11/09 05:03:05.892 log.go:84: [info] etcdserver/membership: [set the initial cluster version to 3.2]
2018/11/09 05:03:05.892 log.go:86: [info] etcdserver/api: [enabled capabilities for version 3.2]
2018/11/09 05:03:05.893 log.go:86: [info] etcdserver/membership: [added member 4b38fa5a0dce5ba5 [http://demo-pd-1.demo-pd-peer.tidb.svc:2380] to cluster 2bcb211fc6dc76bd]
2018/11/09 05:03:05.893 log.go:86: [info] rafthttp: [starting peer 4b38fa5a0dce5ba5...]
2018/11/09 05:03:05.893 log.go:86: [info] rafthttp: [started HTTP pipelining with peer 4b38fa5a0dce5ba5]
2018/11/09 05:03:05.895 log.go:86: [info] rafthttp: [started streaming with peer 4b38fa5a0dce5ba5 (writer)]
2018/11/09 05:03:05.896 log.go:86: [info] rafthttp: [started streaming with peer 4b38fa5a0dce5ba5 (writer)]
2018/11/09 05:03:05.896 log.go:86: [info] rafthttp: [started peer 4b38fa5a0dce5ba5]
2018/11/09 05:03:05.896 log.go:86: [info] rafthttp: [started streaming with peer 4b38fa5a0dce5ba5 (stream MsgApp v2 reader)]
2018/11/09 05:03:05.896 log.go:86: [info] rafthttp: [added peer 4b38fa5a0dce5ba5]
2018/11/09 05:03:05.896 log.go:86: [info] rafthttp: [started streaming with peer 4b38fa5a0dce5ba5 (stream Message reader)]
2018/11/09 05:03:10.889 log.go:86: [info] raft: [b798afebf07ff3aa is starting a new election at term 282]
2018/11/09 05:03:10.889 log.go:86: [info] raft: [b798afebf07ff3aa became candidate at term 283]
2018/11/09 05:03:10.889 log.go:86: [info] raft: [b798afebf07ff3aa received MsgVoteResp from b798afebf07ff3aa at term 283]
2018/11/09 05:03:10.889 log.go:86: [info] raft: [b798afebf07ff3aa [logterm: 2, index: 44] sent MsgVote request to 4b38fa5a0dce5ba5 at term 283]
2018/11/09 05:03:10.896 log.go:82: [warning] rafthttp: [health check for peer 4b38fa5a0dce5ba5 could not connect:
I see kube-dns is in CrashLoopBackOff; what does the kube-dns log show?
kube-system kube-dns-64d6979467-6sv55 2/3 CrashLoopBackOff 7 3h
kubectl logs -f kube-dns-64d6979467-6sv55 -n kube-system -c kubedns
I1109 04:56:45.750242       1 dns.go:48] version: 1.14.8
I1109 04:56:45.751468       1 server.go:71] Using configuration read from directory: /kube-dns-config with period 10s
I1109 04:56:45.751510       1 server.go:119] FLAG: --alsologtostderr="false"
I1109 04:56:45.751515       1 server.go:119] FLAG: --config-dir="/kube-dns-config"
I1109 04:56:45.751518       1 server.go:119] FLAG: --config-map=""
I1109 04:56:45.751519       1 server.go:119] FLAG: --config-map-namespace="kube-system"
I1109 04:56:45.751521       1 server.go:119] FLAG: --config-period="10s"
I1109 04:56:45.751590       1 server.go:119] FLAG: --dns-bind-address="0.0.0.0"
I1109 04:56:45.751644       1 server.go:119] FLAG: --dns-port="10053"
I1109 04:56:45.751649       1 server.go:119] FLAG: --domain="cluster.local."
I1109 04:56:45.751652       1 server.go:119] FLAG: --federations=""
I1109 04:56:45.751658       1 server.go:119] FLAG: --healthz-port="8081"
I1109 04:56:45.751660       1 server.go:119] FLAG: --initial-sync-timeout="1m0s"
I1109 04:56:45.751772       1 server.go:119] FLAG: --kube-master-url=""
I1109 04:56:45.751819       1 server.go:119] FLAG: --kubecfg-file=""
I1109 04:56:45.751838       1 server.go:119] FLAG: --log-backtrace-at=":0"
I1109 04:56:45.751842       1 server.go:119] FLAG: --log-dir=""
I1109 04:56:45.751844       1 server.go:119] FLAG: --log-flush-frequency="5s"
I1109 04:56:45.751860       1 server.go:119] FLAG: --logtostderr="true"
I1109 04:56:45.751862       1 server.go:119] FLAG: --nameservers=""
I1109 04:56:45.751991       1 server.go:119] FLAG: --stderrthreshold="2"
I1109 04:56:45.752009       1 server.go:119] FLAG: --v="2"
I1109 04:56:45.752146       1 server.go:119] FLAG: --version="false"
I1109 04:56:45.752167       1 server.go:119] FLAG: --vmodule=""
I1109 04:56:45.752511       1 server.go:201] Starting SkyDNS server (0.0.0.0:10053)
I1109 04:56:45.753077       1 server.go:220] Skydns metrics enabled (/metrics:10055)
I1109 04:56:45.753108       1 dns.go:146] Starting endpointsController
I1109 04:56:45.753113       1 dns.go:149] Starting serviceController
I1109 04:56:45.753320       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I1109 04:56:45.753423       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I1109 04:56:46.254556       1 dns.go:170] Initialized services and endpoints from apiserver
I1109 04:56:46.254667       1 server.go:135] Setting up Healthz Handler (/readiness)
I1109 04:56:46.254689       1 server.go:140] Setting up cache handler (/cache)
I1109 04:56:46.254698       1 server.go:126] Status HTTP port 8081
I1109 04:58:25.145183       1 server.go:160] Ignoring signal terminated (can only be terminated by SIGKILL)
This is mainly caused by DNS instability while the cluster was bootstrapping. It's a separate problem, so we should open another issue to track it.
It seems the DNS problem is related to #126; it's a known bug that is fixed in the latest version of PD.
Are we waiting for a new release of PD?
We can run with the current version of PD, but it's not very stable, especially when bootstrapping over an unstable network. This mostly happens in a DinD environment.
How should @kirinse fix his issue? Can he upgrade to 2.1?
The PD fix (https://github.com/pingcap/pd/pull/1279) hasn't been cherry-picked into v2.1.0-rc.4, so currently only the latest tag contains this fix.
@tennix can you recommend a set of image tags for @kirinse to use?
Currently, no versioned tag contains this fix; only the latest tag does.
That's for PD. For tidb and tikv, I think you would want to use the more stable v2.1.0-rc.4 tag.
So, no solution for now?
@kirinse We're sorry about these issues; they are caused by upstream programs, so they're a bit slower to get fixed in tidb-operator. But there are some workarounds you can try now.
For the scheduling issue, you can delete and recreate the cluster. If you're lucky, all the pods will be scheduled correctly. If not, there is another workaround: set schedulerName to default in charts/tidb-cluster/values.yaml. This disables HA scheduling, so multiple PD or TiKV pods may be scheduled to the same node, but that should be fine for a DinD test.
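For reference, the change would look roughly like this (the exact position of the key in the chart may differ by tidb-operator version, so treat this as a sketch, not the canonical values.yaml layout):

```yaml
# charts/tidb-cluster/values.yaml (sketch)
# Use the default kube-scheduler instead of the tidb-scheduler extender.
# This drops the HA constraint ("the first 3 pods can't be scheduled to
# the same node"), so PD/TiKV replicas may land on one node -- acceptable
# for a DinD test, but not for production.
schedulerName: default
```

After editing, apply the change with `helm upgrade` (or delete and reinstall the release) so the StatefulSets pick up the new scheduler name.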
For the PD pod bootstrap error, you can try the latest PD Docker image.
@tennix this GitHub issue is long and now covers three separate problems. Can we also create a separate issue for the scheduler problem?
@kirinse here is the config change to use the latest PD image. The PD fix is going into RC5, which should be available soon.
diff --git a/charts/tidb-cluster/values.yaml b/charts/tidb-cluster/values.yaml
index b5cf2c8..a31dd1b 100644
--- a/charts/tidb-cluster/values.yaml
+++ b/charts/tidb-cluster/values.yaml
@@ -30,3 +30,3 @@ pd:
replicas: 3
- image: pingcap/pd:v2.0.7
+ image: pingcap/pd:latest
logLevel: info
@@ -72,3 +72,3 @@ tikv:
replicas: 3
- image: pingcap/tikv:v2.0.7
+ image: pingcap/tikv:v2.1.0-rc.4
logLevel: info
@@ -120,3 +120,3 @@ tidb:
# password: "admin"
- image: pingcap/tidb:v2.0.7
+ image: pingcap/tidb:v2.1.0-rc.4
# Image pull policy.
RC5 is available for all three components now.
I followed the instructions in local-dind-tutorial.md.
kubectl describe pod demo-pd-2 -n tidb