pingcap / tidb-operator

TiDB operator creates and manages TiDB clusters running in Kubernetes.
https://docs.pingcap.com/tidb-in-kubernetes/
Apache License 2.0

Google Cloud TiDB cluster deployment fails on kubernetes versions >1.9.7-gke.6 #129

Closed · jaboatman closed 5 years ago

jaboatman commented 5 years ago

Following the google-kubernetes-tutorial on a Kubernetes cluster running version 1.10.7-gke.6, everything works fine up to the point of deploying the TiDB cluster:

helm install ./charts/tidb-cluster -n tidb --namespace=tidb --set pd.storageClassName=pd-ssd,tikv.storageClassName=pd-ssd

Then,

watch kubectl get pods --namespace tidb -o wide
NAME                              READY     STATUS      RESTARTS   AGE       IP            NODE
demo-monitor-5bc85fdb7f-cllxw     2/2       Running     0          5m        10.16.2.81    gke-xxxxx-default-pool-4f916ebe-lss5
demo-monitor-configurator-zssv6   0/1       Completed   0          5m        10.16.0.164   gke-xxxxx-default-pool-4f916ebe-1mlb
demo-pd-0                         0/1       Pending     0          5m        <none>        <none>

It hangs like this indefinitely, waiting for the PVC to bind. The pd-demo-pd-0 PVC is stuck in a "Pending" state with the message "waiting for first consumer to be created before binding".
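For reference, the stuck PVC and the storage class binding mode can be inspected with kubectl (object names are taken from the output above; exact output will vary):

kubectl describe pvc pd-demo-pd-0 --namespace tidb
kubectl get storageclass pd-ssd -o yaml | grep volumeBindingMode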

This may be an issue with kubernetes itself.

gregwebs commented 5 years ago

We were just looking at this today. Can you remove volumeBindingMode: "WaitForFirstConsumer" from manifests/gke-storage.yml and apply it again?
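After that edit, the storage class should end up looking roughly like this. This is only a sketch: the provisioner and parameters shown are assumptions based on the standard GCE PD class, not copied verbatim from manifests/gke-storage.yml:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pd-ssd
provisioner: kubernetes.io/gce-pd   # assumed; check manifests/gke-storage.yml for the actual values
parameters:
  type: pd-ssd
# volumeBindingMode: "WaitForFirstConsumer" removed, so the class falls back to the default
# Immediate mode and volumes are provisioned as soon as the PVC is created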

jaboatman commented 5 years ago

Thanks for the quick response. Yes, that seems to have worked; the cluster was able to start on 1.10.7-gke.6.

Is there any downside to not having volumeBindingMode: "WaitForFirstConsumer"?

gregwebs commented 5 years ago

I believe the setting is there to avoid creating volumes prematurely, but I don't think it's needed here, and we are going to remove it.

tennix commented 5 years ago

From the Kubernetes documentation:

For storage backends that are topology-constrained and not globally accessible from all Nodes in the cluster, PersistentVolumes will be bound or provisioned without knowledge of the Pod’s scheduling requirements. This may result in unschedulable Pods.

But the GCE PD we are using here is not topology-constrained: we are not using the zonal or regional GCE PDs that need topology-aware scheduling and the WaitForFirstConsumer binding mode. With that binding mode, scheduling may get stuck in a deadlock (the pod waits for the PV and the PV waits for the pod). A sketch of the contrasting case follows.
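For contrast, WaitForFirstConsumer is the mode you would want for a topology-constrained class such as a regional PD. This is only an illustrative sketch, not something this chart ships:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: regional-pd-ssd                      # illustrative name, not part of this chart
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  replication-type: regional-pd              # the disk is replicated across two zones
volumeBindingMode: WaitForFirstConsumer      # delay binding until the pod is scheduled, so the zones are known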

I'll send a PR to fix this.