pingcap / tidb-operator

TiDB operator creates and manages TiDB clusters running in Kubernetes.
https://docs.pingcap.com/tidb-in-kubernetes/
Apache License 2.0

Proposal: Parallelize control-loops of same TiDB cluster #1245

Open aylei opened 4 years ago

aylei commented 4 years ago

Feature Request

Is your feature request related to a problem? Please describe: We have already parallelized our control loops across different TiDB clusters, and the workQueue ensures that different workers won't sync the same TidbCluster at the same time. However, for a specific TiDB cluster, all reconcile functions run sequentially, which increases the risk of subsequent operations being blocked by the failure or lag of one step. For example, a rolling update of TiKV may take a relatively long time, and the operator cannot perform failover for tidb-servers during that time because the sync is sequential.

Describe the feature you'd like:

Try to break tidb_cluster_controller into several sub-controllers that run in parallel.

Teachability, Documentation, Adoption, Migration Strategy:

This is a notable change, and it is safer to target it for v1.2.0.

Yisaer commented 4 years ago

Currently we reconcile all the components in a sequential way. Parallelizing control-loops is a nice choice.

Just one idea:

We could divide the current loop into several parallel loops by component type. PD / TiKV / TiDB could each have their own loop and be reconciled in parallel, while the service, statefulset and status of each component would still be reconciled sequentially within that component's own loop.

github-actions[bot] commented 4 years ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 15 days

DanielZhangQD commented 3 years ago

At least upgrading or scaling of one component should not block the scaling of the other components.