Why are we implementing it? (sales eng)
By default, a rebalance operation carries out shard moves sequentially. In some cases, customers may prefer to rebalance faster at the expense of using more resources, such as network bandwidth. In those situations, customers can configure a rebalance operation to perform a number of shard moves in parallel.
What are the typical use cases?
Scaling out faster when adding new nodes to the cluster
Rebalancing the cluster faster to even out the utilization of nodes
Communication goals (e.g. detailed howto vs orientation)
Good locations for content in docs structure: the parallel rebalancing capability can be mentioned in the https://docs.citusdata.com/en/v11.2/admin_guide/cluster_management.html#scaling-the-cluster section.
A new GUC, citus.max_background_task_executors_per_node, is introduced as of 11.3.0. Its default value is 1. This GUC should be mentioned in the list of GUCs.
How does this work? (devs)
The citus_rebalance_start UDF schedules shard moves as background tasks by creating a task entry for every planned move in the pg_catalog.pg_dist_background_task table. A task may have dependencies on other tasks, which are defined in the pg_catalog.pg_dist_background_task_depend table.
The background tasks are monitored and executed by the Citus maintenance daemon. The task monitor maintains the state of the tasks and makes them runnable once all the tasks they depend on have completed.
We added a new GUC, citus.max_background_task_executors_per_node, which determines how many shard moves from/to a node can be executed in parallel at a given time. A new column, nodes_involved, is added to the pg_catalog.pg_dist_background_task table. This column is an array of [source_node_id, destination_node_id] for the corresponding shard move. The background task monitor uses this information to track the nodes involved in a shard move.
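For instance, after a rebalance has been scheduled, the planned moves and their dependencies can be inspected directly from these catalog tables. A minimal sketch (column names such as task_id and status reflect the catalog as of Citus 11.3; verify against your version):

```sql
-- List the scheduled shard-move tasks and the nodes each one touches.
SELECT task_id, status, nodes_involved
FROM pg_catalog.pg_dist_background_task
ORDER BY task_id;

-- Show which task(s) each task waits on before it can become runnable.
SELECT task_id, depends_on
FROM pg_catalog.pg_dist_background_task_depend
ORDER BY task_id;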
Example SQL
Configure the number of parallel moves:
ALTER SYSTEM SET citus.max_background_task_executors_per_node = 2;
SELECT pg_reload_conf();
Start background rebalancer:
SELECT citus_rebalance_start();
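Monitor the background rebalance (these companion UDFs ship alongside citus_rebalance_start; a sketch, assuming a recent Citus version):

```sql
-- Check the progress of the background rebalance.
SELECT * FROM citus_rebalance_status();

-- Block until the rebalance finishes (useful in scripts).
SELECT citus_rebalance_wait();
```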
Corner cases, gotchas
The citus.max_background_task_executors value limits the overall number of parallel task executors, regardless of the nodes involved.
Shards in the same colocation group will always move sequentially.
The shard moves are planned in an order, and we respect that order when a node has outgoing shard moves before incoming ones: an incoming shard move has a dependency on every outgoing shard move on the same node.
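The incoming-depends-on-outgoing rule above can be observed by joining the dependency table back to the task table. A sketch for illustration (column names assumed as in the catalog tables described earlier):

```sql
-- For each dependency edge, show the node arrays of both tasks. A move
-- into a node waits for any earlier-planned move out of that same node.
SELECT d.task_id          AS waiting_task,
       t1.nodes_involved  AS waiting_nodes,
       d.depends_on       AS blocking_task,
       t2.nodes_involved  AS blocking_nodes
FROM pg_catalog.pg_dist_background_task_depend d
JOIN pg_catalog.pg_dist_background_task t1 ON t1.task_id = d.task_id
JOIN pg_catalog.pg_dist_background_task t2 ON t2.task_id = d.depends_on;
```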
Are there relevant blog posts or outside documentation about the concept/feature?
Link to relevant commits and regression tests if applicable
PR https://github.com/citusdata/citus/pull/6756 : Schedule parallel shard moves in background rebalancer by removing task dependencies between shard moves across colocation groups.
PR https://github.com/citusdata/citus/pull/6771 : Adds control for background task executors involving a node