Closed palaevren closed 3 years ago
Hi,
This is the expected behavior. See http://docs.citusdata.com/en/v10.2/admin_guide/cluster_management.html?highlight=rebalance#rebalance-shards-without-downtime for the non-blocking rebalancer you need to use Azure Hyperscale(Citus) or Citus enterprise.
Hi,
I have a distributed table (which is also partitioned by daily range), let's say table_a. Replication factor is 2 for the table.
I recently realized that shardstate of a placement is 3, as well as its all daily partitions (table_a_p2021_09_26_XXX, table_a_p2021_09_27_XXX, etc.).
In order to repair the corrupted shard from its healty replica, I run citus_copy_shard_placement function:
citus_copy_shard_placement(XXX, 'node1', 5432, 'node2', 5432);
It takes a while as the shard contains many rows. However, as long as the function runs, COPY commands loading data to other tables get locked.
When I monitor blocked and blocking processes as described here, I see results like below:
blocking statement: select_copy_shard_placement($1, $2, $3, $4, $5); blocked statement: COPY table_b (...) FROM .... .....
blocking statement: select_copy_shard_placement($1, $2, $3, $4, $5); blocked statement: COPY table_c (...) FROM .... ...