tikv / pd

Placement driver for TiKV
Apache License 2.0

[dr-autosync] Regions can't be balanced after scaling out TiKV nodes in the primary data center in async mode #7266

Open mayjiang0203 opened 1 year ago

mayjiang0203 commented 1 year ago

Bug Report

What did you do?

  1. Take down 2 TiKV nodes in 2 zones in the backup data center, so that the replication mode switches to async mode;
  2. Scale out 3 TiKV nodes across the 3 zones in the primary data center.

What did you expect to see?

Regions should be balanced between the newly scaled-out TiKV nodes and the existing TiKV nodes in the primary data center.

What did you see instead?

(two screenshots attached in the original issue)

What version of PD are you using (pd-server -V)?

sh-4.2# /tiup/deploy/pd-2379/bin/pd-server -V
Release Version: v6.5.0-latest1025
Edition: Community
Git Commit Hash: e0ab17c4c684cdffd87b20d0d8486bbc81a60b1d
Git Branch: heads/refs/tags/v6.5.0-latest1025
UTC Build Time: 2023-10-25 05:05:51

mayjiang0203 commented 1 year ago

/severity major /assign @disksing

disksing commented 1 year ago

Due to a limitation in PD, a Region that has a down peer cannot be balanced. We can update the placement rules to first remove the down peers in the DR zone; after that, the Regions can be balanced.
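To make the placement-rule change concrete, here is a minimal sketch of one way to derive the trimmed rule set. It assumes the rules have already been exported as JSON (for example with pd-ctl's `config placement-rules` subcommands) and that DR stores carry a label such as `dc=dr`. The label key `dc`, the values `primary` and `dr`, the rule IDs, and the helper name `drop_dr_rules` are all hypothetical; the dictionary fields follow the placement-rule JSON layout as I understand it and should be checked against the actual cluster.

```python
import json


def drop_dr_rules(rules, label_key, dr_values):
    """Return a new list without the rules that are pinned to DR label values.

    A rule counts as "pinned to DR" when one of its label constraints uses
    the "in" operator on label_key and every listed value is a DR value.
    """
    dr_values = set(dr_values)
    kept = []
    for rule in rules:
        pinned_to_dr = any(
            c.get("key") == label_key
            and c.get("op") == "in"
            and c.get("values")
            and set(c["values"]) <= dr_values
            for c in rule.get("label_constraints", [])
        )
        if not pinned_to_dr:
            kept.append(rule)
    return kept


# Hypothetical rule set: 2 voters in the primary data center, 1 in DR.
example_rules = [
    {
        "group_id": "pd",
        "id": "primary",
        "role": "voter",
        "count": 2,
        "label_constraints": [{"key": "dc", "op": "in", "values": ["primary"]}],
    },
    {
        "group_id": "pd",
        "id": "dr",
        "role": "voter",
        "count": 1,
        "label_constraints": [{"key": "dc", "op": "in", "values": ["dr"]}],
    },
]

kept = drop_dr_rules(example_rules, "dc", ["dr"])
print(json.dumps(kept, indent=2))
```

The printed JSON is what would be saved back to PD. Once no rule requires a replica in the DR zone, PD should treat the down peers there as extra and remove them, which is the precondition for balancing described above. The input list is left unmodified so the original rules can be restored when the DR zone comes back.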

disksing commented 1 year ago

If we want to create new peers to replace the down peers, it is better to add new TiKV nodes in the DR zone and change the replication mode to Majority.
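As a rough illustration of the mode switch, the sketch below only assembles the pd-ctl invocation and does not run it. It assumes the replication mode is changed through the `replication-mode.replication-mode` config item with the value `majority`; the PD address and the helper name `replication_mode_command` are hypothetical, and the exact command should be verified against the pd-ctl version in use.

```python
ALLOWED_MODES = {"majority", "dr-auto-sync"}


def replication_mode_command(pd_addr, mode):
    """Build the argument list for a pd-ctl call that sets the replication mode."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported replication mode: {mode!r}")
    return [
        "pd-ctl",
        "-u",
        pd_addr,
        "config",
        "set",
        "replication-mode.replication-mode",
        mode,
    ]


# Hypothetical PD endpoint; replace with the real one before running anything.
cmd = replication_mode_command("http://127.0.0.1:2379", "majority")
print(" ".join(cmd))
```

Returning a list instead of a shell string keeps the arguments unambiguous if the command is later passed to `subprocess.run`.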