yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

[DocDB] yb-admin command may return messy results during leader blacklisting completion check #12395

Open qvad opened 2 years ago

qvad commented 2 years ago

Jira Link: DB-1509

Description

Not reproducible every time. Observed on a master universe with the following flags:

        instance_type='c5.xlarge',
        master_gflags={
            "load_balancer_max_concurrent_moves_per_table": "1",
            "load_balancer_max_concurrent_moves": "1",
            "ysql_num_shards_per_tserver": "50",
            "yb_num_shards_per_tserver": "50",
        },
        tserver_gflags={
            "ysql_num_shards_per_tserver": "50",
            "yb_num_shards_per_tserver": "50",
        }

Note that 2509 looks like the correct total, but sometimes we observe 3523 and other values. The command is always run on the same node.

yb-admin get_leader_blacklist_completion --master_addresses 172.151.27.112,172.151.31.78,172.151.29.81 2>&1
Percent complete = 48.5054 : 1292 remaining out of 2509
...
yb-admin get_leader_blacklist_completion --master_addresses 172.151.27.112,172.151.31.78,172.151.29.81 2>&1
Percent complete = 0 : 3523 remaining out of 3523
...
yb-admin get_leader_blacklist_completion --master_addresses 172.151.27.112,172.151.31.78,172.151.29.81 2>&1
Percent complete = 0 : 2509 remaining out of 2509
...
yb-admin get_leader_blacklist_completion --master_addresses 172.151.27.112,172.151.31.78,172.151.29.81 2>&1
Percent complete = 100 : 0 remaining out of 2509
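
To make the inconsistency easier to spot when polling repeatedly, here is a minimal sketch that parses the output lines shown above and flags any sample whose total differs from the most common one. The helper names (`parse_completion`, `find_inconsistent_totals`) are hypothetical and not part of yb-admin; the only assumption about the tool is the `Percent complete = X : N remaining out of M` line format seen in this report.

```python
import re
from collections import Counter

# Matches lines such as "Percent complete = 48.5054 : 1292 remaining out of 2509".
# The format is taken from the output pasted in this issue, not from yb-admin docs.
LINE_RE = re.compile(
    r"Percent complete = (?P<pct>[\d.]+) : "
    r"(?P<remaining>\d+) remaining out of (?P<total>\d+)"
)


def parse_completion(line):
    """Return (percent, remaining, total), or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return float(match["pct"]), int(match["remaining"]), int(match["total"])


def find_inconsistent_totals(lines):
    """Return (index, total) for samples whose total differs from the most common total."""
    parsed = [(i, parse_completion(line)) for i, line in enumerate(lines)]
    totals = [sample[2] for _, sample in parsed if sample is not None]
    if not totals:
        return []
    expected = Counter(totals).most_common(1)[0][0]
    return [(i, sample[2]) for i, sample in parsed
            if sample is not None and sample[2] != expected]


if __name__ == "__main__":
    # The four samples reported above, in order.
    samples = [
        "Percent complete = 48.5054 : 1292 remaining out of 2509",
        "Percent complete = 0 : 3523 remaining out of 3523",
        "Percent complete = 0 : 2509 remaining out of 2509",
        "Percent complete = 100 : 0 remaining out of 2509",
    ]
    print(find_inconsistent_totals(samples))
```

Treating the most common total as the expected one is only a heuristic for triage; it does not establish which value is actually correct.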

Logs are too large to attach via GitHub.

bmatican commented 2 years ago

@lingamsandeep Might be a good starter task to investigate and get used to setting up a cluster, yb-admin tooling, master endpoints & LBing, etc.