yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.88k stars 1.05k forks source link

[DocDB] Remove initial load balancer delay for new master leaders #16287

Open hulien22 opened 1 year ago

hulien22 commented 1 year ago

Jira Link: DB-5713

Description

Currently we have load_balancer_initial_delay_secs (default of 120s) which delays the LB from starting upon a new master being elected, but this should not be needed anymore with recent LB improvements/hardening.

From @bmatican (https://yugabyte.slack.com/archives/C4K1838GL/p1677788431192569?thread_ts=1677714909.572269&cid=C4K1838GL):

[this was a] workaround to ensure we don’t start LBing before we get all necessary heartbeats But since then, I believe we’ve added some LB defense, to stop LBing, if during tablet map analysis, we detect any TS in raft groups, that we don’t have heartbeats from — I suspect that’s a good enough protection that we might be able to remove/relax that 120s?

mdbridge commented 1 year ago

I searched our code base for load_balancer_initial_delay_secs, and found a declared Google flag but no actual uses as far as I can tell except in tests.

Thus, I think this flag does nothing and the issue is already "done".

Shall I deprecate this flag using?

// In order to mark a flag as deprecated, use this macro:
//   DEPRECATE_FLAG(int32, foo_flag, "10_2022")
// This will print a warning at startup if a process has been configured with the specified flag or
// any time the flags are updated at runtime with this flag.
//
// The third argument is a date in the format of "MM_YYYY" to make it easy to track when a flag was
// deprecated, so we may fully remove declarations in future releases.
// Flags are set to a dummy value
#define DEPRECATE_FLAG(type, name, date_mm_yyyy) \
hulien22 commented 1 year ago

@mdbridge see src/yb/master/catalog_manager_bg_tasks.cc:

      // Do the LB enabling check
      if (!processed_tablets) {
        if (catalog_manager_->TimeSinceElectedLeader() >
            MonoDelta::FromSeconds(FLAGS_load_balancer_initial_delay_secs)) {
          catalog_manager_->load_balance_policy_->RunLoadBalancer();
        }
      }
mdbridge commented 1 year ago

opps. A voice-o caused my search to be restricted to word boundaries. :-( Thanks, @hulien22.

rahuldesirazu commented 1 year ago

One problem to consider: It is possible that for a given tablet, that all its peers have heartbeated in. The LB will work on this tablet, even if the other tservers haven't heartbeated in as yet.

To address this, TabletLoader logic can loop through the each tablet's committed consensus and register tservers in the consensus as dead. As each tserver comes up, catalog manager will mark it as alive. In the bg thread, check the liveness of all tservers. There are two situations:

  1. All the tservers are alive (they all heartbeated in). It is safe to enable LB.
  2. One tserver is slow or dead and not heartbeating in. Then use the current logic of waiting for 120s before starting LB.

Some experiments/metrics will need to be checked to ensure this doesn't slow down the loading phase too much.

mdbridge commented 1 year ago

(the discussion thread that led to Rahul's proposal discussion thread that led to Rahul's proposal: https://yugabyte.slack.com/archives/C03ULJYE0KG/p1680627506605999)

mdbridge commented 1 year ago

@rahuldesirazu can you please create a GitHub issue to implement your improved load balancer defense and make this issue depend on that one? Probably assign to @Sandeep for triaging.

@Sandeep I'm assigning this to you for triaging as it's no longer an eligible starter task.