[docdb] Master: move away from the tablet replica map to using the heartbeat consensus state directly

bmatican commented 4 years ago

Jira Link: DB-2145

We've seen a number of scenarios where the master's in-memory state ends up diverging from the TS-side consensus information. The master state should technically be updated with the correct info from the TS side, via heartbeats. However, in practice, the master keeps an in-memory cache on top of the already serialized consensus information, and that cache can sometimes be wrongly updated (TBD example tasks).

We could try to switch away from using this extra in-memory cache altogether and simply rely on the TabletInfo committed consensus state, which is already backed by the copy-on-write objects in the master!

The one caveat right now seems to be the transient tablet state, based on bootstrap information (i.e., NOT_STARTED / BOOTSTRAPPING / RUNNING), which we use to determine if the load balancer can or should trigger another remote bootstrap. One option here could be to still piggyback on the consensus information, and just change the semantics of what counts as a running vs. starting tablet on the load balancer side. We could change it so that only PRE_VOTER members of the config are deemed starting, whereas once a member is promoted to VOTER, it is considered running.
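To make the proposed semantics concrete, here is a minimal sketch in Python (not the actual master code). It derives the starting/running view purely from the members of a committed Raft config, with no separate replica map. All names here (`RaftPeer`, `replica_state`, `starting_replicas`, `can_start_remote_bootstrap`, `max_starting`) are hypothetical and only for illustration; member types other than PRE_VOTER and VOTER are left out, and the throttle rule is an assumption about how the load balancer might consume this view.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class MemberType(Enum):
    # Only the two member types discussed in this issue are modeled.
    PRE_VOTER = "PRE_VOTER"
    VOTER = "VOTER"


class ReplicaState(Enum):
    STARTING = "STARTING"
    RUNNING = "RUNNING"


@dataclass(frozen=True)
class RaftPeer:
    """Hypothetical stand-in for one member of the committed consensus config."""
    uuid: str
    member_type: MemberType


def replica_state(peer: RaftPeer) -> ReplicaState:
    # Proposed rule: PRE_VOTER means starting, VOTER means running.
    if peer.member_type is MemberType.VOTER:
        return ReplicaState.RUNNING
    return ReplicaState.STARTING


def starting_replicas(config: List[RaftPeer]) -> List[str]:
    # Derived on demand from the committed config, not from a cached map.
    return [p.uuid for p in config if replica_state(p) is ReplicaState.STARTING]


def can_start_remote_bootstrap(config: List[RaftPeer], max_starting: int) -> bool:
    # Assumed throttle: allow another remote bootstrap only while the number
    # of starting replicas is below a configured limit.
    return len(starting_replicas(config)) < max_starting


if __name__ == "__main__":
    committed = [
        RaftPeer("ts-1", MemberType.VOTER),
        RaftPeer("ts-2", MemberType.VOTER),
        RaftPeer("ts-3", MemberType.PRE_VOTER),
    ]
    print(starting_replicas(committed))
    print(can_start_remote_bootstrap(committed, max_starting=1))
```

Because the state is computed from the committed config each time it is needed, there is no second copy that a heartbeat could update incorrectly. The trade-off is that the finer-grained NOT_STARTED / BOOTSTRAPPING distinction is no longer visible to the load balancer.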

cc @rahuldesirazu @hectorgcr @nspiegelberg

bmatican commented 4 years ago

This might be useful: https://github.com/apache/kudu/commit/5f7823fe7f94edeb0f8dbb2b9d7a2201614e5e16

bmatican commented 2 years ago

@lingamsandeep Note: this is the behavior I was referencing as part of the auto scale work. It might be an issue with how the master tracks tablet peer state: technically via the consensus state, but in practice by updating an in-memory cached map on each heartbeat from each peer.

yugabyte / yugabyte-db

[docdb] Master: move away from the tablet replica map to using the heartbeat consensus state directly #5305