Open bmatican opened 4 years ago
This might be useful: https://github.com/apache/kudu/commit/5f7823fe7f94edeb0f8dbb2b9d7a2201614e5e16
@lingamsandeep Note: this is the behavior I was referencing as part of the auto scale work. It might be an issue with how the master tracks tablet peer state: technically via the consensus state, but in practice by updating an in-memory cached map on each heartbeat from each peer.
Jira Link: DB-2145

We've seen a number of scenarios where the master's in-memory state ends up differing from the TS-side consensus information. The master state should technically be updated with the correct info from the TS side, via heartbeats. In practice, however, the master keeps an in-memory cache on top of the already serialized consensus information, and that cache can sometimes be updated incorrectly (TBD example tasks).
We could try to switch away from this extra in-memory cache altogether and simply rely on the TabletInfo committed consensus state, which is already backed by the copy-on-write objects in the master.
The one caveat right now seems to be the transient tablet state, based on bootstrap information (i.e. NOT_STARTED / BOOTSTRAPPING / RUNNING), which we use to determine whether the load balancer can or should trigger another remote bootstrap. One option here could be to still piggyback on the consensus information and just change the semantics of what is a `running` vs `starting` tablet on the load balancer side. We could change it so that only PRE_VOTER members of the config are deemed `starting`, whereas once a member is promoted to VOTER, it is considered `running`.

cc @rahuldesirazu @hectorgcr @nspiegelberg
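
To make the proposed semantics concrete, here is a minimal sketch of deriving the load balancer's view from the committed consensus config alone. It is illustrative Python, not the master's actual code: `MemberType`, `LbTabletState`, `classify_peer`, `count_starting`, and the `max_starting` limit are all hypothetical names, and only the two member types mentioned above are modeled.

```python
from enum import Enum


class MemberType(Enum):
    """Raft member types discussed in this issue (others are not modeled)."""
    PRE_VOTER = "PRE_VOTER"
    VOTER = "VOTER"


class LbTabletState(Enum):
    """Hypothetical load-balancer view of a tablet peer."""
    STARTING = "starting"
    RUNNING = "running"


def classify_peer(member_type: MemberType) -> LbTabletState:
    # Proposed rule: PRE_VOTER means the peer is still starting;
    # once promoted to VOTER it is considered running.
    if member_type is MemberType.PRE_VOTER:
        return LbTabletState.STARTING
    return LbTabletState.RUNNING


def count_starting(config: dict[str, MemberType]) -> int:
    """Count peers in a committed config that are still starting.

    `config` maps a peer UUID to its member type (an assumed shape,
    standing in for the committed consensus state on TabletInfo).
    """
    return sum(
        1 for member_type in config.values()
        if classify_peer(member_type) is LbTabletState.STARTING
    )


def can_trigger_remote_bootstrap(config: dict[str, MemberType],
                                 max_starting: int = 1) -> bool:
    # Hypothetical policy: allow another remote bootstrap only while the
    # number of starting peers is below an assumed limit.
    return count_starting(config) < max_starting
```

With this rule, a config such as `{"a": VOTER, "b": VOTER, "c": PRE_VOTER}` has one starting peer, so the default limit of one would block a further remote bootstrap until `c` is promoted.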