yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

Test: fault tolerance at master side #8645

Open ttyusupov opened 3 years ago

ttyusupov commented 3 years ago

Jira Link: DB-2359

From https://phabricator.dev.yugabyte.com/D8253#243515:

"We should also consider some unhappy paths:

We'll have some tablets in CREATING state, which have their table_id set, but the split does not get retried. For a timeout, they're also not added to the TableInfo object at all. For the failover case, I believe they'll automatically be added to the tablet, using AddTablet (in the tablet loader code)."

bmatican commented 2 years ago

@SrivastavaAnubhav can you chime in if we have coverage by now for all of these?

SrivastavaAnubhav commented 2 years ago

These should be handled by automatic tablet splitting, if it's enabled (but it's not enabled by default yet).

bmatican commented 2 years ago

@SrivastavaAnubhav oh, are you saying that if automatic splitting is turned off, but a user does manual splits, we might not be re-triggering splits after master failover, if needed? If so, it might be worth discussing in next week's sync, as that seems a bit off.

SrivastavaAnubhav commented 2 years ago

Yeah, we should have some common method that gets called on restart.
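
To make the idea concrete, here is a minimal toy sketch in Python of what such a common restart hook could look like. It is not YugabyteDB code: `Tablet`, `TableInfo`, `add_tablet`, `split_parent_id`, and `recover_pending_splits` are all hypothetical names invented for illustration, and the real master data structures differ. The sketch only models the unhappy path quoted above: children stuck in CREATING get registered with their table, and the parents whose split needs a retry are returned.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Set


class TabletState(Enum):
    CREATING = "CREATING"
    RUNNING = "RUNNING"


@dataclass
class Tablet:
    # Hypothetical stand-in for a tablet record loaded from the catalog.
    tablet_id: str
    table_id: str
    state: TabletState
    split_parent_id: Optional[str] = None  # set only on split children


@dataclass
class TableInfo:
    # Hypothetical stand-in for the in-memory per-table object.
    table_id: str
    tablet_ids: Set[str] = field(default_factory=set)

    def add_tablet(self, tablet_id: str) -> None:
        self.tablet_ids.add(tablet_id)


def recover_pending_splits(
    tablets: List[Tablet], tables: Dict[str, TableInfo]
) -> List[str]:
    """Assumed hook, run once after a master restart or failover.

    Registers split children that are still CREATING with their table and
    returns the sorted ids of parent tablets whose split should be retried.
    """
    parents_to_retry = set()
    for tablet in tablets:
        is_pending_child = (
            tablet.state is TabletState.CREATING
            and tablet.split_parent_id is not None
        )
        if not is_pending_child:
            continue
        # Make sure the child is visible through its table, regardless of
        # whether the split was triggered manually or automatically.
        tables[tablet.table_id].add_tablet(tablet.tablet_id)
        parents_to_retry.add(tablet.split_parent_id)
    return sorted(parents_to_retry)
```

The point of the sketch is that the hook does not check whether automatic splitting is enabled, so manually triggered splits would be picked up by the same path.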