yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

[xCluster] DDL Replication - Handle failovers when DDLs are being replicated #23957

Open hulien22 opened 2 months ago

hulien22 commented 2 months ago

Jira Link: DB-12857

A few scenarios to handle here:

hari90 commented 1 month ago

Solution:

  1. Pause replication on the group.
  2. Wait for all Pollers to receive the update and stop.
     2.1. For the DDL queue table, we need to run all pending DDLs.
  3. Flush the tablet safe times to the table.
  4. Compute the namespace-level safe time.
  5. PITR the database to the safe time.
     5.1. No need to PITR the metadata, since it is logically already at the xCluster safe time.
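
The ordering of the steps above can be sketched as a small in-memory model. This is only an illustration, not YugabyteDB code: `Poller`, `ReplicationGroup`, `failover`, `run_ddl`, and `restore_to` are hypothetical names, safe times are plain integers, and the sketch assumes the namespace-level safe time is the minimum of the flushed per-tablet safe times.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Poller:
    """Hypothetical stand-in for an xCluster poller on one tablet."""
    tablet_id: str
    safe_time: int  # last safe time reached by this tablet (arbitrary units)
    running: bool = True


@dataclass
class ReplicationGroup:
    """Hypothetical in-memory model of one replication group."""
    pollers: List[Poller]
    pending_ddls: List[str] = field(default_factory=list)
    paused: bool = False
    safe_time_table: Dict[str, int] = field(default_factory=dict)


def failover(
    group: ReplicationGroup,
    run_ddl: Callable[[str], None],
    restore_to: Callable[[int], None],
) -> int:
    """Walk through steps 1-5 in order and return the namespace safe time."""
    # Step 1: pause replication on the group.
    group.paused = True

    # Step 2: every poller observes the pause and stops.
    for poller in group.pollers:
        poller.running = False

    # Step 2.1: drain the DDL queue, running pending DDLs in order.
    while group.pending_ddls:
        run_ddl(group.pending_ddls.pop(0))

    # Step 3: flush each tablet's safe time to the table.
    for poller in group.pollers:
        group.safe_time_table[poller.tablet_id] = poller.safe_time

    # Step 4: namespace-level safe time (assumed: minimum across tablets).
    namespace_safe_time = min(group.safe_time_table.values())

    # Step 5: PITR the database (data only, per 5.1) to that safe time.
    restore_to(namespace_safe_time)
    return namespace_safe_time
```

In this model, a group whose tablets report safe times 120, 100, and 150 would be restored to 100, so no tablet is rolled forward past data it has not received.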

Also TODO: For the DDL queue table, send the safe time in every batch (ignore the 250ms buffer).