yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.66k stars 1.04k forks source link

[DocDB][PITR] Duplicate schema version: 3 [{3, { columns: [{ id: 1 ..}] tserver error during PITR #21196

Open shishir2001-yb opened 4 months ago

shishir2001-yb commented 4 months ago

Jira Link: DB-10127

Description

Tried on version: 2.21.1.0-b124

Logs: https://drive.google.com/file/d/15ZRoz1VHWiKI4fbhMVnQsj0NzJVHwmQM/view?usp=sharing Encountered the following Tserver error while running PITR in parallel with Cross-DB-DDLs sample app. Find the whole error in logs.

E0223 06:42:19.956969 35822 schema_packing.cc:423] Duplicate schema version: 3,
 [{3, { columns: [{ id: 1 num_varlen_columns_before: 0 offset_after_prev_varlen_column: 0 size: 0 nullable: 1 data_type: STRING },
{ id: 2 num_varlen_columns_before: 1 offset_after_prev_varlen_column: 0 size: 0 nullable: 1 data_type: INT32 },
 { id: 3 num_varlen_columns_before: 2 offset_after_prev_varlen_column: 0 size: 0 nullable: 1 data_type: INT64 },
 { id: 4 num_varlen_columns_before: 3 offset_after_prev_varlen_column: 0 size: 0 nullable: 1 data_type: BINARY },
 { id: 5 num_varlen_columns_before: 4 offset_after_prev_varlen_column: 0 size: 0 nullable: 1 data_type: INT64 },
 { id: 6 num_varlen_columns_before: 5 offset_after_prev_varlen_column: 0 size: 0 nullable: 1 data_type: BOOL },
 { id: 7 
.
.
.

Test details

1. Start the cross DB DDL workload which will execute DDLs and DMLs across databases concurrently (20 colocated database and 20 non-colocated database), run this for 20-30 mins
2. Schedule point-in-time recovery (PITR) for 10 random databases.
3. Create a backup for one random database.
4. Start and stop the workload after 10 minutes.
5. Note the time for the first PITR.
6. Keep the workload running.
7. Perform another PITR at a random time while the workload continues. ---> Issue occurs here
tserverGFlags: [
               {"name": "ysql_enable_packed_row", "value": "true"},
               {"name": "ysql_enable_packed_row_for_colocated_table", "value": "true"},
               {"name": "enable_automatic_tablet_splitting", "value": "true"}, 
               {"name": "ysql_max_connections", "value": "500"},
               {"name": "client_read_write_timeout_ms", "value": "1800000"},
               {"name": "yb_client_admin_operation_timeout_sec", "value": "1800"}, 
               {"name": "consistent_restore", "value": "true"},
               {"name": "ysql_enable_db_catalog_version_mode", "value": "true"}
],
master_gflags=[
               {"name": "ysql_enable_packed_row", "value": "true"},
               {"name": "ysql_enable_packed_row_for_colocated_table", "value": "true"},
               {"name": "enable_automatic_tablet_splitting", "value": "true"},
               {"name": "tablet_split_high_phase_shard_count_per_node", "value": 20000},
               {"name": "tablet_split_high_phase_size_threshold_bytes", "value": 2097152}, 
               {"name": "tablet_split_low_phase_size_threshold_bytes", "value": 102400},
               {"name": "tablet_split_low_phase_shard_count_per_node", "value": 10000},
               {"name": "consistent_restore", "value": "true"},
               {"name": "ysql_enable_db_catalog_version_mode", "value": "true"}, 
              {"name": "allowed_preview_flags_csv", "value": "ysql_enable_db_catalog_version_mode"}, 
]

List of DDLs executed in sample app

private static List<List<String>> ddlList = List.of(
            List.of("CREATE INDEX idx1 ON ? (k)", "DROP INDEX idx1"),
            List.of("CREATE TABLE tempTable1 AS SELECT * FROM ? limit 1000000", "ALTER TABLE tempTable1 RENAME TO tempTable1_new", "DROP TABLE tempTable1_new"),
            List.of("CREATE MATERIALIZED VIEW mv1 as SELECT k from ? limit 10000", "REFRESH MATERIALIZED VIEW mv1", "DROP MATERIALIZED VIEW mv1"),
            List.of("ALTER TABLE ? ADD newColumn1 TEXT DEFAULT 'dummyString'", "ALTER TABLE ? DROP newColumn1"),
            List.of("ALTER TABLE ? ADD newColumn2 TEXT NULL", "ALTER TABLE ? DROP newColumn2"),
            List.of("CREATE VIEW view1_? AS SELECT k from ?", "DROP VIEW view1_?"),
            List.of("ALTER TABLE ? ADD newColumn3 TEXT DEFAULT 'dummyString'", "ALTER TABLE ? ALTER newColumn3 TYPE VARCHAR(1000)", "ALTER TABLE ? DROP newColumn3"),
            List.of("CREATE TABLE tempTable2 AS SELECT * FROM ? limit 1000000", "CREATE INDEX idx2 ON tempTable2(k)", "ALTER TABLE ? ADD newColumn4 TEXT DEFAULT 'dummyString'", "ALTER TABLE tempTable2 ADD newColumn2 TEXT DEFAULT 'dummyString'", "TRUNCATE table ? cascade", "ALTER TABLE ? DROP newColumn4", "ALTER TABLE tempTable2 DROP newColumn2", "DROP INDEX idx2", "DROP TABLE tempTable2"),
            List.of("CREATE VIEW view2_? AS SELECT k from ?", "CREATE MATERIALIZED VIEW mv2 as SELECT k from ? limit 10000", "REFRESH MATERIALIZED VIEW mv2", "DROP MATERIALIZED VIEW mv2", "DROP VIEW view2_?")
 );

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

druzac commented 1 month ago

Issue seems to be racing DDL and PITR. Two schema bump requests are in flight.

In this run, the schema version mutation from the drop index DDL wanted to set the schema version to 10. The schema version mutation from the restore wanted to set the schema version to 9. The restore mutation was processed after the drop index mutation on the tservers, which left them in the state of having all schemas - 8, 9, and 10 - but with their current schema version set to 9. On a tablet report, master noticed the discrepancy - its own schema for the table was 10 - and it sent an alter table. Tservers printed this error message because they did have a schema for version 10, it just wasn't their current schema.

# restore touching tablet 3e220a7081da43f9acd970fa26ffd2ad. schema version for this table is initially 8.
I0401 18:27:06.619601 31368 master_snapshot_coordinator.cc:2095] Creating a new restoration entry with id ff4fe147-9cc9-48f7-9981-a4b88938db59
I0401 18:27:06.621199 33107 master_snapshot_coordinator.cc:584] Restore sys catalog from snapshot: { id: 0c6e334b-5944-41b4-afc4-fa48d1020058 snapshot_hybrid_time: { physical: 1711995401844906 } schedule_id: fdccc0ae-e3ce-4e1f-856a-8a23618beaea previous_snapshot_hybrid_time: { physical: 1711993631997185 } version: 3 initial_state: CREATING tablets: [{ id: 08fb93825a8e481a8d0eb7a175c28390 state: 2 last_error: OK running: 0 }, { id: ef25f698c9174f19a9063c253209280e state: 2 last_error: OK running: 0 }, { id: 2f6d373059fb4d40bbbd320e4853bcf8 state: 2 last_error: OK running: 0 }, { id: ab8dabb20ba64f15b83829ff4ce55bb9 state: 2 last_error: OK running: 0 }, { id: 864c71e225224603b6919eea175d9ff8 state: 2 last_error: OK running: 0 }, { id: 3e220a7081da43f9acd970fa26ffd2ad state: 2 last_error: OK running: 0 }, { id: 5b96194bb5fe4fdab4ac26fd0a7f3f6b state: 2 last_error: OK running: 0 }, { id: c3b47eeabcbf4aaaac6cb6475abbdd70 state: 2 last_error: OK running: 0 }, { id: 0c9212f0ba0645ce96bf4bfd3c1fd7fa state: 2 last_error: OK running: 0 }] }, schedule: { id: fdccc0ae-e3ce-4e1f-856a-8a23618beaea options: filter { tables { tables { table_name: "" namespace { id: "0000405a000030008000000000000000" name: "postgres_90" database_type: YQL_DATABASE_PGSQL } } } } interval_sec: 12000 retention_duration_sec: 60000 restoration_times: 7012333183402876928 } at { physical: 1711994529957599 }, op id: 1.665832
I0401 18:27:08.060552 33107 master_snapshot_coordinator.cc:1205] PITR: ff4fe147-9cc9-48f7-9981-a4b88938db59, tablets to restore: [c3b47eeabcbf4aaaac6cb6475abbdd70, ab8dabb20ba64f15b83829ff4ce55bb9, 3e220a7081da43f9acd970fa26ffd2ad, 0c9212f0ba0645ce96bf4bfd3c1fd7fa, ef25f698c9174f19a9063c253209280e, 864c71e225224603b6919eea175d9ff8, 2f6d373059fb4d40bbbd320e4853bcf8, 08fb93825a8e481a8d0eb7a175c28390, 5b96194bb5fe4fdab4ac26fd0a7f3f6b]

# this updates schema version to 9 on master.
I0401 18:27:08.325865 177726 restore_sys_catalog_state.cc:409] PITR: Patching the schema version for table 0000405a00003000800000000000426c. Existing version 8, restoring version 7

# this drop index DDL bumps the schema version on the master leader. so it is now 10.
I0401 18:27:10.307303 350355 catalog_manager.cc:6077] MarkIndexInfoFromTableForDeletion table 0000405a00003000800000000000426c index 0000405a00003000800000000000428f multi_stage=0
I0401 18:27:10.307324 350355 catalog_manager.cc:6112] DeleteIndexInfoFromTable table 0000405a00003000800000000000426c index 0000405a00003000800000000000428f
I0401 18:27:10.309399 350361 async_rpc_tasks.cc:201] Alter Table RPC for tablet 0x00002467659c0600 -> ab8dabb20ba64f15b83829ff4ce55bb9 (table tb_0 [id=0000405a00003000800000000000426c]) (tb_0 [id=0000405a00003000800000000000426c]) (task=0x0000246756f64558, state=kWaiting): Unable to pick replica: Not found (yb/master/catalog_entity_info.cc:222): No leader found for tablet ab8dabb20ba64f15b83829ff4ce55bb9 (table tb_0 [id=0000405a00003000800000000000426c]) with 0 replicas: [].

# this alter schema just bumps version from 8 to 10. successful, no version collision
I0401 18:27:10.311276 266648 tablet.cc:2350] T 3e220a7081da43f9acd970fa26ffd2ad P 7e53919270d141569de17c96ece87561: Alter schema from Schema... version 8 to ... version 10
I0401 18:27:10.311378 266648 doc_read_context.cc:48] T 3e220a7081da43f9acd970fa26ffd2ad P 7e53919270d141569de17c96ece87561: DocReadContext, copy and add: [0, 1, 2, 3, 4, 5, 6, 7, 8, 10], added: 10

# tablet leader starts restoring 3e220a7081da43f9acd970fa26ffd2ad. during this tablet restore, schema version is set to 9 as that is set in the tablet restore operation.
I0401 18:27:10.362481 43986 backup_service.cc:70] Processing TabletSnapshotOp for tablet 3e220a7081da43f9acd970fa26ffd2ad from 172.151.22.165:35403: RESTORE_ON_TABLET
I0401 18:27:10.365592 204524 version_set.cc:2966] T 3e220a7081da43f9acd970fa26ffd2ad P 7e53919270d141569de17c96ece87561: Recovered from manifest file:/mnt/d0/yb-data/tserver/data/rocksdb/table-0000405a00003000800000000000426c/tablet-3e220a7081da43f9acd970fa26ffd2ad/MANIFEST-000011 succeeded,manifest_file_number is 11, next_file_number is 13, last_sequence is 1125899906842809, log_number is 0,prev_log_number is 0,max_column_family is 0, flushed_values is 0x000006871c9ece40 -> Global filter: <invalid>, Cotables filter: []{ op_id: 1.112 hybrid_time: { physical: 1711995269632341 } history_cutoff: { cotables cutoff: <invalid>, primary cutoff: <invalid> } max_value_level_ttl_expiration_time: <invalid> primary_schema_version: 8 cotable_schema_versions: [] }
I0401 18:27:10.374156 204524 doc_read_context.cc:48] T 3e220a7081da43f9acd970fa26ffd2ad P 7e53919270d141569de17c96ece87561: DocReadContext, copy and add: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], added: 9
I0401 18:27:10.374197 204524 tablet_snapshots.cc:441] T 3e220a7081da43f9acd970fa26ffd2ad P 7e53919270d141569de17c96ece87561: Merging metadata with restored: /mnt/d0/yb-data/tserver/data/rocksdb/table-0000405a00003000800000000000426c/tablet-3e220a7081da43f9acd970fa26ffd2ad.snapshots/0c6e334b-5944-41b4-afc4-fa48d1020058/tablet.metadata , force overwrite of schema packing 0
I0401 18:27:10.374764 204524 doc_read_context.cc:78] T 3e220a7081da43f9acd970fa26ffd2ad P 7e53919270d141569de17c96ece87561: LogAfterMerge: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], overwrite: 0
I0401 18:27:10.391292 204524 tablet_snapshots.cc:464] T 3e220a7081da43f9acd970fa26ffd2ad P 7e53919270d141569de17c96ece87561: Checkpoint restored from /mnt/d0/yb-data/tserver/data/rocksdb/table-0000405a00003000800000000000426c/tablet-3e220a7081da43f9acd970fa26ffd2ad.snapshots/0c6e334b-5944-41b4-afc4-fa48d1020058
I0401 18:27:10.391300 204524 tablet_snapshots.cc:465] T 3e220a7081da43f9acd970fa26ffd2ad P 7e53919270d141569de17c96ece87561: Re-enabling compactions
I0401 18:27:10.409065 33000 version_set.cc:2966] T 3e220a7081da43f9acd970fa26ffd2ad P fc707a69eb1f48339771e125e821d7ec: Recovered from manifest file:/mnt/d0/yb-data/tserver/data/rocksdb/table-0000405a00003000800000000000426c/tablet-3e220a7081da43f9acd970fa26ffd2ad/MANIFEST-000011 succeeded,manifest_file_number is 11, next_file_number is 13, last_sequence is 1125899906842809, log_number is 0,prev_log_number is 0,max_column_family is 0, flushed_values is 0x000014f87913b680 -> Global filter: <invalid>, Cotables filter: []{ op_id: 1.112 hybrid_time: { physical: 1711995269632341 } history_cutoff: { cotables cutoff: <invalid>, primary cutoff: <invalid> } max_value_level_ttl_expiration_time: <invalid> primary_schema_version: 8 cotable_schema_versions: [] }
I0401 18:27:10.420894 33000 doc_read_context.cc:48] T 3e220a7081da43f9acd970fa26ffd2ad P fc707a69eb1f48339771e125e821d7ec: DocReadContext, copy and add: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], added: 9
I0401 18:27:10.428366 33000 version_set.cc:2966] T 3e220a7081da43f9acd970fa26ffd2ad P fc707a69eb1f48339771e125e821d7ec [R]: Recovered from manifest file:/mnt/d0/yb-data/tserver/data/rocksdb/table-0000405a00003000800000000000426c/tablet-3e220a7081da43f9acd970fa26ffd2ad/MANIFEST-000013 succeeded,manifest_file_number is 13, next_file_number is 15, last_sequence is 1125899906842809, log_number is 0,prev_log_number is 0,max_column_family is 0, flushed_values is 0x000014f88f597b00 -> Global filter: <invalid>, Cotables filter: []{ op_id: 1.118 hybrid_time: { physical: 1711996030362598 } history_cutoff: { cotables cutoff: <invalid>, primary cutoff: <invalid> } max_value_level_ttl_expiration_time: <invalid> primary_schema_version: <nullopt> cotable_schema_versions: [] }
I0401 18:27:10.442688 33000 tablet_snapshots.cc:464] T 3e220a7081da43f9acd970fa26ffd2ad P fc707a69eb1f48339771e125e821d7ec: Checkpoint restored from /mnt/d0/yb-data/tserver/data/rocksdb/table-0000405a00003000800000000000426c/tablet-3e220a7081da43f9acd970fa26ffd2ad.snapshots/0c6e334b-5944-41b4-afc4-fa48d1020058
I0401 18:27:10.442695 33000 tablet_snapshots.cc:465] T 3e220a7081da43f9acd970fa26ffd2ad P fc707a69eb1f48339771e125e821d7ec: Re-enabling compactions

# restore finishes
I0401 18:27:10.406605 350359 state_with_tablets.cc:161] Restoration[ff4fe147-9cc9-48f7-9981-a4b88938db59/0c6e334b-5944-41b4-afc4-fa48d1020058/fdccc0ae-e3ce-4e1f-856a-8a23618beaea]: Finished RESTORING at 3e220a7081da43f9acd970fa26ffd2ad, 9 was running
I0401 18:27:10.406663 350359 catalog_manager_ext.cc:3109] Handling Restore Tablet Snapshot Response for tablet 3e220a7081da43f9acd970fa26ffd2ad (table tb_0 [id=0000405a00003000800000000000426c])  OK

# later the tservers send full tablet reports to master.
I0401 18:28:09.660087 31997 heartbeater.cc:411] P 7e53919270d141569de17c96ece87561: Sending a full tablet report to master...
# version is 9 instead of master's 10, so master sends an alterschema request
I0401 18:28:09.899121 33053 catalog_manager.cc:8242] TS 7e53919270d141569de17c96ece87561 does not have the latest schema for tablet 3e220a7081da43f9acd970fa26ffd2ad (table tb_0 [id=0000405a00003000800000000000426c]). Expected version 10 got 9
# alter schema request tells tservers to add schema version 10, which they already have.
E0401 18:28:09.903576 266648 schema_packing.cc:423] Duplicate schema version: 10,