yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.89k stars 1.06k forks source link

[YSQL][SQLancer] Colocation - Table not found in Raft group exception occurs during testing #11357

Closed qvad closed 1 year ago

qvad commented 2 years ago

Jira Link: DB-1233

Description

During SQLancer tests I've got following FATAL in tserver logs with following message. Couldn't determine SQL scenario for that, looks like it requires concurrent ops.

FATAL log

F0204 05:11:38.387398 31455 tablet.cc:3653] Check failed: _s.ok() Bad status: Not found (yb/tablet/tablet_metadata.cc:349): Table <unknown_table_name> (00004df0000030008000000000004dfa) not found in Raft group 0e273cad399e4446a9712935401122e1

Cut from INFO log.

102342:I0209 23:33:47.560289  8712 tablet_bootstrap.cc:1172] T 791277af61154ed4929c319eb2b1e203 P 12a83ca6c58b432294303929b83b03f6: 
Segment /mnt/d0/yb-data/tserver/wals/table-0000434c000030008000000000000000.colocated.parent.uuid/tablet-791277af61154ed4929c319eb2b1e203/wal-000000001 cannot be used as the first segment to start replay with according to our OpId and retention criteria. 
However, this is already the earliest segment so we have to start replay here. 
We should probably investigate how we got into this situation. op_id=1.1, first_op_time=3236.565s, min_seconds_to_retain_logs=120.000s, *replay_from_this_or_earlier_time=3116.565s, is_first_op_id_low_enough=0, is_first_op_time_early_enough=0

Log files. INFO.log.zip FATAL.log.zip

mbautin commented 2 years ago

op_id=1.1 indicates that the tablet_bootstrap.cc error message is just a false positive.

jasonyb commented 2 years ago

@qvad sent full logs to me (but it looks like it's not exactly the same as shown in the original comment).

Looked for "not found in Raft group" and found

master:

Table <unknown_table_name> (000040050000300080000000000004eb) not found in Raft group 00000000000000000000000000000000

tserver:

Table <unknown_table_name> (0000434c0000300080000000000043da) not found in Raft group 791277af61154ed4929c319eb2b1e203

Since the master one isn't fatal, I ignored it. To investigate the tserver one, I looked for that table id "0000434c0000300080000000000043da". Tserver logs only show that same message, but master logs show

table_id: "0000434c0000300080000000000043da"
I0209 22:14:54.393147 19957 catalog_manager.cc:3401] Successfully created table t0 [id=0000434c0000300080000000000043da] in database10 [id=0000434c000030008000000000000000] per request from 172.151.25.18:34415
indexed_table_id: "0000434c0000300080000000000043da"
  indexed_table_id: "0000434c0000300080000000000043da"
I0209 22:14:54.571432 19924 catalog_manager.cc:3184] CreateTable with IndexInfo is_unique: true indexed_table_id: "0000434c0000300080000000000043da"
I0209 22:14:54.576187 19924 catalog_manager.cc:2124] AddIndexInfoToTable to t0 [id=0000434c0000300080000000000043da]  IndexInfo table_id: "0000434c0000300080000000000043df" is_unique: true indexed_table_id: "0000434c0000300080000000000043da"
I0209 22:14:54.585793 25263 backfill_index.cc:253] t0 [id=0000434c0000300080000000000043da] - Alter table completed version=1, state: RUNNING
I0209 22:14:54.796897 25264 catalog_manager.cc:3469] T 00000000000000000000000000000000 P 6447bd0d8fa147c89591d9d4030acd59: Table transaction succeeded: t0 [id=0000434c0000300080000000000043da]

Before looking at tserver logs, master tells tserver uuid to host ip associations:

I0209 22:05:23.076793 19953 ts_manager.cc:140] Registered new tablet server { permanent_uuid: "39d07ea3c84f4661bbea6c4eea007063" instance_seqno: 1644440425919157 start_time_us: 1644440425919157 } with Master, full list: [{39d07ea3c84f4661bbea6c4eea007063, 0x00000000081be018 -> { permanent_uuid: 39d07ea3c84f4661bbea6c4eea007063 registration: common { private_rpc_addresses { host: "172.151.25.18" port: 9100 } http_addresses { host: "172.151.25.18" port: 9000 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2a" } placement_uuid: "1f0742a4-5651-4b12-8ffb-aa0419872c7a" pg_port: 5433 } capabilities: 2189743739 capabilities: 1427296937 capabilities: 2980225056 placement_id: aws:us-west-2:us-west-2a }}]
...
I0209 22:05:23.096498 19953 ts_manager.cc:140] Registered new tablet server { permanent_uuid: "12a83ca6c58b432294303929b83b03f6" instance_seqno: 0 } with Master, full list: [{12a83ca6c58b432294303929b83b03f6, 0x00000000081bed38 -> { permanent_uuid: 12a83ca6c58b432294303929b83b03f6 registration: common { private_rpc_addresses { host: "172.151.52.24" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2c" } placement_uuid: "1f0742a4-5651-4b12-8ffb-aa0419872c7a" } placement_id: aws:us-west-2:us-west-2c }}, {39d07ea3c84f4661bbea6c4eea007063, 0x00000000081be018 -> { permanent_uuid: 39d07ea3c84f4661bbea6c4eea007063 registration: common { private_rpc_addresses { host: "172.151.25.18" port: 9100 } http_addresses { host: "172.151.25.18" port: 9000 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2a" } placement_uuid: "1f0742a4-5651-4b12-8ffb-aa0419872c7a" pg_port: 5433 } capabilities: 2189743739 capabilities: 1427296937 capabilities: 2980225056 placement_id: aws:us-west-2:us-west-2a }}]
...
I0209 22:05:23.096557 19953 ts_manager.cc:140] Registered new tablet server { permanent_uuid: "f6f4dd6f6fc94534ac0efc8ea48e781f" instance_seqno: 0 } with Master, full list: [{f6f4dd6f6fc94534ac0efc8ea48e781f, 0x00000000081beb58 -> { permanent_uuid: f6f4dd6f6fc94534ac0efc8ea48e781f registration: common { private_rpc_addresses { host: "172.151.37.182" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2b" } placement_uuid: "1f0742a4-5651-4b12-8ffb-aa0419872c7a" } placement_id: aws:us-west-2:us-west-2b }}, {12a83ca6c58b432294303929b83b03f6, 0x00000000081bed38 -> { permanent_uuid: 12a83ca6c58b432294303929b83b03f6 registration: common { private_rpc_addresses { host: "172.151.52.24" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2c" } placement_uuid: "1f0742a4-5651-4b12-8ffb-aa0419872c7a" } placement_id: aws:us-west-2:us-west-2c }}, {39d07ea3c84f4661bbea6c4eea007063, 0x00000000081be018 -> { permanent_uuid: 39d07ea3c84f4661bbea6c4eea007063 registration: common { private_rpc_addresses { host: "172.151.25.18" port: 9100 } http_addresses { host: "172.151.25.18" port: 9000 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2a" } placement_uuid: "1f0742a4-5651-4b12-8ffb-aa0419872c7a" pg_port: 5433 } capabilities: 2189743739 capabilities: 1427296937 capabilities: 2980225056 placement_id: aws:us-west-2:us-west-2a }}]
...
I0209 22:05:23.182243 19926 ts_manager.cc:140] Registered new tablet server { permanent_uuid: "045116cf294e40ffa4591045e494b2cc" instance_seqno: 1644440425547333 start_time_us: 1644440425547333 } with Master, full list: [{045116cf294e40ffa4591045e494b2cc, 0x000000000c0ab4b8 -> { permanent_uuid: 045116cf294e40ffa4591045e494b2cc registration: common { private_rpc_addresses { host: "172.151.22.253" port: 9100 } http_addresses { host: "172.151.22.253" port: 9000 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2a" } placement_uuid: "1f0742a4-5651-4b12-8ffb-aa0419872c7a" pg_port: 5433 } capabilities: 2189743739 capabilities: 1427296937 capabilities: 2980225056 placement_id: aws:us-west-2:us-west-2a }}, {f6f4dd6f6fc94534ac0efc8ea48e781f, 0x00000000081beb58 -> { permanent_uuid: f6f4dd6f6fc94534ac0efc8ea48e781f registration: common { private_rpc_addresses { host: "172.151.37.182" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2b" } placement_uuid: "1f0742a4-5651-4b12-8ffb-aa0419872c7a" } placement_id: aws:us-west-2:us-west-2b }}, {12a83ca6c58b432294303929b83b03f6, 0x00000000081bed38 -> { permanent_uuid: 12a83ca6c58b432294303929b83b03f6 registration: common { private_rpc_addresses { host: "172.151.52.24" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2c" } placement_uuid: "1f0742a4-5651-4b12-8ffb-aa0419872c7a" } placement_id: aws:us-west-2:us-west-2c }}, {39d07ea3c84f4661bbea6c4eea007063, 0x00000000081be018 -> { permanent_uuid: 39d07ea3c84f4661bbea6c4eea007063 registration: common { private_rpc_addresses { host: "172.151.25.18" port: 9100 } http_addresses { host: "172.151.25.18" port: 9000 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2a" } placement_uuid: "1f0742a4-5651-4b12-8ffb-aa0419872c7a" pg_port: 5433 } capabilities: 2189743739 capabilities: 1427296937 capabilities: 2980225056 placement_id: aws:us-west-2:us-west-2a }}]

simplified:

"39d07ea3c84f4661bbea6c4eea007063": "172.151.25.18"
"12a83ca6c58b432294303929b83b03f6": "172.151.52.24"
"f6f4dd6f6fc94534ac0efc8ea48e781f": "172.151.37.182"
"045116cf294e40ffa4591045e494b2cc": "172.151.22.253"

Looking at tserver logs around the create time (22:14:54):

172.151.25.18 (leader):

I0209 22:14:48.217617 19991 reactor.cc:466] TabletServer_R006: DEBUG: Closing idle connection: Connection (0x000000000d1726d8) server 172.151.22.253:40258 => 172.151.25.18:9100 - it has been idle for 65.0019s
W0209 22:14:48.236093 20072 consensus_peers.cc:543] T 1259966f5e86463b99fc74e0fc00a36c P 39d07ea3c84f4661bbea6c4eea007063 -> Peer 12a83ca6c58b432294303929b83b03f6 ([host: "172.151.52.24" port: 9100], []): Couldn't send request.  Status: Timed out (yb/rpc/outbound_call.cc:493): UpdateConsensus RPC (request call id 830321) to 172.151.52.24:9100 timed out after 3.000s. Retrying in the next heartbeat period. Already tried 7 times. State: 2
I0209 22:14:48.309013 20063 replica_state.cc:1292] T 30fb45c84bca48648ee90f909df71064 P 39d07ea3c84f4661bbea6c4eea007063 [term 9 LEADER]: Revoked old leader 12a83ca6c58b432294303929b83b03f6 ht lease: { physical: 1644444866909866 }
I0209 22:14:52.533170 20062 raft_consensus.cc:2825] T e9bc116ed92f44e2b98f2a99348e07d2 P 39d07ea3c84f4661bbea6c4eea007063 [term 3 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 4 because replica is either leader or believes a valid leader to be alive. Time left: 9223367330.543s
I0209 22:14:52.533277 20058 raft_consensus.cc:2825] T 8b6c593f71564ac68f5c4a1f8d19e8ce P 39d07ea3c84f4661bbea6c4eea007063 [term 2 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 9223367330.543s
I0209 22:14:52.755475 20074 raft_consensus.cc:2825] T 6bc187957ea445e5b159fa7d32997f51 P 39d07ea3c84f4661bbea6c4eea007063 [term 1 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 2 because replica is either leader or believes a valid leader to be alive. Time left: 9223367330.321s
I0209 22:14:52.755771 19227 raft_consensus.cc:2825] T a2c8bb4a8ae44385ade6a89a67cc7eae P 39d07ea3c84f4661bbea6c4eea007063 [term 2 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 9223367330.321s
I0209 22:14:52.756230 19555 raft_consensus.cc:2825] T e9454cb5229d4a59a72d62cc393f0e38 P 39d07ea3c84f4661bbea6c4eea007063 [term 5 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 6 because replica is either leader or believes a valid leader to be alive. Time left: 2.788s
I0209 22:14:52.757354 20062 consensus_queue.cc:1181] T 449665936ee6477eb178fd79e41da201 P 39d07ea3c84f4661bbea6c4eea007063 [LEADER]: Connected to new peer: { peer: 12a83ca6c58b432294303929b83b03f6 is_new: 0 last_received: 2.563 next_index: 564 last_known_committed_idx: 563 is_last_exchange_successful: 0 needs_remote_bootstrap: 0 member_type: VOTER num_sst_files: 0 last_applied: 2.563 }
W0209 22:14:53.281682 20059 consensus_peers.cc:543] T 851c6ede57444e82887ba759e42f26ba P 39d07ea3c84f4661bbea6c4eea007063 -> Peer 12a83ca6c58b432294303929b83b03f6 ([host: "172.151.52.24" port: 9100], []): Couldn't send request.  Status: Timed out (yb/rpc/outbound_call.cc:493): UpdateConsensus RPC (request call id 831453) to 172.151.52.24:9100 timed out after 3.000s. Retrying in the next heartbeat period. Already tried 9 times. State: 2
I0209 22:14:54.227458 20075 raft_consensus.cc:2825] T ba65d1fb305640e5aa40e45cb228da41 P 39d07ea3c84f4661bbea6c4eea007063 [term 2 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 9223367328.849s
I0209 22:14:54.386442 20025 pg_create_table.cc:71] PgCreateTable: creating a transactional table: database10.t0/{ database_oid: 17228 object_oid: 17370 }
I0209 22:14:54.450407 20025 table_creator.cc:337] Created table database10.t0 of type PGSQL_TABLE_TYPE
I0209 22:14:54.569795 20008 pg_create_table.cc:71] PgCreateTable: creating a transactional index: database10.t0_c1_key/{ database_oid: 17228 object_oid: 17375 }
I0209 22:14:54.580808 20065 tablet.cc:1831] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063: Alter schema from Schema [
        0:c1[float NOT NULL NOT A PARTITION KEY],
        1:c0[string NOT NULL NOT A PARTITION KEY]
]
properties: contain_counters: false is_transactional: true consistency_level: STRONG use_mangled_column_name: false is_ysql_catalog_table: false retain_delete_markers: false
pgtable_id: 17370 version 0 to Schema [
        0:c1[float NOT NULL NOT A PARTITION KEY],
        1:c0[string NOT NULL NOT A PARTITION KEY]
]
properties: contain_counters: false is_transactional: true consistency_level: STRONG use_mangled_column_name: false is_ysql_catalog_table: false retain_delete_markers: false version 1
I0209 22:14:54.580852 20065 thread_pool.cc:170] Starting thread pool { name: TransactionManager queue_limit: 500 max_workers: 50 }
I0209 22:14:54.581032 20065 meta_data_cache.cc:77] Creating a metadata cache without a permissions cache
I0209 22:14:54.607905 20008 table_creator.cc:337] Created index database10.t0_c1_key of type PGSQL_TABLE_TYPE
I0209 22:14:56.583252 20080 raft_consensus.cc:2825] T 4832270ae48747a2a2fa398815e0e017 P 39d07ea3c84f4661bbea6c4eea007063 [term 1 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 2 because replica is either leader or believes a valid leader to be alive. Time left: 2.818s
I0209 22:14:57.035742 19226 raft_consensus.cc:2825] T b69e470c5f6c4424ba1f289fa0f24144 P 39d07ea3c84f4661bbea6c4eea007063 [term 2 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 9223367326.041s
I0209 22:14:57.154297 20059 raft_consensus.cc:2825] T 5f8d6665077f4d7d8020cd0e0c0aa908 P 39d07ea3c84f4661bbea6c4eea007063 [term 1 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 2 because replica is either leader or believes a valid leader to be alive. Time left: 2.593s
I0209 22:14:57.593768 20076 raft_consensus.cc:2825] T 109429407e87471483a8b41d4c5d6108 P 39d07ea3c84f4661bbea6c4eea007063 [term 1 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 2 because replica is either leader or believes a valid leader to be alive. Time left: 2.664s
W0209 22:14:58.354141 18656 consensus_peers.cc:543] T 575eb5ec773f4c9d80b6f41737ce02a4 P 39d07ea3c84f4661bbea6c4eea007063 -> Peer 12a83ca6c58b432294303929b83b03f6 ([host: "172.151.52.24" port: 9100], []): Couldn't send request.  Status: Illegal state (yb/tserver/service_util.cc:252): Tablet not RUNNING: BOOTSTRAPPING (tablet server error 12) (raft group state error 0). Retrying in the next heartbeat period. Already tried 29 times. State: 2
I0209 22:14:59.752295 19225 raft_consensus.cc:2825] T ea5c96cd671b495e90336cd7a8ace68f P 39d07ea3c84f4661bbea6c4eea007063 [term 2 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 9223367323.324s
I0209 22:15:01.522599 20008 pg_create_table.cc:71] PgCreateTable: creating a transactional table: database10.t1/{ database_oid: 17228 object_oid: 17377 }
I0209 22:15:01.585842 20008 table_creator.cc:337] Created table database10.t1 of type PGSQL_TABLE_TYPE
I0209 22:15:02.795289 20072 consensus_queue.cc:1183] T c353c676cd5847a4a0ed8b45d1f1c6d3 P 39d07ea3c84f4661bbea6c4eea007063 [LEADER]: Got LMP mismatch error from peer: { peer: 12a83ca6c58b432294303929b83b03f6 is_new: 0 last_received: 2.315 next_index: 316 last_known_committed_idx: 315 is_last_exchange_successful: 0 needs_remote_bootstrap: 0 member_type: VOTER num_sst_files: 0 last_applied: 2.315 }
I0209 22:15:02.795408 20072 log_cache.cc:348] T c353c676cd5847a4a0ed8b45d1f1c6d3 P 39d07ea3c84f4661bbea6c4eea007063: Successfully read 24 ops from disk.

172.151.52.24 (peer that FATALs):

W0209 22:14:54.003250 24445 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:45471 => 172.151.52.24:9100 (request call id 835582) took 3001ms (client timeout 3000ms).
W0209 22:14:54.004848 23615 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:33081 => 172.151.52.24:9100 (request call id 835583) took 3001ms (client timeout 3000ms).
W0209 22:14:52.904974 20073 long_operation_tracker.cc:114] UpdateReplica running for 28.442s in thread 23348:
    @     0x7f8dc3ff4b39  __lll_lock_wait
    @     0x7f8dc3fef6e2  __GI___pthread_mutex_lock
    @          0x3763a7c  yb::ThreadPool::DoSubmit()
    @          0x376532a  yb::ThreadPoolToken::SubmitFunc()
    @          0x348f5bd  yb::tablet::Preparer::Submit()
    @          0x34869f5  yb::tablet::OperationDriver::ExecuteAsync()
    @          0x34f1932  yb::tablet::TabletPeer::SetPropagatedSafeTime()
    @          0x3041fc6  yb::consensus::RaftConsensus::UpdateReplica()
    @          0x302a7f7  yb::consensus::RaftConsensus::Update()
    @          0x361ec76  yb::tserver::ConsensusServiceImpl::UpdateConsensus()
    @          0x30845ce  std::__1::__function::__func<>::operator()()
    @          0x3085135  yb::consensus::ConsensusServiceIf::Handle()
    @          0x341faea  yb::rpc::ServicePoolImpl::Handle()
    @          0x33dfc34  yb::rpc::InboundCall::InboundCallTask::Run()
    @          0x342d794  yb::rpc::(anonymous namespace)::Worker::Execute()
    @          0x375c19a  yb::Thread::SuperviseThread()
W0209 22:14:54.014756 24324 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:39100 => 172.151.52.24:9100 (request call id 834923) took 3001ms (client timeout 3000ms).
W0209 22:14:54.016106 27209 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:46422 => 172.151.52.24:9100 (request call id 835587) took 3001ms (client timeout 3000ms).
W0209 22:14:54.028255 22307 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:40620 => 172.151.52.24:9100 (request call id 831636) took 3001ms (client timeout 3000ms).
W0209 22:14:54.048902 21352 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:46690 => 172.151.52.24:9100 (request call id 834931) took 3000ms (client timeout 3000ms).
W0209 22:14:54.109333 21925 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:42895 => 172.151.52.24:9100 (request call id 831655) took 3001ms (client timeout 3000ms).
W0209 22:14:54.110365 21646 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:38586 => 172.151.52.24:9100 (request call id 835606) took 3001ms (client timeout 3000ms).
W0209 22:14:54.121662 22144 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:33081 => 172.151.52.24:9100 (request call id 835611) took 3001ms (client timeout 3000ms).
W0209 22:14:54.128407 24660 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:46422 => 172.151.52.24:9100 (request call id 835614) took 3001ms (client timeout 3000ms).
W0209 22:14:54.131925 23249 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:46217 => 172.151.52.24:9100 (request call id 831664) took 3000ms (client timeout 3000ms).
W0209 22:14:54.145328 23436 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:35476 => 172.151.52.24:9100 (request call id 835619) took 3001ms (client timeout 3000ms).
W0209 22:14:54.154476 24648 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:36655 => 172.151.52.24:9100 (request call id 834952) took 3001ms (client timeout 3000ms).
W0209 22:14:54.158154 23270 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:39100 => 172.151.52.24:9100 (request call id 834954) took 3001ms (client timeout 3000ms).
W0209 22:14:54.166102 22223 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:46690 => 172.151.52.24:9100 (request call id 834958) took 3000ms (client timeout 3000ms).
W0209 22:14:54.214113 21243 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:41700 => 172.151.52.24:9100 (request call id 834973) took 3001ms (client timeout 3000ms).
W0209 22:14:54.219177 23286 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:39971 => 172.151.52.24:9100 (request call id 831678) took 3001ms (client timeout 3000ms).
W0209 22:14:54.226625 19980 thread.cc:724] consensus [worker] (thread pool) Time spent waiting for new thread to publish its TID: real 0.562s    user 0.000s sys 0.010s
W0209 22:14:54.226678 19980 thread.cc:689] consensus [worker] (thread pool) Time spent starting thread: real 0.562s  user 0.000s sys 0.010s
I0209 22:14:54.226716 27263 raft_consensus.cc:1015] T 883ccaddfc624dbd8ac9d799b9f911dc P 12a83ca6c58b432294303929b83b03f6 [term 1 FOLLOWER]: ReportFailDetected: Starting NORMAL_ELECTION...
I0209 22:14:54.226722 27146 raft_consensus.cc:1015] T ba65d1fb305640e5aa40e45cb228da41 P 12a83ca6c58b432294303929b83b03f6 [term 2 FOLLOWER]: ReportFailDetected: Starting NORMAL_ELECTION...
I0209 22:14:54.226765 27146 raft_consensus.cc:615] T ba65d1fb305640e5aa40e45cb228da41 P 12a83ca6c58b432294303929b83b03f6 [term 2 FOLLOWER]: Fail of leader 39d07ea3c84f4661bbea6c4eea007063 detected. Triggering leader pre-election, mode=NORMAL_ELECTION
I0209 22:14:54.226776 27146 raft_consensus.cc:3240] T ba65d1fb305640e5aa40e45cb228da41 P 12a83ca6c58b432294303929b83b03f6 [term 2 FOLLOWER]: Snoozing leader timeout detection for 4.690s
I0209 22:14:54.226794 27146 raft_consensus.cc:659] T ba65d1fb305640e5aa40e45cb228da41 P 12a83ca6c58b432294303929b83b03f6 [term 2 FOLLOWER]: Starting pre-election with config: opid_index: -1 peers { permanent_uuid: "39d07ea3c84f4661bbea6c4eea007063" member_type: VOTER last_known_private_addr { host: "172.151.25.18" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2>
I0209 22:14:54.226852 27146 leader_election.cc:216] T ba65d1fb305640e5aa40e45cb228da41 P 12a83ca6c58b432294303929b83b03f6 [CANDIDATE]: Term 3 pre-election: Requesting vote from peer 39d07ea3c84f4661bbea6c4eea007063
I0209 22:14:54.226876 27146 leader_election.cc:216] T ba65d1fb305640e5aa40e45cb228da41 P 12a83ca6c58b432294303929b83b03f6 [CANDIDATE]: Term 3 pre-election: Requesting vote from peer f6f4dd6f6fc94534ac0efc8ea48e781f
I0209 22:14:54.227316 24056 leader_election.cc:381] T ba65d1fb305640e5aa40e45cb228da41 P 12a83ca6c58b432294303929b83b03f6 [CANDIDATE]: Term 3 pre-election: Vote denied by peer f6f4dd6f6fc94534ac0efc8ea48e781f. Message: Invalid argument (yb/consensus/raft_consensus.cc:2826): T ba65d1fb305640e5aa40e45cb228da41 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 2 FOLLOWER]:  Leader pre-election vote request: Denying vote to candid>
I0209 22:14:54.228027 24679 leader_election.cc:381] T ba65d1fb305640e5aa40e45cb228da41 P 12a83ca6c58b432294303929b83b03f6 [CANDIDATE]: Term 3 pre-election: Vote denied by peer 39d07ea3c84f4661bbea6c4eea007063. Message: Invalid argument (yb/consensus/raft_consensus.cc:2826): T ba65d1fb305640e5aa40e45cb228da41 P 39d07ea3c84f4661bbea6c4eea007063 [term 2 LEADER]:  Leader pre-election vote request: Denying vote to candidat>
I0209 22:14:54.228044 24679 leader_election.cc:242] T ba65d1fb305640e5aa40e45cb228da41 P 12a83ca6c58b432294303929b83b03f6 [CANDIDATE]: Term 3 pre-election: Election decided. Result: candidate lost.
W0209 22:14:54.254370 27216 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:40620 => 172.151.52.24:9100 (request call id 831685) took 3001ms (client timeout 3000ms).
W0209 22:14:54.272616 22849 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:36655 => 172.151.52.24:9100 (request call id 834984) took 3001ms (client timeout 3000ms).
W0209 22:14:54.286763 24246 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:42074 => 172.151.52.24:9100 (request call id 834990) took 3000ms (client timeout 3000ms).
W0209 22:14:54.309036 22783 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:41429 => 172.151.52.24:9100 (request call id 835654) took 3001ms (client timeout 3000ms).
W0209 22:14:54.325409 21977 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:35476 => 172.151.52.24:9100 (request call id 835657) took 3001ms (client timeout 3000ms).
W0209 22:14:54.346801 22750 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:42463 => 172.151.52.24:9100 (request call id 835658) took 3001ms (client timeout 3000ms).
W0209 22:14:54.371865 22869 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:44818 => 172.151.52.24:9100 (request call id 835005) took 3000ms (client timeout 3000ms).
W0209 22:14:54.373895 21659 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:39988 => 172.151.52.24:9100 (request call id 835661) took 3000ms (client timeout 3000ms).
W0209 22:14:54.391957 22955 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:33081 => 172.151.52.24:9100 (request call id 835670) took 3001ms (client timeout 3000ms).
W0209 22:14:54.392990 22656 thread.cc:724] prepare [worker] (thread pool) Time spent waiting for new thread to publish its TID: real 0.559s  user 0.008s sys 0.013s
W0209 22:14:54.393044 22656 thread.cc:689] prepare [worker] (thread pool) Time spent starting thread: real 0.559s    user 0.008s sys 0.013s
W0209 22:14:54.393061 22656 long_operation_tracker.cc:143] UpdateReplica took a long time: 29.822s
W0209 22:14:54.393080 22656 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:33081 => 172.151.52.24:9100 (request call id 830292) took 29821ms (client timeout 3000ms).
W0209 22:14:54.394973 22093 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:40620 => 172.151.52.24:9100 (request call id 831717) took 3001ms (client timeout 3000ms).
W0209 22:14:54.414122 22630 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:42074 => 172.151.52.24:9100 (request call id 835012) took 3001ms (client timeout 3000ms).
W0209 22:14:54.415622 27188 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:41429 => 172.151.52.24:9100 (request call id 835677) took 3001ms (client timeout 3000ms).
W0209 22:14:54.418732 21634 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:46690 => 172.151.52.24:9100 (request call id 835014) took 3000ms (client timeout 3000ms).
W0209 22:14:54.451280 26277 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:41700 => 172.151.52.24:9100 (request call id 835019) took 3001ms (client timeout 3000ms).
W0209 22:14:54.487380 24827 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:45471 => 172.151.52.24:9100 (request call id 835692) took 3001ms (client timeout 3000ms).
W0209 22:14:54.507808 23395 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:33081 => 172.151.52.24:9100 (request call id 835696) took 3001ms (client timeout 3000ms).
W0209 22:14:54.542686 23073 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:35476 => 172.151.52.24:9100 (request call id 835703) took 3001ms (client timeout 3000ms).
W0209 22:14:54.550653 24084 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:42463 => 172.151.52.24:9100 (request call id 835704) took 3001ms (client timeout 3000ms).
W0209 22:14:54.706116 22826 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:44453 => 172.151.52.24:9100 (request call id 831787) took 3000ms (client timeout 3000ms).
W0209 22:14:54.706672 26168 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:35476 => 172.151.52.24:9100 (request call id 835743) took 3001ms (client timeout 3000ms).
W0209 22:14:54.726809 21442 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:43913 => 172.151.52.24:9100 (request call id 831792) took 3001ms (client timeout 3000ms).
W0209 22:14:54.730549 22205 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:40630 => 172.151.52.24:9100 (request call id 835083) took 3001ms (client timeout 3000ms).
W0209 22:14:54.778400 19980 thread.cc:724] consensus [worker] (thread pool) Time spent waiting for new thread to publish its TID: real 0.552s    user 0.001s sys 0.009s
W0209 22:14:54.778445 19980 thread.cc:689] consensus [worker] (thread pool) Time spent starting thread: real 0.552s  user 0.001s sys 0.009s
W0209 22:14:54.786382 21723 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:36655 => 172.151.52.24:9100 (request call id 835092) took 3001ms (client timeout 3000ms).
W0209 22:14:54.792790 21695 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:39100 => 172.151.52.24:9100 (request call id 835095) took 3000ms (client timeout 3000ms).
W0209 22:14:54.015985 20073 long_operation_tracker.cc:114] UpdateReplica running for 29.445s in thread 22656:
    @     0x7f8dc44beffc  __GI_nanosleep
    @          0x375b5bb  yb::Thread::StartThread()
    @          0x375e1de  yb::ThreadPool::CreateThreadUnlocked()
    @          0x3763b56  yb::ThreadPool::DoSubmit()
    @          0x376532a  yb::ThreadPoolToken::SubmitFunc()
    @          0x348f5bd  yb::tablet::Preparer::Submit()
    @          0x34869f5  yb::tablet::OperationDriver::ExecuteAsync()
    @          0x34f1932  yb::tablet::TabletPeer::SetPropagatedSafeTime()
    @          0x3041fc6  yb::consensus::RaftConsensus::UpdateReplica()
    @          0x302a7f7  yb::consensus::RaftConsensus::Update()
    @          0x361ec76  yb::tserver::ConsensusServiceImpl::UpdateConsensus()
    @          0x30845ce  std::__1::__function::__func<>::operator()()
    @          0x3085135  yb::consensus::ConsensusServiceIf::Handle()
    @          0x341faea  yb::rpc::ServicePoolImpl::Handle()
    @          0x33dfc34  yb::rpc::InboundCall::InboundCallTask::Run()
    @          0x342d794  yb::rpc::(anonymous namespace)::Worker::Execute()
W0209 22:14:54.879803 21338 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:42463 => 172.151.52.24:9100 (request call id 835778) took 3001ms (client timeout 3000ms).
W0209 22:14:54.930444 22343 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:40620 => 172.151.52.24:9100 (request call id 831827) took 3001ms (client timeout 3000ms).
W0209 22:14:54.951898 23219 thread.cc:724] prepare [worker] (thread pool) Time spent waiting for new thread to publish its TID: real 0.559s  user 0.002s sys 0.019s
W0209 22:14:54.951956 23219 thread.cc:689] prepare [worker] (thread pool) Time spent starting thread: real 0.559s    user 0.002s sys 0.019s
W0209 22:14:54.951973 23219 long_operation_tracker.cc:143] UpdateReplica took a long time: 30.371s
W0209 22:14:54.951992 23219 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:41429 => 172.151.52.24:9100 (request call id 830294) took 30370ms (client timeout 3000ms).
W0209 22:14:55.004535 23603 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:44453 => 172.151.52.24:9100 (request call id 831851) took 3001ms (client timeout 3000ms).
W0209 22:14:55.100965 23100 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:42895 => 172.151.52.24:9100 (request call id 831874) took 3000ms (client timeout 3000ms).
W0209 22:14:55.169600 24337 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:39971 => 172.151.52.24:9100 (request call id 831883) took 3001ms (client timeout 3000ms).
W0209 22:14:55.189536 21548 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:38586 => 172.151.52.24:9100 (request call id 835840) took 3001ms (client timeout 3000ms).
W0209 22:14:55.202695 24693 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:41700 => 172.151.52.24:9100 (request call id 835175) took 3001ms (client timeout 3000ms).
W0209 22:14:55.205461 21408 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:45471 => 172.151.52.24:9100 (request call id 835842) took 3001ms (client timeout 3000ms).
W0209 22:14:55.214540 27268 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:33081 => 172.151.52.24:9100 (request call id 835847) took 3001ms (client timeout 3000ms).
W0209 22:14:55.310096 23546 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:42895 => 172.151.52.24:9100 (request call id 831915) took 3001ms (client timeout 3000ms).
W0209 22:14:55.319890 19980 thread.cc:724] consensus [worker] (thread pool) Time spent waiting for new thread to publish its TID: real 0.541s    user 0.001s sys 0.010s
W0209 22:14:55.319923 19980 thread.cc:689] consensus [worker] (thread pool) Time spent starting thread: real 0.541s  user 0.001s sys 0.010s
W0209 22:14:55.402393 21434 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:44766 => 172.151.52.24:9100 (request call id 831933) took 3001ms (client timeout 3000ms).
W0209 22:14:55.459204 24714 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:41429 => 172.151.52.24:9100 (request call id 835898) took 3001ms (client timeout 3000ms).
W0209 22:14:55.518659 22645 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:46217 => 172.151.52.24:9100 (request call id 831963) took 3001ms (client timeout 3000ms).
W0209 22:14:55.524289 26201 thread.cc:724] prepare [worker] (thread pool) Time spent waiting for new thread to publish its TID: real 0.572s  user 0.003s sys 0.019s
W0209 22:14:55.524327 26201 thread.cc:689] prepare [worker] (thread pool) Time spent starting thread: real 0.572s    user 0.003s sys 0.019s
W0209 22:14:55.524341 26201 long_operation_tracker.cc:143] UpdateReplica took a long time: 30.941s
W0209 22:14:55.524359 26201 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:35476 => 172.151.52.24:9100 (request call id 830296) took 30940ms (client timeout 3000ms).
W0209 22:14:55.537264 27155 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:33081 => 172.151.52.24:9100 (request call id 835918) took 3001ms (client timeout 3000ms).
W0209 22:14:55.544761 21483 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:46690 => 172.151.52.24:9100 (request call id 835238) took 3001ms (client timeout 3000ms).
W0209 22:14:55.580829 23229 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:40630 => 172.151.52.24:9100 (request call id 835250) took 3000ms (client timeout 3000ms).
W0209 22:14:55.583854 24493 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:40546 => 172.151.52.24:9100 (request call id 831975) took 3001ms (client timeout 3000ms).
W0209 22:14:55.680920 23210 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:39971 => 172.151.52.24:9100 (request call id 831992) took 3000ms (client timeout 3000ms).
W0209 22:14:55.696316 22313 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:43913 => 172.151.52.24:9100 (request call id 831997) took 3001ms (client timeout 3000ms).
W0209 22:14:55.758811 27137 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:42019 => 172.151.52.24:9100 (request call id 835370) took 3001ms (client timeout 3000ms).
W0209 22:14:55.799962 21967 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:42463 => 172.151.52.24:9100 (request call id 835976) took 3001ms (client timeout 3000ms).
W0209 22:14:55.805313 21679 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:39988 => 172.151.52.24:9100 (request call id 835978) took 3001ms (client timeout 3000ms).
W0209 22:14:55.810595 23368 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:44818 => 172.151.52.24:9100 (request call id 835378) took 3001ms (client timeout 3000ms).
W0209 22:14:55.853189 23611 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:45471 => 172.151.52.24:9100 (request call id 835983) took 3001ms (client timeout 3000ms).
W0209 22:14:55.891692 19980 thread.cc:724] consensus [worker] (thread pool) Time spent waiting for new thread to publish its TID: real 0.572s    user 0.000s sys 0.010s
W0209 22:14:55.891721 19980 thread.cc:689] consensus [worker] (thread pool) Time spent starting thread: real 0.572s  user 0.000s sys 0.010s
W0209 22:14:55.907517 24654 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:36655 => 172.151.52.24:9100 (request call id 835399) took 3001ms (client timeout 3000ms).
W0209 22:14:55.923353 23515 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:39100 => 172.151.52.24:9100 (request call id 835403) took 3001ms (client timeout 3000ms).
W0209 22:14:55.942385 24837 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:42895 => 172.151.52.24:9100 (request call id 832063) took 3001ms (client timeout 3000ms).
W0209 22:14:55.961730 24038 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:42463 => 172.151.52.24:9100 (request call id 836006) took 3000ms (client timeout 3000ms).
W0209 22:14:55.970737 21947 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:44766 => 172.151.52.24:9100 (request call id 832074) took 3000ms (client timeout 3000ms).
W0209 22:14:55.998471 23528 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:40546 => 172.151.52.24:9100 (request call id 832082) took 3001ms (client timeout 3000ms).
W0209 22:14:56.066412 22043 thread.cc:724] prepare [worker] (thread pool) Time spent waiting for new thread to publish its TID: real 0.542s  user 0.007s sys 0.014s
W0209 22:14:56.066450 22043 thread.cc:689] prepare [worker] (thread pool) Time spent starting thread: real 0.542s    user 0.007s sys 0.014s
W0209 22:14:56.066470 22043 long_operation_tracker.cc:143] UpdateReplica took a long time: 31.483s
I0209 22:14:56.066475 27125 raft_consensus.cc:615] T f3521e85b8bb4cc0aa1be2b8d83bf002 P 12a83ca6c58b432294303929b83b03f6 [term 5 FOLLOWER]: Fail of leader 045116cf294e40ffa4591045e494b2cc detected. Triggering leader pre-election, mode=NORMAL_ELECTION
I0209 22:14:56.066514 27125 raft_consensus.cc:3240] T f3521e85b8bb4cc0aa1be2b8d83bf002 P 12a83ca6c58b432294303929b83b03f6 [term 5 FOLLOWER]: Snoozing leader timeout detection for 3.334s
I0209 22:14:56.066530 27125 raft_consensus.cc:659] T f3521e85b8bb4cc0aa1be2b8d83bf002 P 12a83ca6c58b432294303929b83b03f6 [term 5 FOLLOWER]: Starting pre-election with config: opid_index: -1 peers { permanent_uuid: "045116cf294e40ffa4591045e494b2cc" member_type: VOTER last_known_private_addr { host: "172.151.22.253" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west->
I0209 22:14:56.066591 27125 leader_election.cc:216] T f3521e85b8bb4cc0aa1be2b8d83bf002 P 12a83ca6c58b432294303929b83b03f6 [CANDIDATE]: Term 6 pre-election: Requesting vote from peer 045116cf294e40ffa4591045e494b2cc
I0209 22:14:56.066614 27125 leader_election.cc:216] T f3521e85b8bb4cc0aa1be2b8d83bf002 P 12a83ca6c58b432294303929b83b03f6 [CANDIDATE]: Term 6 pre-election: Requesting vote from peer f6f4dd6f6fc94534ac0efc8ea48e781f
I0209 22:14:56.067034 24761 leader_election.cc:381] T f3521e85b8bb4cc0aa1be2b8d83bf002 P 12a83ca6c58b432294303929b83b03f6 [CANDIDATE]: Term 6 pre-election: Vote denied by peer f6f4dd6f6fc94534ac0efc8ea48e781f. Message: Invalid argument (yb/consensus/raft_consensus.cc:2826): T f3521e85b8bb4cc0aa1be2b8d83bf002 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 5 FOLLOWER]:  Leader pre-election vote request: Denying vote to candid>
I0209 22:14:56.067703 21326 leader_election.cc:381] T f3521e85b8bb4cc0aa1be2b8d83bf002 P 12a83ca6c58b432294303929b83b03f6 [CANDIDATE]: Term 6 pre-election: Vote denied by peer 045116cf294e40ffa4591045e494b2cc. Message: Invalid argument (yb/consensus/raft_consensus.cc:2826): T f3521e85b8bb4cc0aa1be2b8d83bf002 P 045116cf294e40ffa4591045e494b2cc [term 5 LEADER]:  Leader pre-election vote request: Denying vote to candidat>
I0209 22:14:56.067734 21326 leader_election.cc:242] T f3521e85b8bb4cc0aa1be2b8d83bf002 P 12a83ca6c58b432294303929b83b03f6 [CANDIDATE]: Term 6 pre-election: Election decided. Result: candidate lost.
W0209 22:14:56.066490 22043 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:42463 => 172.151.52.24:9100 (request call id 830297) took 31482ms (client timeout 3000ms).
W0209 22:14:56.082513 24719 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:41700 => 172.151.52.24:9100 (request call id 835433) took 3001ms (client timeout 3000ms).
W0209 22:14:56.145970 21309 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:42463 => 172.151.52.24:9100 (request call id 836049) took 3001ms (client timeout 3000ms).
W0209 22:14:56.176609 23274 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:45471 => 172.151.52.24:9100 (request call id 836058) took 3001ms (client timeout 3000ms).
W0209 22:14:56.301926 22635 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.22.253:38586 => 172.151.52.24:9100 (request call id 836082) took 3000ms (client timeout 3000ms).
W0209 22:14:56.302778 24280 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.37.182:41700 => 172.151.52.24:9100 (request call id 835490) took 3001ms (client timeout 3000ms).
W0209 22:14:56.304690 24500 yb_rpc.cc:392] Call yb.consensus.ConsensusService.UpdateConsensus 172.151.25.18:44453 => 172.151.52.24:9100 (request call id 832139) took 3001ms (client timeout 3000ms).
W0209 22:14:54.800132 20073 long_operation_tracker.cc:114] UpdateReplica running for 30.226s in thread 23903:
    @     0x7f8dc3ff4b39  __lll_lock_wait
    @     0x7f8dc3fef6e2  __GI___pthread_mutex_lock
    @          0x3763a7c  yb::ThreadPool::DoSubmit()
    @          0x376532a  yb::ThreadPoolToken::SubmitFunc()
    @          0x348f5bd  yb::tablet::Preparer::Submit()
    @          0x34869f5  yb::tablet::OperationDriver::ExecuteAsync()
    @          0x34f1932  yb::tablet::TabletPeer::SetPropagatedSafeTime()
    @          0x3041fc6  yb::consensus::RaftConsensus::UpdateReplica()
    @          0x302a7f7  yb::consensus::RaftConsensus::Update()
    @          0x361ec76  yb::tserver::ConsensusServiceImpl::UpdateConsensus()
    @          0x30845ce  std::__1::__function::__func<>::operator()()
    @          0x3085135  yb::consensus::ConsensusServiceIf::Handle()
    @          0x341faea  yb::rpc::ServicePoolImpl::Handle()
    @          0x33dfc34  yb::rpc::InboundCall::InboundCallTask::Run()
    @          0x342d794  yb::rpc::(anonymous namespace)::Worker::Execute()
    @          0x375c19a  yb::Thread::SuperviseThread()

172.151.37.182 (peer):

I0209 22:14:51.051560 20070 raft_consensus.cc:2825] T e72dbb882d2f492ab5b159894ac9783a P f6f4dd6f6fc94534ac0efc8ea48e781f [term 1 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 2 because replica is either leader or believes a valid leader to be alive. Time left: 9223367332.018s
I0209 22:14:52.532610 20074 replica_state.cc:1292] T 283b7349f75c4cfd908886fbd0cbafca P f6f4dd6f6fc94534ac0efc8ea48e781f [term 5 LEADER]: Revoked old leader 12a83ca6c58b432294303929b83b03f6 ht lease: { physical: 1644444867118598 }
I0209 22:14:52.532824 19289 raft_consensus.cc:2825] T e9bc116ed92f44e2b98f2a99348e07d2 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 3 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 4 because replica is either leader or believes a valid leader to be alive. Time left: 2.903s
I0209 22:14:52.532891 20071 raft_consensus.cc:2825] T 8b6c593f71564ac68f5c4a1f8d19e8ce P f6f4dd6f6fc94534ac0efc8ea48e781f [term 2 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 2.783s
I0209 22:14:52.753959 19293 raft_consensus.cc:2825] T ba4b2fca9751476c98b9aadcd0e0eb98 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 1 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 2 because replica is either leader or believes a valid leader to be alive. Time left: 9223367330.315s
I0209 22:14:52.755465 20072 raft_consensus.cc:2825] T 6bc187957ea445e5b159fa7d32997f51 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 1 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 2 because replica is either leader or believes a valid leader to be alive. Time left: 2.811s
I0209 22:14:52.755466 20064 raft_consensus.cc:2825] T 2418a3301a5544a89dfce16c34a6d16e P f6f4dd6f6fc94534ac0efc8ea48e781f [term 7 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 8 because replica is either leader or believes a valid leader to be alive. Time left: 2.483s
I0209 22:14:52.755475 20057 raft_consensus.cc:2825] T a2c8bb4a8ae44385ade6a89a67cc7eae P f6f4dd6f6fc94534ac0efc8ea48e781f [term 2 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 2.546s
I0209 22:14:52.755704 19295 raft_consensus.cc:2825] T c6c4e9844ebc48c98e9fd3eac502ff8b P f6f4dd6f6fc94534ac0efc8ea48e781f [term 2 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 2.978s
I0209 22:14:52.755951 19283 raft_consensus.cc:2825] T e9454cb5229d4a59a72d62cc393f0e38 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 5 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 6 because replica is either leader or believes a valid leader to be alive. Time left: 9223367330.313s
W0209 22:14:54.126225 19295 consensus_peers.cc:543] T a1a566686d5c4f9696227c01f43d1eed P f6f4dd6f6fc94534ac0efc8ea48e781f -> Peer 12a83ca6c58b432294303929b83b03f6 ([host: "172.151.52.24" port: 9100], []): Couldn't send request.  Status: Timed out (yb/rpc/outbound_call.cc:493): UpdateConsensus RPC (request call id 834946) to 172.151.52.24:9100 timed out after 3.000s. Retrying in the next heartbeat period. Already tried 1 times. State: 2
I0209 22:14:54.227099 19295 raft_consensus.cc:2825] T ba65d1fb305640e5aa40e45cb228da41 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 2 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 2.920s
I0209 22:14:54.581224 19295 tablet.cc:1831] T 791277af61154ed4929c319eb2b1e203 P f6f4dd6f6fc94534ac0efc8ea48e781f: Alter schema from Schema [
        0:c1[float NOT NULL NOT A PARTITION KEY],
        1:c0[string NOT NULL NOT A PARTITION KEY]
]
properties: contain_counters: false is_transactional: true consistency_level: STRONG use_mangled_column_name: false is_ysql_catalog_table: false retain_delete_markers: false
pgtable_id: 17370 version 0 to Schema [
        0:c1[float NOT NULL NOT A PARTITION KEY],
        1:c0[string NOT NULL NOT A PARTITION KEY]
]
properties: contain_counters: false is_transactional: true consistency_level: STRONG use_mangled_column_name: false is_ysql_catalog_table: false retain_delete_markers: false version 1
I0209 22:14:54.581279 19295 thread_pool.cc:170] Starting thread pool { name: TransactionManager queue_limit: 500 max_workers: 50 }
I0209 22:14:54.581463 19295 meta_data_cache.cc:77] Creating a metadata cache without a permissions cache
I0209 22:14:56.066828 20070 raft_consensus.cc:2825] T f3521e85b8bb4cc0aa1be2b8d83bf002 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 5 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 6 because replica is either leader or believes a valid leader to be alive. Time left: 2.746s
I0209 22:14:56.582834 20062 raft_consensus.cc:2825] T 4832270ae48747a2a2fa398815e0e017 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 1 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 2 because replica is either leader or believes a valid leader to be alive. Time left: 9223367326.486s
I0209 22:14:57.035347 19292 raft_consensus.cc:2825] T b69e470c5f6c4424ba1f289fa0f24144 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 2 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 2.823s
I0209 22:14:57.153939 20071 raft_consensus.cc:2825] T 5f8d6665077f4d7d8020cd0e0c0aa908 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 1 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 2 because replica is either leader or believes a valid leader to be alive. Time left: 9223367325.915s
I0209 22:14:57.593396 19290 raft_consensus.cc:2825] T 109429407e87471483a8b41d4c5d6108 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 1 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 2 because replica is either leader or believes a valid leader to be alive. Time left: 9223367325.476s
W0209 22:14:59.276396 20075 consensus_peers.cc:543] T 6495e72ab08b457b8dc3dd1980713b0c P f6f4dd6f6fc94534ac0efc8ea48e781f -> Peer 12a83ca6c58b432294303929b83b03f6 ([host: "172.151.52.24" port: 9100], []): Couldn't send request.  Status: Timed out (yb/rpc/outbound_call.cc:493): UpdateConsensus RPC (request call id 836169) to 172.151.52.24:9100 timed out after 3.000s. Retrying in the next heartbeat period. Already tried 2 times. State: 2
I0209 22:14:59.751998 20068 raft_consensus.cc:2825] T ea5c96cd671b495e90336cd7a8ace68f P f6f4dd6f6fc94534ac0efc8ea48e781f [term 2 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 2.968s
I0209 22:15:02.245222 20071 raft_consensus.cc:2825] T 7dd606077c2a431a8417cb79e36cd801 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 3 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 4 because replica is either leader or believes a valid leader to be alive. Time left: 2.604s
I0209 22:15:04.159500 20062 raft_consensus.cc:2825] T 3a315f267ee5487c857aef39237b3d59 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 2 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 2.627s
I0209 22:15:04.159600 19280 raft_consensus.cc:2825] T ed7a1ff6652645f0a524863bd5449b29 P f6f4dd6f6fc94534ac0efc8ea48e781f [term 2 FOLLOWER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 2.939s

Back to master, looked into database10:

I0209 22:05:22.757627 18588 catalog_loaders.cc:314] Loaded metadata for namespace database10 [id=0000434c000030008000000000000000]

Looked into that database id:

I0209 22:05:22.730885 18588 catalog_loaders.cc:254] Loaded metadata for tablet 791277af61154ed4929c319eb2b1e203 (first table 0000434c000030008000000000000000.colocated.parent.tablename [id=0000434c000030008000000000000000.colocated.parent.uuid])

Looked into tablet 791277af61154ed4929c319eb2b1e203:

I0209 22:05:23.318949 19953 catalog_entity_info.cc:114] T 791277af61154ed4929c319eb2b1e203: Leader changed from <NULL> to 0x00000000081be018 -> { permanent_uuid: 39d07ea3c84f4661bbea6c4eea007063 registration: common { private_rpc_addresses { host: "172.151.25.18" port: 9100 } http_addresses { host: "172.151.25.18" port: 9000 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2a" } placement_uuid: "1f0742a4-5651-4b12-8ffb-aa0419872c7a" pg_port: 5433 } capabilities: 2189743739 capabilities: 1427296937 capabilities: 2980225056 placement_id: aws:us-west-2:us-west-2a }
I0209 22:05:23.318964 19953 catalog_manager.cc:6001] Tablet: 791277af61154ed4929c319eb2b1e203 reported consensus state change. New consensus state: current_term: 2 leader_uuid: "39d07ea3c84f4661bbea6c4eea007063" config { opid_index: -1 peers { permanent_uuid: "39d07ea3c84f4661bbea6c4eea007063" member_type: VOTER last_known_private_addr { host: "172.151.25.18" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2a" } } peers { permanent_uuid: "12a83ca6c58b432294303929b83b03f6" member_type: VOTER last_known_private_addr { host: "172.151.52.24" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2c" } } peers { permanent_uuid: "f6f4dd6f6fc94534ac0efc8ea48e781f" member_type: VOTER last_known_private_addr { host: "172.151.37.182" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2b" } } } from 39d07ea3c84f4661bbea6c4eea007063
...
I0209 22:05:23.476043 19908 catalog_manager.cc:6011] Peer f6f4dd6f6fc94534ac0efc8ea48e781f sent full tablet report for 791277af61154ed4929c319eb2b1e203, prev state op id: -1, prev state term: 2, prev state has_leader_uuid: 1. Consensus state: current_term: 2 leader_uuid: "39d07ea3c84f4661bbea6c4eea007063" config { opid_index: -1 peers { permanent_uuid: "39d07ea3c84f4661bbea6c4eea007063" member_type: VOTER last_known_private_addr { host: "172.151.25.18" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2a" } } peers { permanent_uuid: "12a83ca6c58b432294303929b83b03f6" member_type: VOTER last_known_private_addr { host: "172.151.52.24" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2c" } } peers { permanent_uuid: "f6f4dd6f6fc94534ac0efc8ea48e781f" member_type: VOTER last_known_private_addr { host: "172.151.37.182" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2b" } } }
I0209 22:05:38.319118 19951 catalog_manager.cc:6011] Peer 12a83ca6c58b432294303929b83b03f6 sent full tablet report for 791277af61154ed4929c319eb2b1e203, prev state op id: -1, prev state term: 2, prev state has_leader_uuid: 1. Consensus state: current_term: 2 leader_uuid: "39d07ea3c84f4661bbea6c4eea007063" config { opid_index: -1 peers { permanent_uuid: "39d07ea3c84f4661bbea6c4eea007063" member_type: VOTER last_known_private_addr { host: "172.151.25.18" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2a" } } peers { permanent_uuid: "12a83ca6c58b432294303929b83b03f6" member_type: VOTER last_known_private_addr { host: "172.151.52.24" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2c" } } peers { permanent_uuid: "f6f4dd6f6fc94534ac0efc8ea48e781f" member_type: VOTER last_known_private_addr { host: "172.151.37.182" port: 9100 } cloud_info { placement_cloud: "aws" placement_region: "us-west-2" placement_zone: "us-west-2b" } } }

Looked for that same tablet id in the logs of the tserver where the leader was at:

I0209 22:05:25.021890 20059 raft_consensus.cc:2825] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063 [term 2 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 9223367898.054s
I0209 22:06:45.053979 20074 raft_consensus.cc:2825] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063 [term 2 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 9223367818.022s
I0209 22:08:45.470374 18656 raft_consensus.cc:2825] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063 [term 2 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 9223367697.606s
I0209 22:14:45.922820 20075 raft_consensus.cc:2825] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063 [term 2 LEADER]:  Leader pre-election vote request: Denying vote to candidate 12a83ca6c58b432294303929b83b03f6 for term 3 because replica is either leader or believes a valid leader to be alive. Time left: 9223367337.153s
I0209 22:14:54.580808 20065 tablet.cc:1831] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063: Alter schema from Schema [
I0209 22:15:05.042937 19926 consensus_queue.cc:1183] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063 [LEADER]: Got LMP mismatch error from peer: { peer: 12a83ca6c58b432294303929b83b03f6 is_new: 0 last_received: 2.2 next_index: 3 last_known_committed_idx: 2 is_last_exchange_successful: 0 needs_remote_bootstrap: 0 member_type: VOTER num_sst_files: 0 last_applied: 2.2 }
I0209 22:19:02.571055 18656 consensus_queue.cc:1183] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063 [LEADER]: Got LMP mismatch error from peer: { peer: 12a83ca6c58b432294303929b83b03f6 is_new: 0 last_received: 2.2 next_index: 3 last_known_committed_idx: 2 is_last_exchange_successful: 0 needs_remote_bootstrap: 0 member_type: VOTER num_sst_files: 0 last_applied: 2.2 }
I0209 22:19:02.571146 18656 log_cache.cc:348] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063: Successfully read 4 ops from disk.
I0209 22:19:04.678282  6228 log_cache.cc:348] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063: Successfully read 1 ops from disk.
I0209 22:20:40.041798 20059 consensus_queue.cc:1183] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063 [LEADER]: Got LMP mismatch error from peer: { peer: 12a83ca6c58b432294303929b83b03f6 is_new: 0 last_received: 2.2 next_index: 3 last_known_committed_idx: 2 is_last_exchange_successful: 0 needs_remote_bootstrap: 0 member_type: VOTER num_sst_files: 0 last_applied: 2.2 }
I0209 22:20:40.041877 20059 log_cache.cc:348] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063: Successfully read 4 ops from disk.
I0209 22:20:41.556700  6230 log_cache.cc:348] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063: Successfully read 1 ops from disk.
W0209 22:22:34.943086 19928 consensus_peers.cc:543] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063 -> Peer 12a83ca6c58b432294303929b83b03f6 ([host: "172.151.52.24" port: 9100], []): Couldn't send request.  Status: Illegal state (yb/tserver/service_util.cc:252): Tablet not RUNNING: NOT_STARTED (tablet server error 12) (raft group state error 5). Retrying in the next heartbeat period. Already tried 179 times. State: 2
I0209 22:22:37.523238 20074 consensus_queue.cc:1183] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063 [LEADER]: Got LMP mismatch error from peer: { peer: 12a83ca6c58b432294303929b83b03f6 is_new: 0 last_received: 2.2 next_index: 3 last_known_committed_idx: 2 is_last_exchange_successful: 0 needs_remote_bootstrap: 0 member_type: VOTER num_sst_files: 0 last_applied: 2.2 }
I0209 22:22:37.523330 20074 log_cache.cc:348] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063: Successfully read 4 ops from disk.
I0209 22:22:39.131155  6228 log_cache.cc:348] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063: Successfully read 1 ops from disk.
I0209 22:25:13.576788 19226 consensus_queue.cc:1183] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063 [LEADER]: Got LMP mismatch error from peer: { peer: 12a83ca6c58b432294303929b83b03f6 is_new: 0 last_received: 2.2 next_index: 3 last_known_committed_idx: 2 is_last_exchange_successful: 0 needs_remote_bootstrap: 0 member_type: VOTER num_sst_files: 0 last_applied: 2.2 }
I0209 22:25:13.576862 19226 log_cache.cc:348] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063: Successfully read 4 ops from disk.
I0209 22:25:15.441973 20061 log_cache.cc:348] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063: Successfully read 1 ops from disk.
W0209 22:26:23.518038 20076 consensus_peers.cc:543] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063 -> Peer 12a83ca6c58b432294303929b83b03f6 ([host: "172.151.52.24" port: 9100], []): Couldn't send request.  Status: Timed out (yb/rpc/outbound_call.cc:493): UpdateConsensus RPC (request call id 1047591) to 172.151.52.24:9100 timed out after 3.000s. Retrying in the next heartbeat period. Already tried 133 times. State: 2
I0209 22:28:42.367475 19226 consensus_queue.cc:1183] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063 [LEADER]: Got LMP mismatch error from peer: { peer: 12a83ca6c58b432294303929b83b03f6 is_new: 0 last_received: 2.2 next_index: 3 last_known_committed_idx: 2 is_last_exchange_successful: 0 needs_remote_bootstrap: 0 member_type: VOTER num_sst_files: 0 last_applied: 2.2 }
I0209 22:28:42.367605 19226 log_cache.cc:348] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063: Successfully read 4 ops from disk.
I0209 22:28:45.673841  6228 log_cache.cc:348] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063: Successfully read 1 ops from disk.
W0209 22:29:48.200246 20074 consensus_peers.cc:543] T 791277af61154ed4929c319eb2b1e203 P 39d07ea3c84f4661bbea6c4eea007063 -> Peer 12a83ca6c58b432294303929b83b03f6 ([host: "172.151.52.24" port: 9100], []): Couldn't send request.  Status: Network error (yb/util/net/socket.cc:540): recvmsg error: Connection refused (system error 111). Retrying in the next heartbeat period. Already tried 69 times. State: 2

These LMP mismatch errors coincide with the FATALs on the peer:

F0209 22:15:05.043612 23160 tablet.cc:3662] Check failed: _s.ok() Bad status: Not found (yb/tablet/tablet_metadata.cc:349): Table <unknown_table_name> (0000434c0000300080000000000043da) not found in Raft group 791277af61154ed4929c319eb2b1e203
F0209 22:19:02.571805 28003 tablet.cc:3662] Check failed: _s.ok() Bad status: Not found (yb/tablet/tablet_metadata.cc:349): Table <unknown_table_name> (0000434c0000300080000000000043da) not found in Raft group 791277af61154ed4929c319eb2b1e203
F0209 22:20:40.042645 29352 tablet.cc:3662] Check failed: _s.ok() Bad status: Not found (yb/tablet/tablet_metadata.cc:349): Table <unknown_table_name> (0000434c0000300080000000000043da) not found in Raft group 791277af61154ed4929c319eb2b1e203
F0209 22:22:37.524107 30158 tablet.cc:3662] Check failed: _s.ok() Bad status: Not found (yb/tablet/tablet_metadata.cc:349): Table <unknown_table_name> (0000434c0000300080000000000043da) not found in Raft group 791277af61154ed4929c319eb2b1e203
F0209 22:25:13.577569 30981 tablet.cc:3662] Check failed: _s.ok() Bad status: Not found (yb/tablet/tablet_metadata.cc:349): Table <unknown_table_name> (0000434c0000300080000000000043da) not found in Raft group 791277af61154ed4929c319eb2b1e203
F0209 22:29:17.564538  1112 tablet.cc:3662] Check failed: _s.ok() Bad status: Not found (yb/tablet/tablet_metadata.cc:349): Table <unknown_table_name> (0000434c0000300080000000000043da) not found in Raft group 791277af61154ed4929c319eb2b1e203

To summarize, it looks like the peer 172.151.52.24 was bad before CREATE TABLE (logs keep showing update consensus client timeout), and since then, it FATALs with table not found in raft group while the leader tserver says "Got LMP mismatch error from peer". Didn't look further than that.

Huqicheng commented 1 year ago

Fixed by https://github.com/yugabyte/yugabyte-db/commit/a5455f33bdbde7162a8585d6c20ca2335ca03329