Open wzw88486969 opened 3 years ago
[05/23/21 05:16:20-346 +08:00][9-31][info][log.cpp:26][OnLogMessage] [raft_log] Replicator=3315714752521 is going to quit [05/23/21 05:16:22-353 +08:00][9-31][info][log.cpp:26][OnLogMessage] [raft_log] Replicator=6605659701249 is going to quit, group config_group [05/23/21 05:16:22-354 +08:00][9-31][info][log.cpp:26][OnLogMessage] [raft_log] Replicator=6605659701249 is going to quit
这几行很奇怪,中间这一行日志是收到term更高的主之后才打印的,但是leader step down的时候,两个replicator是一起stop的。这意味着,6605659701249 这个 replicator 至少过了 2s 都没执行 stop。不太像网络问题,需要看下 bthread worker usage 当时是否非常高,导致没有足够的线程来运行 bthread
通过大盘看当时这个pod cpu使用率正常,不太好想那里出问题了
leader: [05/23/21 05:16:17-554 +08:00][9-18][info][raft_server.cpp:205][set] RaftServer::set [05/23/21 05:16:17-558 +08:00][9-31][info][raft_server.cpp:362][on_apply] start calling on_apply [05/23/21 05:16:17-558 +08:00][9-31][info][raft_server.cpp:366][on_apply] foreach propose [05/23/21 05:16:17-558 +08:00][9-31][info][raft_server.cpp:373][on_apply] calling propose iter.done [05/23/21 05:16:17-562 +08:00][9-31][info][raft_server.cpp:410][on_apply] end calling on_apply [05/23/21 05:16:20-029 +08:00][9-31][warning][log.cpp:32][OnLogMessage] [raft_log] Group config_group fail to issue RPC to edge-cluster-2:18100:0 _consecutive_error_times=1, [E1008]Reached timeout=150ms @..189:18100 [05/23/21 05:16:20-046 +08:00][9-18][warning][log.cpp:32][OnLogMessage] [raft_log] Group config_group fail to issue RPC to edge-cluster-0:18100:0 _consecutive_error_times=1, [E1008]Reached timeout=150ms @..178:18100 [05/23/21 05:16:20-335 +08:00][9-31][warning][log.cpp:32][OnLogMessage] [raft_log] node config_group:edge-cluster-1:18100:0 term 3 steps down when alive nodes don't satisfy quorum dead_nodes: edge-cluster-0:18100:0,edge-cluster-2:18100:0 conf: edge-cluster-0:18100:0,edge-cluster-1:18100:0,edge-cluster-2:18100:0 [05/23/21 05:16:20-339 +08:00][9-13][info][raft_server.cpp:347][on_leader_stop] node stepped down from leader:Majority of the group dies [05/23/21 05:16:20-344 +08:00][9-31][info][log.cpp:26][OnLogMessage] [raft_log] Group config_group Replicator=3315714752521 is going to quit [05/23/21 05:16:20-346 +08:00][9-31][info][log.cpp:26][OnLogMessage] [raft_log] Replicator=3315714752521 is going to quit [05/23/21 05:16:22-353 +08:00][9-31][info][log.cpp:26][OnLogMessage] [raft_log] Replicator=6605659701249 is going to quit, group config_group [05/23/21 05:16:22-354 +08:00][9-31][info][log.cpp:26][OnLogMessage] [raft_log] Replicator=6605659701249 is going to quit
follow: [05/23/21 05:16:17-572 +08:00][9-13][info][raft_server.cpp:362][on_apply] start calling on_apply [05/23/21 05:16:17-573 +08:00][9-13][info][raft_server.cpp:366][on_apply] foreach propose [05/23/21 05:16:17-573 +08:00][9-13][info][raft_server.cpp:398][on_apply] propose iter.done is null [05/23/21 05:16:17-576 +08:00][9-13][info][raft_server.cpp:410][on_apply] end calling on_apply [05/23/21 05:16:22-302 +08:00][9-32][info][raft_server.cpp:577][on_stop_following] on_stop_following leader ip:edge-cluster-1:18100:0 [05/23/21 05:16:22-303 +08:00][9-18][info][log.cpp:26][OnLogMessage] [raft_log] node config_group:edge-cluster-0:18100:0 term 3 start pre_vote [05/23/21 05:16:22-316 +08:00][9-12][info][log.cpp:26][OnLogMessage] [raft_log] node config_group:edge-cluster-0:18100:0 received PreVoteResponse from edge-cluster-2:18100:0 term 3 granted 1 [05/23/21 05:16:22-318 +08:00][9-12][info][log.cpp:26][OnLogMessage] [raft_log] node config_group:edge-cluster-0:18100:0 term 3 start vote and grant vote self [05/23/21 05:16:22-328 +08:00][9-12][info][log.cpp:26][OnLogMessage] [raft_log] Saved single stable meta, path ./data/raft_meta term 4 votedfor edge-cluster-0:18100:0 time: 1999 [05/23/21 05:16:22-342 +08:00][9-13][info][log.cpp:26][OnLogMessage] [raft_log] node config_group:edge-cluster-0:18100:0 received RequestVoteResponse from edge-cluster-2:18100:0 term 4 granted 1 [05/23/21 05:16:22-344 +08:00][9-13][info][log.cpp:26][OnLogMessage] [raft_log] node config_group:edge-cluster-0:18100:0 term 4 become leader of group edge-cluster-0:18100:0,edge-cluster-1:18100:0,edge-cluster-2:18100:0 [05/23/21 05:16:22-349 +08:00][9-13][info][log.cpp:26][OnLogMessage] [raft_log] brpc::policy::DomainNamingService("edge-cluster-1:18100"): added 1 [05/23/21 05:16:22-350 +08:00][9-13][info][log.cpp:26][OnLogMessage] [raft_log] Replicator=3199750640941@edge-cluster-1:18100:0 is started, group config_group [05/23/21 05:16:22-381 +08:00][9-13][info][log.cpp:26][OnLogMessage] [raft_log] brpc::policy::DomainNamingService("edge-cluster-2:18100"): added 1 [05/23/21 05:16:22-384 +08:00][9-13][info][log.cpp:26][OnLogMessage] [raft_log] Replicator=15801184687801@edge-cluster-2:18100:0 is started, group config_group [05/23/21 05:16:22-420 +08:00][9-31][info][raft_server.cpp:550][on_configuration_committed] peers:edge-cluster-1:18100:0 [05/23/21 05:16:22-422 +08:00][9-31][info][raft_server.cpp:550][on_configuration_committed] peers:edge-cluster-2:18100:0 [05/23/21 05:16:22-425 +08:00][9-31][info][log.cpp:26][OnLogMessage] [raft_log] node config_group:edge-cluster-0:18100:0 reset ConfigurationCtx, new_peers: edge-cluster-0:18100:0,edge-cluster-1:18100:0,edge-cluster-2:18100:0, old_peers: edge-cluster-0:18100:0,edge-cluster-1:18100:0,edge-cluster-2:18100:0 [05/23/21 05:16:22-433 +08:00][9-31][info][raft_server.cpp:331][on_leader_start] node becomes leader
follow: [05/23/21 05:16:20-470 +08:00][10-15][info][log.cpp:26][OnLogMessage] [raft_log] node config_group:edge-cluster-2:18100:0 starts to do snapshot [05/23/21 05:16:20-472 +08:00][10-19][info][raft_server.cpp:456][on_snapshot_save] start calling on_snapshot_save [05/23/21 05:16:20-473 +08:00][10-19][info][raft_server.cpp:419][save_snapshot] start calling save_snapshot [05/23/21 05:16:20-473 +08:00][10-30][info][raft_server.cpp:476][on_snapshot_save] end calling on_snapshot_save [05/23/21 05:16:20-478 +08:00][10-19][info][raft_server.cpp:448][save_snapshot] end calling save_snapshot [05/23/21 05:16:20-481 +08:00][10-19][info][log.cpp:26][OnLogMessage] [raft_log] Deleting ./data/snapshot/snapshot_00000000000000822753 [05/23/21 05:16:20-481 +08:00][10-19][info][log.cpp:26][OnLogMessage] [raft_log] Renaming ./data/snapshot/temp to ./data/snapshot/snapshot_00000000000000822753 [05/23/21 05:16:20-482 +08:00][10-19][info][log.cpp:26][OnLogMessage] [raft_log] Deleting ./data/snapshot/snapshot_00000000000000822744 [05/23/21 05:16:20-483 +08:00][10-19][info][log.cpp:26][OnLogMessage] [raft_log] node config_group:edge-cluster-2:18100:0 snapshot_save_done, last_included_index=822753 last_included_term=3 [05/23/21 05:16:20-485 +08:00][10-19][info][log.cpp:26][OnLogMessage] [raft_log] log save_meta ./data/log/log_meta first_log_index: 822745 time: 1514 [05/23/21 05:16:22-314 +08:00][10-19][info][log.cpp:26][OnLogMessage] [raft_log] node config_group:edge-cluster-2:18100:0 received PreVote from edge-cluster-0:18100:0 in term 4 current_term 3 granted 1 [05/23/21 05:16:22-328 +08:00][10-14][info][log.cpp:26][OnLogMessage] [raft_log] node config_group:edge-cluster-2:18100:0 received RequestVote from edge-cluster-0:18100:0 in term 4 current_term 3 [05/23/21 05:16:22-330 +08:00][10-19][info][raft_server.cpp:577][on_stop_following] on_stop_following leader ip:edge-cluster-1:18100:0 [05/23/21 05:16:22-336 +08:00][10-14][info][log.cpp:26][OnLogMessage] [raft_log] Saved single stable meta, path ./data/raft_meta term 4 votedfor :0:0 time: 3548 [05/23/21 05:16:22-339 +08:00][10-14][info][log.cpp:26][OnLogMessage] [raft_log] Saved single stable meta, path ./data/raft_meta term 4 votedfor edge-cluster-0:18100:0 time: 1882 [05/23/21 05:16:22-351 +08:00][10-30][warning][log.cpp:32][OnLogMessage] [raft_log] node config_group:edge-cluster-2:18100:0 ignore stale AppendEntries from edge-cluster-1:18100:0 in term 3 current_term 4 [05/23/21 05:16:22-394 +08:00][10-14][info][raft_server.cpp:567][on_start_following] on_start_following leader ip:edge-cluster-0:18100:0 [05/23/21 05:16:22-422 +08:00][10-19][info][raft_server.cpp:550][on_configuration_committed] peers:edge-cluster-0:18100:0 [05/23/21 05:16:22-423 +08:00][10-19][info][raft_server.cpp:550][on_configuration_committed] peers:edge-cluster-1:18100:0