vesoft-inc / nebula

A distributed, fast open-source graph database featuring horizontal scalability and high availability
https://nebula-graph.io
Apache License 2.0

【Storage OFFLINE】Storage is offline after restarting all storage services simultaneously #5611

Open mo-avatar opened 1 year ago

mo-avatar commented 1 year ago

Env: linux x86_64
nebula version: 3.2.1
problem:
+-----------------+------+-----------+-----------+--------------+---------+
| Host            | Port | Status    | Role      | Git Info Sha | Version |
+-----------------+------+-----------+-----------+--------------+---------+
| "xxx"           | 9779 | "OFFLINE" | "STORAGE" | "xxx"        | "3.2.1" |
| "xxx"           | 9779 | "OFFLINE" | "STORAGE" | "xxx"        | "3.2.1" |
| "xxx"           | 9779 | "OFFLINE" | "STORAGE" | "xxx"        | "3.2.1" |
+-----------------+------+-----------+-----------+--------------+---------+

nebula storage log:

20230627 16:27:28.941357 1122219 RaftPart.cpp:1427] [Port: 9780, Space: 709, Part: 34] AskForVoteRequest has been sent to all peers, waiting for responses
I20230627 16:27:28.941382 1122219 RaftPart.cpp:1355] [Port: 9780, Space: 709, Part: 34] Did not get enough votes from election of term 30, isPreVote = 1
I20230627 16:27:28.941486 1122307 RaftPart.cpp:1389] [Port: 9780, Space: 767, Part: 31] Sending out an election request (space = 767, part = 31, term = 38, lastLogId = 190538, lastLogTerm = 29, candidateIP = xx.xx.xx.142, candidatePort = 9780), isPreVote = 1
I20230627 16:27:28.941498 1122307 RaftPart.cpp:1411] [Port: 9780, Space: 767, Part: 31] Sending AskForVoteRequest to [Port: 9780, Space: 767, Part: 31] [Host: xx.xx.xx.117:9780]
I20230627 16:27:28.941505 1122307 RaftPart.cpp:1411] [Port: 9780, Space: 767, Part: 31] Sending AskForVoteRequest to [Port: 9780, Space: 767, Part: 31] [Host: xx.xx.xx.136:9780]
I20230627 16:27:30.207652 1122260 RaftPart.cpp:1427] [Port: 9780, Space: 767, Part: 31] AskForVoteRequest has been sent to all peers, waiting for responses
I20230627 16:27:30.207690 1122260 RaftPart.cpp:1355] [Port: 9780, Space: 767, Part: 31] Did not get enough votes from election of term 38, isPreVote = 1
I20230627 16:27:30.207732 1122307 RaftPart.cpp:1389] [Port: 9780, Space: 772, Part: 2] Sending out an election request (space = 772, part = 2, term = 37, lastLogId = 200598, lastLogTerm = 32, candidateIP = xx.xx.xx.142, candidatePort = 9780), isPreVote = 1
I20230627 16:27:30.207751 1122307 RaftPart.cpp:1411] [Port: 9780, Space: 772, Part: 2] Sending AskForVoteRequest to [Port: 9780, Space: 772, Part: 2] [Host: xx.xx.xx.117:9780]
I20230627 16:27:30.207759 1122307 RaftPart.cpp:1411] [Port: 9780, Space: 772, Part: 2] Sending AskForVoteRequest to [Port: 9780, Space: 772, Part: 2] [Host: xx.xx.xx.136:9780]
I20230627 16:27:31.734575 1122219 RaftPart.cpp:1427] [Port: 9780, Space: 772, Part: 2] AskForVoteRequest has been sent to all peers, waiting for responses
I20230627 16:27:31.734617 1122219 RaftPart.cpp:1355] [Port: 9780, Space: 772, Part: 2] Did not get enough votes from election of term 37, isPreVote = 1
I20230627 16:27:31.734673 1122307 RaftPart.cpp:1389] [Port: 9780, Space: 714, Part: 6] Sending out an election request (space = 714, part = 6, term = 61, lastLogId = 193807, lastLogTerm = 48, candidateIP = xx.xx.xx.142, candidatePort = 9780), isPreVote = 1
I20230627 16:27:31.734685 1122307 RaftPart.cpp:1411] [Port: 9780, Space: 714, Part: 6] Sending AskForVoteRequest to [Port: 9780, Space: 714, Part: 6] [Host: xx.xx.xx.117:9780]
I20230627 16:27:31.734697 1122307 RaftPart.cpp:1411] [Port: 9780, Space: 714, Part: 6] Sending AskForVoteRequest to [Port: 9780, Space: 714, Part: 6] [Host: xx.xx.xx.136:9780]
I20230627 16:27:32.365975 1122260 RaftPart.cpp:1427] [Port: 9780, Space: 714, Part: 6] AskForVoteRequest has been sent to all peers, waiting for responses
I20230627 16:27:32.366012 1122260 RaftPart.cpp:1355] [Port: 9780, Space: 714, Part: 6] Did not get enough votes from election of term 61, isPreVote = 1
I20230627 16:27:32.366060 1122307 RaftPart.cpp:1389] [Port: 9780, Space: 755, Part: 2] Sending out an election request (space = 755, part = 2, term = 45, lastLogId = 191750, lastLogTerm = 41, candidateIP = xx.xx.xx.142, candidatePort = 9780), isPreVote = 1
I20230627 16:27:32.366071 1122307 RaftPart.cpp:1411] [Port: 9780, Space: 755, Part: 2] Sending AskForVoteRequest to [Port: 9780, Space: 755, Part: 2] [Host: xx.xx.xx.117:9780]
I20230627 16:27:32.366082 1122307 RaftPart.cpp:1411] [Port: 9780, Space: 755, Part: 2] Sending AskForVoteRequest to [Port: 9780, Space: 755, Part: 2] [Host: xx.xx.xx.136:9780]
I20230627 16:27:32.825194 1122226 RaftPart.cpp:1533] [Port: 9780, Space: 770, Part: 5] Received a VOTING request: space = 770, partition = 5, candidateAddr = xx.xx.xx.136:9780, term = 34, lastLogId = 200598, lastLogTerm = 27, isPreVote = 1
I20230627 16:27:32.825206 1122226 RaftPart.cpp:1635] [Port: 9780, Space: 770, Part: 5] The partition will vote for the candidate "xx.xx.xx.136":9780, isPreVote = 1
I20230627 16:27:32.825246 1122226 RaftPart.cpp:1533] [Port: 9780, Space: 773, Part: 21] Received a VOTING request: space = 773, partition = 21, candidateAddr = xx.xx.xx.117:9780, term = 37, lastLogId = 200443, lastLogTerm = 31, isPreVote = 0
I20230627 16:27:32.825258 1122226 RaftPart.cpp:1635] [Port: 9780, Space: 773, Part: 21] The partition will vote for the candidate "xx.xx.xx.117":9780, isPreVote = 0
I20230627 16:27:33.148267 1122234 RaftPart.cpp:1427] [Port: 9780, Space: 755, Part: 2] AskForVoteRequest has been sent to all peers, waiting for responses
I20230627 16:27:33.148290 1122234 RaftPart.cpp:1344] [Port: 9780, Space: 755, Part: 2] Partition win prevote of term 45
I20230627 16:27:33.148360 1122307 RaftPart.cpp:1389] [Port: 9780, Space: 755, Part: 2] Sending out an election request (space = 755, part = 2, term = 45, lastLogId = 191750, lastLogTerm = 41, candidateIP = xx.xx.xx.142, candidatePort = 9780), isPreVote = 0
I20230627 16:27:33.148371 1122307 RaftPart.cpp:1411] [Port: 9780, Space: 755, Part: 2] Sending AskForVoteRequest to [Port: 9780, Space: 755, Part: 2] [Host: xx.xx.xx.117:9780]
I20230627 16:27:33.148382 1122307 RaftPart.cpp:1411] [Port: 9780, Space: 755, Part: 2] Sending AskForVoteRequest to [Port: 9780, Space: 755, Part: 2] [Host: xx.xx.xx.136:9780]
I20230627 16:27:33.483222 1122219 RaftPart.cpp:1533] [Port: 9780, Space: 772, Part: 13] Received a VOTING request: space = 772, partition = 13, candidateAddr = xx.xx.xx.136:9780, term = 38, lastLogId = 200459, lastLogTerm = 33, isPreVote = 1
I20230627 16:27:33.483259 1122219 RaftPart.cpp:1635] [Port: 9780, Space: 772, Part: 13] The partition will vote for the candidate "xx.xx.xx.136":9780, isPreVote = 1
I20230627 16:27:34.645862 1122226 RaftPart.cpp:1533] [Port: 9780, Space: 771, Part: 20] Received a VOTING request: space = 771, partition = 20, candidateAddr = xx.xx.xx.136:9780, term = 40, lastLogId = 198962, lastLogTerm = 37, isPreVote = 0
I20230627 16:27:34.645882 1122226 RaftPart.cpp:1635] [Port: 9780, Space: 771, Part: 20] The partition will vote for the candidate "xx.xx.xx.136":9780, isPreVote = 0
I20230627 16:27:36.091497 1122219 RaftPart.cpp:1427] [Port: 9780, Space: 755, Part: 2] AskForVoteRequest has been sent to all peers, waiting for responses
I20230627 16:27:36.091526 1122219 RaftPart.cpp:1355] [Port: 9780, Space: 755, Part: 2] Did not get enough votes from election of term 45, isPreVote = 0
I20230627 16:27:36.099509 1122307 RaftPart.cpp:1389] [Port: 9780, Space: 733, Part: 4] Sending out an election request (space = 733, part = 4, term = 35, lastLogId = 193822, lastLogTerm = 32, candidateIP = xx.xx.xx.142, candidatePort = 9780), isPreVote = 1
I20230627 16:27:36.099520 1122307 RaftPart.cpp:1411] [Port: 9780, Space: 733, Part: 4] Sending AskForVoteRequest to [Port: 9780, Space: 733, Part: 4] [Host: xx.xx.xx.117:9780]
I20230627 16:27:36.099531 1122307 RaftPart.cpp:1411] [Port: 9780, Space: 733, Part: 4] Sending AskForVoteRequest to [Port: 9780, Space: 733, Part: 4] [Host: xx.xx.xx.136:9780]
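
With this many partitions cycling through elections, it can help to tally the outcomes per partition instead of reading the log line by line. The following is only a rough sketch: the two patterns are taken from the message wording in the log above, and the function name `summarize_elections` is made up for illustration, not part of NebulaGraph.

```python
import re
from collections import Counter

# Patterns copied from the message text seen in the storage log above.
LOST = re.compile(
    r"\[Port: \d+, Space: (\d+), Part: (\d+)\] "
    r"Did not get enough votes from election of term (\d+), isPreVote = ([01])"
)
WON_PREVOTE = re.compile(
    r"\[Port: \d+, Space: (\d+), Part: (\d+)\] Partition win prevote of term (\d+)"
)


def summarize_elections(log_text):
    """Count failed elections and won prevotes per (space, part).

    Hypothetical helper: it scans the whole text, so it works whether or
    not the log entries are separated by newlines.
    """
    failed = Counter()
    won_prevote = Counter()
    for m in LOST.finditer(log_text):
        failed[(int(m.group(1)), int(m.group(2)))] += 1
    for m in WON_PREVOTE.finditer(log_text):
        won_prevote[(int(m.group(1)), int(m.group(2)))] += 1
    return failed, won_prevote


if __name__ == "__main__":
    sample = (
        "I20230627 16:27:33.148290 1122234 RaftPart.cpp:1344] "
        "[Port: 9780, Space: 755, Part: 2] Partition win prevote of term 45\n"
        "I20230627 16:27:36.091526 1122219 RaftPart.cpp:1355] "
        "[Port: 9780, Space: 755, Part: 2] Did not get enough votes "
        "from election of term 45, isPreVote = 0\n"
    )
    failed, won_prevote = summarize_elections(sample)
    for key in sorted(failed):
        print(key, "failed:", failed[key], "prevotes won:", won_prevote[key])
```

A partition that keeps appearing in `failed` while its peers are reachable on port 9780 would be a reasonable place to start looking.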

Heartbeats are only sent once every few minutes, even though heartbeat_interval_secs is set to just 10 seconds.
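
One way to put numbers on the heartbeat delay is to measure the spacing between heartbeat-related log lines. The sketch below assumes glog-style timestamps like those in the storage log; the `marker` substring is a placeholder, since the exact wording of the heartbeat log message is not shown in this issue, and `long_gaps` is a hypothetical helper name.

```python
import re
from datetime import datetime

# glog prefix: severity letter, then YYYYMMDD HH:MM:SS.microseconds
STAMP = re.compile(r"^[IWEF](\d{8} \d{2}:\d{2}:\d{2}\.\d{6})")


def long_gaps(lines, marker, interval_secs=10.0):
    """Return (previous, current, gap_seconds) for consecutive lines that
    contain `marker` and are spaced more than `interval_secs` apart.

    `marker` is an assumption: pass whatever substring identifies
    heartbeat lines in your own log.
    """
    gaps = []
    prev = None
    for line in lines:
        m = STAMP.match(line)
        if not m or marker not in line:
            continue
        now = datetime.strptime(m.group(1), "%Y%m%d %H:%M:%S.%f")
        if prev is not None:
            delta = (now - prev).total_seconds()
            if delta > interval_secs:
                gaps.append((prev, now, delta))
        prev = now
    return gaps


if __name__ == "__main__":
    # Made-up lines for illustration only; not taken from a real log.
    demo = [
        "I20230627 16:27:00.000000 1 Example.cpp:1] heartbeat sent",
        "I20230627 16:27:10.000000 1 Example.cpp:1] heartbeat sent",
        "I20230627 16:30:10.000000 1 Example.cpp:1] heartbeat sent",
    ]
    for prev, now, delta in long_gaps(demo, "heartbeat"):
        print(f"{prev.time()} -> {now.time()}: {delta:.0f}s")
```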

QingZ11 commented 9 months ago

Have you restarted the Meta service? Has the Storage service been registered with the ADD HOSTS command?