sofastack / sofa-jraft

A production-grade java implementation of RAFT consensus algorithm.
https://www.sofastack.tech/projects/sofa-jraft/
Apache License 2.0
3.58k stars 1.15k forks source link

nacos2.2.3 引用jraft1.3.12 出现 org.rocksdb.RocksDBException: end key comes before start key 报错 #1029

Closed skippyb1 closed 1 month ago

skippyb1 commented 1 year ago

2023-09-23 06:01:18,449 INFO Node <naming_instance_metadata/xxx.250:7848> received PreVoteResponse from xxx.249:7848, term=4, granted=false.

2023-09-23 06:01:18,514 INFO Node <naming_instance_metadata/xxx.250:7848> received PreVoteResponse from xxx.251:7848, term=4, granted=true.

2023-09-23 06:01:18,515 INFO Node <naming_instance_metadata/xxx.250:7848> start vote and grant vote self, term=4.

2023-09-23 06:01:29,797 INFO Save raft meta, path=/app/nacos2/20230905/nacos/data/protocol/raft/naming_instance_metadata/meta-data, term=5, votedFor=xxx.250:7848, cost time=6287 ms

2023-09-23 06:01:29,799 WARN Node <naming_instance_metadata/xxx.250:7848> ignore stale AppendEntriesRequest from xxx.251:7848, term=4, currTerm=5.

2023-09-23 06:01:29,799 WARN Node <naming_instance_metadata/xxx.250:7848> ignore stale AppendEntriesRequest from xxx.251:7848, term=4, currTerm=5.

2023-09-23 06:01:29,800 WARN Node <naming_instance_metadata/xxx.250:7848> ignore stale AppendEntriesRequest from xxx.251:7848, term=4, currTerm=5.

2023-09-23 06:01:29,800 WARN Node <naming_instance_metadata/xxx.250:7848> ignore stale AppendEntriesRequest from xxx.251:7848, term=4, currTerm=5.

2023-09-23 06:01:29,801 INFO Node <naming_instance_metadata/xxx.250:7848> become leader of group, term=5, conf=xxx.250:7848,xxx.249:7848,xxx.251:7848, oldConf=.

2023-09-23 06:01:29,874 INFO Replicator=Replicator [state=Created, statInfo=, peerId=xxx.251:7848, waitId=-1, type=Follower]@xxx.251:7848 is started

2023-09-23 06:01:29,881 INFO Replicator=Replicator [state=Created, statInfo=, peerId=xxx.249:7848, waitId=-1, type=Follower]@xxx.249:7848 is started

2023-09-23 06:01:29,920 WARN Node <naming_instance_metadata/xxx.250:7848> received invalid RequestVoteResponse from xxx.249:7848, state not in STATE_CANDIDATE but STATE_LEADER.

2023-09-23 06:01:29,955 INFO Node <naming_instance_metadata/xxx.250:7848> received RequestVoteRequest from xxx.249:7848, term=6, currTerm=5.

2023-09-23 06:01:29,957 ERROR Fail to run ConfigurationChangeDone, status: Status[EPERM<1008>: Leader stepped down].

2023-09-23 06:01:29,965 INFO onLeaderStop: status=Status[EHIGHERTERMRESPONSE<10008>: Raft node receives higher term RequestVoteRequest.].

2023-09-23 06:01:29,993 INFO Save raft meta, path=/app/nacos2/20230905/nacos/data/protocol/raft/naming_instance_metadata/meta-data, term=6, votedFor=0.0.0.0:0, cost time=37 ms

2023-09-23 06:01:29,993 INFO Replicator Replicator [state=Probe, statInfo=, peerId=xxx.251:7848, waitId=-1, type=Follower] is going to quit

2023-09-23 06:01:29,996 INFO Replicator Replicator [state=Probe, statInfo=, peerId=xxx.249:7848, waitId=-1, type=Follower] is going to quit

2023-09-23 06:01:29,999 INFO Node <naming_instance_metadata/xxx.250:7848> received PreVoteRequest from xxx.251:7848, term=6, currTerm=6, granted=false, requestLastLogId=LogId [index=9, term=4], lastLogId=LogId [index=10, term=5].

2023-09-23 06:01:30,004 ERROR handleRequest group_id: "naming_instance_metadata" leader_id: "xxxxxx:7848" failed

java.lang.IllegalStateException: Not leader at com.alipay.sofa.jraft.core.NodeImpl.listPeers(NodeImpl.java:2966) at com.alipay.sofa.jraft.rpc.impl.cli.GetPeersRequestProcessor.processRequest0(GetPeersRequestProcessor.java:58) at com.alipay.sofa.jraft.rpc.impl.cli.GetPeersRequestProcessor.processRequest0(GetPeersRequestProcessor.java:33) at com.alipay.sofa.jraft.rpc.impl.cli.BaseCliRequestProcessor.processRequest(BaseCliRequestProcessor.java:119) at com.alipay.sofa.jraft.rpc.RpcRequestProcessor.handleRequest(RpcRequestProcessor.java:53) at com.alipay.sofa.jraft.rpc.RpcRequestProcessor.handleRequest(RpcRequestProcessor.java:35) at com.alipay.sofa.jraft.rpc.impl.GrpcServer.lambda$null$1(GrpcServer.java:194) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) 2023-09-23 06:01:30,890 WARN Truncating storage to lastIndexKept=9.

2023-09-23 06:01:31,187 ERROR Fail to truncateSuffix 9 in data path: /home/nacos/data/protocol/raft/naming_instance_metadata/log.

org.rocksdb.RocksDBException: end key comes before start key at org.rocksdb.RocksDB.deleteRange(Native Method) at org.rocksdb.RocksDB.deleteRange(RocksDB.java:1463) at com.alipay.sofa.jraft.storage.impl.RocksDBLogStorage.truncateSuffix(RocksDBLogStorage.java:627) at com.alipay.sofa.jraft.storage.impl.LogManagerImpl$StableClosureEventHandler.onEvent(LogManagerImpl.java:537) at com.alipay.sofa.jraft.storage.impl.LogManagerImpl$StableClosureEventHandler.onEvent(LogManagerImpl.java:491) at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) at java.lang.Thread.run(Thread.java:750) 2023-09-23 06:01:35,797 ERROR Encountered an error=Status[EIO<1014>: Failed operation in LogStorage] on StateMachine com.alibaba.nacos.core.distributed.raft.NacosStateMachine, it's highly recommended to implement this method as raft stops working since some error occurs, you should figure out the cause and repair or remove this node.

com.alipay.sofa.jraft.error.RaftException: ERROR_TYPE_LOG at com.alipay.sofa.jraft.storage.impl.LogManagerImpl.reportError(LogManagerImpl.java:573) at com.alipay.sofa.jraft.storage.impl.LogManagerImpl.access$1200(LogManagerImpl.java:76) at com.alipay.sofa.jraft.storage.impl.LogManagerImpl$StableClosureEventHandler.onEvent(LogManagerImpl.java:558) at com.alipay.sofa.jraft.storage.impl.LogManagerImpl$StableClosureEventHandler.onEvent(LogManagerImpl.java:491) at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) at java.lang.Thread.run(Thread.java:750) 2023-09-23 06:01:35,797 WARN Node <naming_instance_metadata/xxx:7848> got error: {}.

com.alipay.sofa.jraft.error.RaftException: ERROR_TYPE_LOG at com.alipay.sofa.jraft.storage.impl.LogManagerImpl.reportError(LogManagerImpl.java:573) at com.alipay.sofa.jraft.storage.impl.LogManagerImpl.access$1200(LogManagerImpl.java:76) at com.alipay.sofa.jraft.storage.impl.LogManagerImpl$StableClosureEventHandler.onEvent(LogManagerImpl.java:558) at com.alipay.sofa.jraft.storage.impl.LogManagerImpl$StableClosureEventHandler.onEvent(LogManagerImpl.java:491) at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) at java.lang.Thread.run(Thread.java:750) 2023-09-23 06:01:35,798 WARN FSMCaller already in error status, ignore new error.

com.alipay.sofa.jraft.error.RaftException: ERROR_TYPE_LOG at com.alipay.sofa.jraft.storage.impl.LogManagerImpl.reportError(LogManagerImpl.java:573) at com.alipay.sofa.jraft.storage.impl.LogManagerImpl.access$1200(LogManagerImpl.java:76) at com.alipay.sofa.jraft.storage.impl.LogManagerImpl$StableClosureEventHandler.onEvent(LogManagerImpl.java:558) at com.alipay.sofa.jraft.storage.impl.LogManagerImpl$StableClosureEventHandler.onEvent(LogManagerImpl.java:491) at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) at java.lang.Thread.run(Thread.java:750)

2023-09-23 06:01:35,799 WARN Node <naming_instance_metadata/xxx:7848> is not in active state, currTerm=6.

2023-09-23 06:01:37,411 WARN Node <naming_instance_metadata/xxx:7848> is not in active state, currTerm=6.

2023-09-23 06:01:37,439 WARN Node <naming_instance_metadata/xxx:7848> is not in active state, currTerm=6.

2023-09-23 06:02:28,932 WARN Node <naming_persistent_service/xxx:7848> PreVote to xxx:7848 error: Status[EINTERNAL<1004>: RPC exception:DEADLINE_EXCEEDED: deadline exceeded after 1.011548054s. [closed=[], open=[[buffered_nanos=1943490491, waiting_for_connection]]]].

2023-09-23 06:02:28,988 WARN Node <naming_persistent_service/xxx:7848> PreVote to xxx:7848 error: Status[EINTERNAL<1004>: RPC exception:DEADLINE_EXCEEDED: deadline exceeded after 4.984139122s. [closed=[], open=[[buffered_nanos=2497442626, remote_addr=xxx/xxx:7848]]]].

skippyb1 commented 1 year ago

请教一下 org.rocksdb.RocksDBException: end key comes before start key 这个异常一般是怎么引起的 怎么定位一下

killme2008 commented 11 months ago

有没有做什么特别的操作? 正常来讲,不会有此类问题。

skippyb1 commented 11 months ago

额 这类问题一般是怎么引起的?我看看能不能复现一下 我询问过nacos 说只是对jraft做了版本升级 跟jraft的相关代码没有动过,旧版本没有出现过这种问题

killme2008 commented 11 months ago

我也想不出来什么条件可能出现。 删了 jraft 的 log 数据?