alibaba / nacos

an easy-to-use dynamic service discovery, configuration and service management platform for building cloud native applications.
https://nacos.io
Apache License 2.0

Nacos three-node cluster: node data is inconsistent after some nodes crash and restart #9836

Closed mroldx closed 1 year ago

mroldx commented 1 year ago

Describe what happened (or what feature you want)

Version: nacos-server:2.0.4
Deployment: Docker, host network mode
Steps to reproduce: three nodes, nacos1:8848, nacos2:8848, nacos3:8848

For some reason the processes on nacos2 and nacos3 were killed outright, and both nodes were then restarted. Afterwards the number of registered services and the number of registered Dubbo services differ from node to node. nacos1 was never killed, so its service list is complete. The container logs on nodes 2 and 3 contain a large number of leader-election errors, and those nodes appear to have been degraded and keep requesting port 7848.

protocol-raft.log on containers 2 and 3:

java.lang.IllegalStateException: Fail to get leader of group naming_service_metadata
    at com.alipay.sofa.jraft.core.CliServiceImpl.getPeers(CliServiceImpl.java:631)
    at com.alipay.sofa.jraft.core.CliServiceImpl.getPeers(CliServiceImpl.java:524)
    at com.alibaba.nacos.core.distributed.raft.JRaftServer.registerSelfToCluster(JRaftServer.java:361)
    at com.alibaba.nacos.core.distributed.raft.JRaftServer.lambda$createMultiRaftGroup$0(JRaftServer.java:272)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)

2023-01-02 01:23:01,376 ERROR Fail to refresh leader for group : naming_persistent_service, status is : Status[UNKNOWN<-1>: Fail to init channel to 10.19.23.114:7848, Fail to init channel to 10.19.11.217:7848, Fail to init channel to 10.19.11.215:7848]

2023-01-02 01:23:01,376 ERROR Fail to refresh route configuration for group : naming_persistent_service, status is : Status[UNKNOWN<-1>: Fail to get leader of group naming_persistent_service]

2023-01-02 01:23:01,713 ERROR Failed to join the cluster, retry...

java.lang.IllegalStateException: Fail to get leader of group naming_persistent_service_v2
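Since the log reports "Fail to init channel" for all three peers on port 7848, one quick thing to rule out is plain TCP reachability of that port from inside each container. The following is a minimal sketch, not a Nacos tool: the function names `can_connect` and `check_peers` are made up for illustration, and the addresses are simply copied from the log line above. It only tests whether a TCP connection can be opened; it says nothing about Raft state.

```python
import socket

# Peer addresses copied from the "Fail to init channel" log line (assumed to be the Raft peers).
RAFT_PEERS = ["10.19.23.114:7848", "10.19.11.217:7848", "10.19.11.215:7848"]


def can_connect(address, timeout=2.0):
    """Return True if a TCP connection to 'host:port' succeeds within the timeout."""
    host, port = address.rsplit(":", 1)
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_peers(peers, timeout=2.0):
    """Map each peer address to whether it accepted a TCP connection."""
    return {peer: can_connect(peer, timeout) for peer in peers}
```

Calling `check_peers(RAFT_PEERS)` from each of the three hosts gives a dictionary of address to `True`/`False`. If a peer is unreachable from one host but reachable from another, the problem is more likely networking or port binding than Raft itself.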

naming-server.log:

The Raft Group [naming_persistent_service] did not find the Leader node
    at com.alibaba.nacos.naming.consistency.persistent.impl.PersistentServiceProcessor.remove(PersistentServiceProcessor.java:121)
    at com.alibaba.nacos.naming.consistency.persistent.PersistentConsistencyServiceDelegateImpl.remove(PersistentConsistencyServiceDelegateImpl.java:69)
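For context on the "did not find the Leader node" errors: in Raft, a leader can only be elected while a majority of the voting members is up. The sketch below is just that general majority rule worked out for a three-node cluster; the helper names are hypothetical, and nothing in this thread confirms that loss of quorum alone is what happened here. In particular it does not explain why the nodes failed to rejoin after the restart.

```python
def quorum(cluster_size):
    """Smallest number of voting members that forms a Raft majority."""
    return cluster_size // 2 + 1


def can_elect_leader(cluster_size, alive):
    """True if enough members are up for a leader election to succeed."""
    return alive >= quorum(cluster_size)


print(quorum(3))               # 2
print(can_elect_leader(3, 1))  # False: only nacos1 is up
print(can_elect_leader(3, 2))  # True: one restarted node has rejoined
```

With nacos2 and nacos3 both down, nacos1 alone is below the majority of 2, so no Raft group can have a leader until at least one of the killed nodes is back and reachable on its Raft port.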

nacos-server node list metadata: (image)

wzshuang commented 1 month ago

@mroldx May I ask whether this is a bug?