sofastack / sofa-jraft

A production-grade java implementation of RAFT consensus algorithm.
https://www.sofastack.tech/projects/sofa-jraft/
Apache License 2.0
3.59k stars 1.15k forks

nacos 2.2: node cannot be removed from raft members after cluster expansion #1146

Closed alexwylp closed 2 months ago

alexwylp commented 2 months ago

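Hi all. I am running nacos 2.2.3 and had a stable cluster of node 1, node 2, and node 3. Recently I set out to build a new cluster consisting of node 4, and did so by copying the packaged nacos directory from one of the existing nodes (node 1).

When starting node 4 I forgot to change cluster.conf to point at node 4 (it still listed node 1, node 2, and node 3). After startup, the raft member list of the original cluster showed node 1, node 2, node 3, and node 4. The cluster node count in the console is correct at 3; only the raft member list shows 4. Shutting down node 4 does not clear the entry. After node 4 was shut down, jraft log reports the following warnings:

Channel [节点4:7848] in [inactive] state 2 times, it has been removed from the pool.
This channel 节点4:7848 has started shutting down. Any new RPCs should fail immediately.
There has been some transient failure on this channel 节点4:7848

I then did a rolling restart of the original cluster, one node at a time, deleting the data directory before each restart in the hope of clearing any cached state, but it made no difference. Applications work normally; jraft log simply keeps printing the warnings above.

How can I resolve this completely? Do I really have to stop the entire original cluster, delete data on every node at the same time, and then restart? Thanks!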
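One possible reading of why applications keep working while the warnings continue: Raft only needs a majority of the configured members, so with four members recorded and node 4 down, the three reachable members still meet the quorum of three. The sketch below is plain Java arithmetic for illustration; the class name and the node counts plugged in are assumptions, not nacos or jraft code.

```java
public class RaftQuorumSketch {
    // Majority quorum for a Raft group with the given number of voting members.
    public static int quorum(int members) {
        return members / 2 + 1;
    }

    // True when the reachable members are enough to elect a leader and commit entries.
    public static boolean hasQuorum(int members, int reachable) {
        return reachable >= quorum(members);
    }

    public static void main(String[] args) {
        // Four members recorded (nodes 1-4), node 4 shut down: 3 reachable.
        System.out.println("members=4 quorum=" + quorum(4)
                + " reachable=3 ok=" + hasQuorum(4, 3));
        // Same four-member configuration if one more node were lost.
        System.out.println("members=4 quorum=" + quorum(4)
                + " reachable=2 ok=" + hasQuorum(4, 2));
        // After the stale member is removed: three members, one node lost.
        System.out.println("members=3 quorum=" + quorum(3)
                + " reachable=2 ok=" + hasQuorum(3, 2));
    }
}
```

Under these assumptions the stale entry does more than produce warnings: while it stays in the member list, losing one more real node would leave the group without a quorum, whereas a three-member group tolerates that loss.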

fengjiachun commented 2 months ago

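If I understand your problem correctly, you now just want to remove node 4 from the raft group? Removing it with the CLI service will do it; see section 4.2 of the documentation: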

https://www.sofastack.tech/projects/sofa-jraft/jraft-user-guide/
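A minimal sketch of what removing the peer through the CLI service could look like, based on the `CliService` API in jraft. The group ID and the IP addresses are placeholders that must be replaced with the real values; only port 7848 comes from the warnings in this thread. It also assumes the client uses the same RPC implementation as the servers, which for nacos appears to be gRPC, so jraft's gRPC extension would need to be on the classpath.

```java
import com.alipay.sofa.jraft.CliService;
import com.alipay.sofa.jraft.JRaftUtils;
import com.alipay.sofa.jraft.RaftServiceFactory;
import com.alipay.sofa.jraft.Status;
import com.alipay.sofa.jraft.conf.Configuration;
import com.alipay.sofa.jraft.entity.PeerId;
import com.alipay.sofa.jraft.option.CliOptions;

public class RemoveStalePeer {
    public static void main(String[] args) {
        // Placeholder: the ID of the raft group that still lists node 4.
        String groupId = "your_raft_group_id";

        // Placeholder addresses: the members as the group currently records them.
        Configuration conf = JRaftUtils.getConfiguration(
                "10.0.0.1:7848,10.0.0.2:7848,10.0.0.3:7848,10.0.0.4:7848");

        // Placeholder address of the stale member (node 4).
        PeerId stalePeer = new PeerId("10.0.0.4", 7848);

        CliService cliService =
                RaftServiceFactory.createAndInitCliService(new CliOptions());
        try {
            // Asks the group's leader to drop the peer from the configuration.
            Status status = cliService.removePeer(groupId, conf, stalePeer);
            if (status.isOk()) {
                System.out.println("Removed " + stalePeer + " from " + groupId);
            } else {
                System.err.println("removePeer failed: " + status);
            }
        } finally {
            cliService.shutdown();
        }
    }
}
```

If more than one raft group still lists node 4, the same call would presumably have to be repeated once per group ID.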

alexwylp commented 2 months ago

You understood correctly. I'll give it a try, thanks.