alibaba / nacos

an easy-to-use dynamic service discovery, configuration and service management platform for building cloud native applications.
https://nacos.io
Apache License 2.0
30.29k stars 12.84k forks

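In a three-node Nacos cluster, after one Nacos node goes down and the service provider process is killed, the service still appears in the Nacos console service list #4626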

Closed yylstudy closed 3 years ago

yylstudy commented 3 years ago

Deployment architecture: Nacos Server is deployed as a cluster: domain name -> single-node Nginx -> 3-node Nacos.
Versions: nacos-server-1.4.0, nacos-discovery-spring-boot-starter-0.2.7, dubbo-spring-boot-starter-2.7.8

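Steps:
1) Register a Dubbo service with the cluster successfully; it can be found in the service list of the Nacos console.
2) Kill one node of the Nacos cluster.
3) Then kill the registered Dubbo service. The Nacos console service list still shows the service, and its healthy instance count is still 1.
4) Restart the Nacos node that was killed. After a short while the service is removed from the console service list as expected.
5) Set nacos.discovery.register.ephemeral=false to register the service as persistent and repeat the steps above. The result is the same.

Is this expected behavior? And when the service is persistent, i.e. CP mode, why does the same thing happen? Could you please explain? Thanks.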

zwkwd2008 commented 3 years ago

This is not expected; it is a bug. Cause:

MemberUtils

public static void onFail(Member member, Throwable ex) {
    Member cloneMember = new Member();
    copy(member, cloneMember);
    manager.getMemberAddressInfos().remove(member.getAddress());
    cloneMember.setState(NodeState.SUSPICIOUS);
    cloneMember.setFailAccessCnt(member.getFailAccessCnt() + 1);
    int maxFailAccessCnt = ApplicationUtils.getProperty("nacos.core.member.fail-access-cnt", Integer.class, 3);

    // If the number of consecutive failures to access the target node reaches
    // a maximum, or the link request is rejected, the state is directly down
    if (cloneMember.getFailAccessCnt() > maxFailAccessCnt || StringUtils
            .containsIgnoreCase(ex.getMessage(), TARGET_MEMBER_CONNECT_REFUSE_ERRMSG)) {
        cloneMember.setState(NodeState.DOWN);
    }
    manager.update(cloneMember);
}

The copy method used in update is as follows:

public static void copy(Member newMember, Member oldMember) {
    oldMember.setIp(newMember.getIp());
    oldMember.setPort(newMember.getPort());
    oldMember.setState(newMember.getState());
    oldMember.setExtendInfo(newMember.getExtendInfo());
    oldMember.setAddress(newMember.getAddress());
}

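The value set by cloneMember.setFailAccessCnt(member.getFailAccessCnt() + 1) is never written back, because copy does not copy failAccessCnt. The stored member's failure count therefore never accumulates, and the count-based threshold in onFail is never exceeded.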
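The effect described above can be reproduced in isolation. The following is a minimal sketch, not the Nacos source: Member, STORE, copyWithoutCount, copyWithCount and simulate are simplified, hypothetical stand-ins, states are plain strings, and the connection-refused shortcut in onFail is left out. copyWithCount is only one possible way to carry the counter over and is not claimed to be the fix that shipped.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for the Nacos classes; all names here are illustrative.
class FailCountSketch {

    static class Member {
        String address;
        String state = "UP";
        int failAccessCnt;
    }

    // Stand-in for the member registry held by the manager.
    static final Map<String, Member> STORE = new HashMap<>();

    // Mirrors the copy() quoted above: failAccessCnt is not copied.
    static void copyWithoutCount(Member newMember, Member oldMember) {
        oldMember.address = newMember.address;
        oldMember.state = newMember.state;
    }

    // Hypothetical variant that also carries the failure counter over.
    static void copyWithCount(Member newMember, Member oldMember) {
        copyWithoutCount(newMember, oldMember);
        oldMember.failAccessCnt = newMember.failAccessCnt;
    }

    // Simplified onFail: clone, bump the counter, then update the stored member.
    static void onFail(Member member, boolean copyCount, int maxFailAccessCnt) {
        Member clone = new Member();
        clone.address = member.address;
        clone.state = "SUSPICIOUS";
        clone.failAccessCnt = member.failAccessCnt + 1;
        if (clone.failAccessCnt > maxFailAccessCnt) {
            clone.state = "DOWN";
        }
        Member stored = STORE.get(clone.address);
        if (copyCount) {
            copyWithCount(clone, stored);
        } else {
            copyWithoutCount(clone, stored);
        }
    }

    // Runs `failures` consecutive failed accesses and returns the stored member.
    static Member simulate(boolean copyCount, int failures) {
        Member m = new Member();
        m.address = "10.0.0.1:8848";
        STORE.put(m.address, m);
        for (int i = 0; i < failures; i++) {
            onFail(STORE.get(m.address), copyCount, 3);
        }
        return STORE.get(m.address);
    }

    public static void main(String[] args) {
        Member buggy = simulate(false, 5);
        System.out.println(buggy.state + " " + buggy.failAccessCnt); // SUSPICIOUS 0
        Member fixed = simulate(true, 5);
        System.out.println(fixed.state + " " + fixed.failAccessCnt); // DOWN 5
    }
}
```

With the counter dropped on every update, five consecutive failures leave the stored member at SUSPICIOUS with a count of 0; once the counter is copied, the fourth failure pushes it past the threshold of 3 and the member is marked DOWN.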

yylstudy commented 3 years ago

Is this a bug in nacos-server? Does the 2.x version have it as well? Thanks.

KomachiSion commented 3 years ago

This bug will be fixed in 1.4.1.

yylstudy commented 3 years ago

OK, thank you very much.