Closed bbdshow closed 1 year ago
@bbdshow Thanks for your awesome contribution.
Here are some suggestions to make this PR better:
1. Add a Heartbeat interface in MasterService instead of reusing the Register interface, to make the semantics clear.
2. The OnUnregister callback should be renamed to UnregisterCallback to make its name clear.
Again, thanks very much for your pull request to make nano better.
@lonng Thank you very much for your suggestions. I have made the changes and pushed them.
@bbdshow I left three more comments; please address them. The rest looks good to me.
What problem does this PR solve?
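1. Establish a heartbeat between the Master and its Members via RPC calls, so that the Master can actively kick out abnormal nodes.
2. Because nodes report heartbeat information, the Master learns the registration information as soon as it restarts or is updated; other nodes do not need to restart in order to register again.
3. Ultimately this keeps the cluster healthy and decouples the release order and dependencies between nodes, which makes varied deployment styles easier.
4. Hook an OnUnregister fn so that some handling, such as raising an alert, can run when a node becomes abnormal.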
What is changed and how it works?
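1. An IsHeartbeat field is added to the Register proto so that requests take a different registration path. The heartbeat time is recorded, and on every heartbeat registration the report times of all nodes are checked.
2. Ordinary members periodically call heartbeat registration on the master.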
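The flow above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `RegisterRequest`, `Master`, `NewMaster`, `Members`, `runDemo`, and the 3-second timeout are hypothetical names and values chosen for illustration, and the real nano types may differ. It assumes the master stores a last-report time per member and scans all members whenever a heartbeat registration arrives.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// RegisterRequest is a hypothetical stand-in for the Register proto
// message extended with an IsHeartbeat flag.
type RegisterRequest struct {
	ServiceAddr string
	IsHeartbeat bool
}

// Master records when each member last reported (illustrative only).
type Master struct {
	mu           sync.Mutex
	timeout      time.Duration
	lastSeen     map[string]time.Time
	OnUnregister func(addr string) // hypothetical hook, e.g. for alerting
}

func NewMaster(timeout time.Duration) *Master {
	return &Master{timeout: timeout, lastSeen: make(map[string]time.Time)}
}

// Register handles plain and heartbeat registrations. A heartbeat also
// checks every member's report time and removes the ones that are stale.
func (m *Master) Register(req RegisterRequest, now time.Time) {
	m.mu.Lock()
	m.lastSeen[req.ServiceAddr] = now
	var expired []string
	if req.IsHeartbeat {
		for addr, seen := range m.lastSeen {
			if now.Sub(seen) > m.timeout {
				delete(m.lastSeen, addr)
				expired = append(expired, addr)
			}
		}
	}
	cb := m.OnUnregister
	m.mu.Unlock()

	// Run the hook outside the lock so it cannot block other registrations.
	if cb != nil {
		for _, addr := range expired {
			cb(addr)
		}
	}
}

// Members returns the number of currently registered members.
func (m *Master) Members() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.lastSeen)
}

// runDemo registers two chat nodes, lets chat-2 go silent, and returns
// the remaining member count and the evicted addresses.
func runDemo() (remaining int, evicted []string) {
	m := NewMaster(3 * time.Second)
	m.OnUnregister = func(addr string) { evicted = append(evicted, addr) }

	start := time.Now()
	m.Register(RegisterRequest{ServiceAddr: "chat-1"}, start)
	m.Register(RegisterRequest{ServiceAddr: "chat-2"}, start)
	// Only chat-1 reports again, 5 seconds later.
	m.Register(RegisterRequest{ServiceAddr: "chat-1", IsHeartbeat: true}, start.Add(5*time.Second))
	return m.Members(), evicted
}

func main() {
	remaining, evicted := runDemo()
	fmt.Println("remaining:", remaining, "evicted:", evicted)
}
```

Passing `now` explicitly keeps the sketch deterministic; a real service would more likely read the clock inside the handler.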
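About testing: start 1 master, 2 gate nodes, and 3 chat nodes.
1. A chat node can enable a method that mocks an exceptional exit. The master will then actively remove the abnormal chat node; see the logs for details.
2. When the master is restarted to simulate a master update, the other nodes do not need to restart; the master's cluster information recovers automatically.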
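The restart scenario can be reasoned about with a small simulation. This is only a sketch under simplifying assumptions: the `registry` type and the `beat` and `simulateRestart` functions are hypothetical, the master's cluster information is modeled as an in-memory map, and one call to `beat` stands for one round of heartbeats from every live member.

```go
package main

import "fmt"

// registry is a hypothetical stand-in for the master's in-memory cluster info.
type registry map[string]bool

// beat simulates one heartbeat round: every live member reports to the master.
func beat(master registry, members []string) {
	for _, addr := range members {
		master[addr] = true
	}
}

// simulateRestart returns the member count before the restart, right after
// it, and after the next heartbeat round.
func simulateRestart() (before, afterRestart, afterBeat int) {
	members := []string{"gate-1", "gate-2", "chat-1", "chat-2", "chat-3"}

	master := registry{}
	beat(master, members)
	before = len(master)

	// Restarting the master loses its in-memory state.
	master = registry{}
	afterRestart = len(master)

	// Members keep sending heartbeats, so the state is rebuilt
	// without restarting any of them.
	beat(master, members)
	afterBeat = len(master)
	return
}

func main() {
	before, afterRestart, afterBeat := simulateRestart()
	// prints "before=5 after restart=0 after next heartbeat=5"
	fmt.Printf("before=%d after restart=%d after next heartbeat=%d\n", before, afterRestart, afterBeat)
}
```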