vesoft-inc / nebula

A distributed, fast open-source graph database featuring horizontal scalability and high availability
https://nebula-graph.io
Apache License 2.0

Restarting the Nebula cluster has no effect. #5864

Open catchdog007 opened 5 months ago

catchdog007 commented 5 months ago

Please check the FAQ documentation before raising an issue

Describe the bug (required)

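This document describes a graceful restart method: https://docs.nebula-graph.com.cn/3.6.0/k8s-operator/4.cluster-administration/4.9.advanced/4.9.2.restart-cluster/ However, I tested it in several environments and it does not take effect in any of them. I tried the following: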

  1. kubectl annotate sts
  2. kubectl rollout restart

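My initial investigation suggests the operator is most likely the cause. When I installed the Nebula cluster with helm, I did not configure a replica count for rolling restarts at all. After the deployment finished, kubectl edit sts showed that, in the configuration of all three services, the rolling-restart replica count equals the number of replicas of the service in the cluster, which makes the restart a no-op. I then tried removing the rolling restart strategy from that configuration. After that a rolling restart worked, but the cluster status reported by kubectl get nc nebula changed to False, and subsequent changes to the config of the three services through kubectl edit nc nebula had no effect.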
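
If the value described above is the StatefulSet field spec.updateStrategy.rollingUpdate.partition (an assumption; the report does not name the field, and the operator may manage its workloads differently), the symptom matches standard Kubernetes behavior: a RollingUpdate only touches pods whose ordinal is greater than or equal to partition, so a partition equal to replicas updates nothing. A minimal sketch of that rule follows. The helper names updated_ordinals and show_partition are hypothetical, and nebula-storaged is an assumed StatefulSet name.

```shell
# Print the pod ordinals that a StatefulSet RollingUpdate would touch,
# highest ordinal first (the order Kubernetes uses for StatefulSets).
# Rule: only pods with ordinal >= partition are updated.
# Usage: updated_ordinals <replicas> <partition>
updated_ordinals() {
  replicas=$1
  partition=$2
  i=$((replicas - 1))
  out=""
  while [ "$i" -ge "$partition" ]; do
    out="${out:+$out }$i"
    i=$((i - 1))
  done
  printf '%s\n' "$out"
}

# Hypothetical helper: print "<replicas> <partition>" for a StatefulSet.
# Requires kubectl and access to the cluster; nothing runs until it is called.
# Usage: show_partition nebula-storaged   (assumed StatefulSet name)
show_partition() {
  kubectl get statefulset "$1" \
    -o jsonpath='{.spec.replicas}{" "}{.spec.updateStrategy.rollingUpdate.partition}{"\n"}'
}
```

With 3 replicas, updated_ordinals 3 3 prints an empty line (no pod is restarted), while updated_ordinals 3 0 lists all three ordinals. Comparing the two numbers printed by show_partition for each service would confirm or rule out this explanation.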

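Also, regarding the graceful restart in the official documentation: as far as I remember, kubectl annotate does not normally trigger a pod restart, so the operator may be doing something special here. As things stand it also looks buggy, because neither the documented method nor a rolling restart actually restarts the pods. The only restart method I can think of right now is the brute-force kubectl delete pod.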

Your Environments (required)

Installed nebula-operator-1.7.6 and nebula-cluster-1.7.0 with helm3; K8s version 1.23.9.

How To Reproduce(required)

Steps to reproduce the behavior:

Set up the cluster as described above and try to restart it with the methods above; this should reproduce 100% of the time.

Expected behavior

Additional context

abby-cyber commented 5 months ago

https://github.com/vesoft-inc/nebula-operator/releases/tag/v1.8.0 Please note that the ability to restart pods was introduced in version 1.8.0 of the operator.