vesoft-inc / nebula-operator

Operation utilities for Nebula Graph
https://vesoft-inc.github.io/nebula-operator
Apache License 2.0

After restarting a single storage pod, leaders are not balanced #444

Closed jinyingsunny closed 6 months ago

jinyingsunny commented 9 months ago

After restarting a single storage pod, the operator runs balance leader, but the leaders are still not balanced.

image

Checking the operator log, the action may have been triggered too early: the interval between the restarted storaged pod becoming ready and the balance leader call was less than 500 ms. image

Since we know that a single balance leader run may not achieve its goal, maybe we should check the result and repeat.

Your Environments (required)

operator image: reg.vesoft-inc.com/cloud-dev/nebula-operator:snap-1.35

How To Reproduce (required)

Steps to reproduce the behavior:

1. Deploy a cluster with 3 zones, each zone with 3 storaged instances;
2. Create 2 spaces;
3. Restart one storaged pod: `kubectl -n nebula annotate sts nebulav-storaged nebula-graph.io/restart-ordinal="8"`
4. Watch the operator log and check the storaged leader distribution, e.g. with `SHOW HOSTS`.

Expected behavior

After the restart, leaders remain balanced.
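The issue does not define "balanced" precisely. As one possible reading, the sketch below treats a distribution as balanced when the per-host leader counts differ by at most a given tolerance. The function `leadersBalanced`, the tolerance parameter, and the host names are assumptions for illustration; the map stands in for the leader counts reported by `SHOW HOSTS`.

```go
package main

import "fmt"

// leadersBalanced reports whether the highest and lowest per-host leader
// counts differ by at most tolerance. This definition of "balanced" is an
// assumption, not something the issue or the operator specifies.
func leadersBalanced(leaders map[string]int, tolerance int) bool {
	if len(leaders) == 0 {
		return true
	}
	first := true
	lo, hi := 0, 0
	for _, n := range leaders {
		if first {
			lo, hi, first = n, n, false
			continue
		}
		if n < lo {
			lo = n
		}
		if n > hi {
			hi = n
		}
	}
	return hi-lo <= tolerance
}

func main() {
	// Hypothetical leader counts for three storaged hosts.
	even := map[string]int{"storaged-0": 4, "storaged-1": 3, "storaged-2": 3}
	skewed := map[string]int{"storaged-0": 7, "storaged-1": 3, "storaged-2": 0}

	fmt.Println(leadersBalanced(even, 1))
	fmt.Println(leadersBalanced(skewed, 1))
}
```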

jinyingsunny commented 9 months ago

Addendum: with 3 zones and only 1 storaged in each zone, after restarting one storaged, balance leader completed in both spaces, but the final leader distribution is still uneven. image

Logs of the related balance operations: image

jinyingsunny commented 9 months ago

Checked with snap-1.37; the problem still occurs.

jinyingsunny commented 9 months ago

Discussed offline: for now, remind users to run balance data by hand; optimize later.

jinyingsunny commented 9 months ago

@abby-cyber

MegaByte875 commented 6 months ago

https://github.com/vesoft-inc/nebula-operator/pull/507