Open alantang888 opened 8 months ago
Thank you for your contribution. Sorry for the late reply, but is there any chance you could provide logs for this issue or a minimal reproducer (reproducing this in AWS might be hard)?
cc @idanl21 @dbason @swoehrl-mw @prudhvigodithi @jochenkressin @pchmielnik
I just built a lab environment. After the cluster was green and all nodes were running, I modified spec.general.additionalConfig (in this case, indices.query.bool.max_clause_count) to trigger a rolling restart of the cluster.
I started capturing logs before applying the change and stopped once all nodes had restarted and the cluster had returned to green. Log attached: opensearch-operator.txt
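For illustration, the change described above might look roughly like the fragment below. This is only a sketch: the value "2048" is a placeholder assumption (the actual value used is not stated), and the rest of the cluster spec is omitted.

```yaml
# Fragment of an OpenSearchCluster resource (other fields omitted).
spec:
  general:
    additionalConfig:
      # Changing any value here is what triggered the rolling restart.
      # "2048" is a placeholder, not the value used in the lab.
      indices.query.bool.max_clause_count: "2048"
```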
What is the bug?
I have allocation awareness configured across different AWS AZs. I set up 3 data node pools, each with its AZ name in node.attr.az, e.g. node pool data-a with us-east-1a, node pool data-b with us-east-1b, node pool data-d with us-east-1d...
When I change some cluster config, all data node pools restart at the same time. This causes the cluster status to turn red.
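A minimal sketch of the node pool layout described above, for readers who want to picture it. The pool names and AZ values come from the description; the replica counts, the roles, and the awareness setting under general.additionalConfig are assumptions added for illustration and may differ from the real cluster.

```yaml
# Fragment of an OpenSearchCluster resource (other fields omitted).
spec:
  general:
    additionalConfig:
      # Assumed: tells the cluster to spread shard copies across "az" values.
      cluster.routing.allocation.awareness.attributes: az
  nodePools:
    - component: data-a
      replicas: 1          # assumed pod count per pool
      roles: ["data"]      # assumed
      additionalConfig:
        node.attr.az: us-east-1a
    - component: data-b
      replicas: 1          # assumed
      roles: ["data"]      # assumed
      additionalConfig:
        node.attr.az: us-east-1b
    - component: data-d
      replicas: 1          # assumed
      roles: ["data"]      # assumed
      additionalConfig:
        node.attr.az: us-east-1d
```

With a layout like this, each shard's primary and replica land in different AZs, so restarting one pool at a time should only take the cluster to yellow, while restarting all pools together takes every copy offline.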
How can one reproduce the bug?
Have a cluster with 3 data node pools, each in one AZ, with 1 replica, and with node.attr. set. When the cluster status is green, make a change to the cluster that triggers a rolling restart.
What is the expected behavior?
The docs say: "The Operator will then perform a rolling upgrade and restart the nodes one-by-one, waiting after each node for the cluster to stabilize and have a green cluster status." So I expect the node pools to restart one by one. E.g. a restart is triggered on data-a, and the cluster status turns yellow. When the cluster is back to green, the operator restarts the other pods in data-a (if any). After that, it restarts data-b, and repeats until all data node pools have been restarted. (The restart order of the node pools is not important.)
What is your host/environment?
EKS 1.28, OpenSearch Operator 2.5.1, OpenSearch 2.8.0
Do you have any screenshots?
Do you have any additional context?
This is my cluster config (dashboard section removed, as it should not be related).