opensearch-project / opensearch-k8s-operator

OpenSearch Kubernetes Operator
Apache License 2.0

OpenSearch K8s Operator not restarting the bootstrap, coordinators, nodes and master pods after upgrading the version #849

Open rameshar16 opened 3 months ago

rameshar16 commented 3 months ago

What is the bug?

The OpenSearch K8s Operator does not restart the bootstrap, coordinators, nodes and master pods after the version is upgraded.

How can one reproduce the bug?

Change the version from 2.4.0 to 2.6.0 and run the helm upgrade to deploy the new version.

What is the expected behavior?

The OpenSearch K8s Operator should restart the bootstrap, coordinators, nodes and master pods after the version is upgraded.

What is your host/environment?

An AWS EKS cluster with the OpenSearch K8s Operator deployed. The OpenSearch cluster was deployed using the Helm chart.

Do you have any screenshots?

No

Do you have any additional context?

No

rameshar16 commented 3 months ago

>helm diff upgrade opensearch-cr ./charts/opensearch-cluster/ --values ./charts/opensearch-cluster/values.yaml -n opensearch
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: opensearch-cr
  namespace: opensearch
spec:
  general:
-   version: 2.4.0

rameshar16 commented 3 months ago

>k get po -n opensearch -w
NAME                                                      READY   STATUS    RESTARTS   AGE
opensearch-cr-coordinators-0                              1/1     Running   0          17h
opensearch-cr-coordinators-1                              1/1     Running   0          16h
opensearch-cr-coordinators-2                              1/1     Running   0          16h
opensearch-cr-dashboards-54d6ccb67c-nw9sp                 1/1     Running   0          17h
opensearch-cr-masters-0                                   1/1     Running   0          15m
opensearch-cr-masters-1                                   1/1     Running   0          16h
opensearch-cr-masters-2                                   1/1     Running   0          16h
opensearch-cr-nodes-0                                     1/1     Running   0          26m
opensearch-cr-nodes-1                                     1/1     Running   0          16h
opensearch-cr-nodes-2                                     1/1     Running   0          16h
opensearch-operator-controller-manager-68f76ffd94-knsw4   2/2     Running   0          13m

prudhvigodithi commented 3 months ago

[Triage] Adding @swoehrl-mw to please verify this. I assume we have seen a similar issue in the past and it was fixed with PR https://github.com/opensearch-project/opensearch-k8s-operator/pull/789, right? Thank you.

prudhvigodithi commented 3 months ago

Also @rameshar16, I assume you are using the latest version of the operator. Can you please confirm? Thank you.

swoehrl-mw commented 2 months ago

@prudhvigodithi This does not look like the parallel recovery bug.

@rameshar16 Please check the OpenSearchCluster CR status (kubectl describe opensearchcluster opensearch-cr) to see what the operator is reporting. Also, please verify that your OpenSearch cluster is healthy (green status): it looks like the first pod of each pool was restarted but the operator is not continuing, which points towards the cluster not being green.
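
For the health check, here is a minimal Python sketch, not the operator's own logic. It assumes the standard `_cluster/health` REST endpoint and its `status` field. The function names and the base URL are hypothetical. Authentication and TLS settings are left out, though a cluster with the security plugin enabled would need them.

```python
import json
import urllib.request


def cluster_is_green(health: dict) -> bool:
    """Return True only when the health document reports status 'green'."""
    return health.get("status") == "green"


def fetch_cluster_health(base_url: str) -> dict:
    """Fetch GET <base_url>/_cluster/health and parse the JSON body.

    base_url is a placeholder (for example a port-forwarded service);
    credentials and certificate handling are intentionally omitted.
    """
    with urllib.request.urlopen(f"{base_url}/_cluster/health") as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Illustrative health document, not taken from the reporter's cluster.
    sample = json.loads('{"status": "yellow", "unassigned_shards": 2}')
    if cluster_is_green(sample):
        print("cluster is green")
    else:
        print("cluster is not green:", sample.get("status"))
```

If the status comes back yellow or red, fields such as `unassigned_shards` in the same response are a reasonable place to look next.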