opensearch-project / opensearch-k8s-operator

OpenSearch Kubernetes Operator
Apache License 2.0

[BUG] Status does not reflect status of generated workloads #780

Open siegenthalerroger opened 2 months ago

siegenthalerroger commented 2 months ago

What is the bug?

The operator did not reliably inform me that the OpenSearch cluster was unhealthy. The OS statefulset was misconfigured and therefore never became ready, yet the OpenSearchCluster CR never reflected this state. Because I was monitoring the Pods directly for issues, I never realised that the problem was in fact with the statefulset config (a misconfiguration I had caused myself).

How can one reproduce the bug?

  1. Create an OS CR with a misconfiguration of the statefulset, e.g. accessMode: ReadWriteOnce instead of accessModes: [ReadWriteOnce].
  2. Deploy the CR
  3. Watch the bootstrap, dashboard and initJob Pods start, log errors and restart endlessly.
  4. Check the Status of the CR, where there is no indication of the fault or its origin.

What is the expected behavior?

I would expect the status of the OS CR to reflect the error state of the dependent statefulset. I don't expect the operator to sanity check the statefulset configuration, but surfacing the errors of generated resources on the CR would be very beneficial.
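To make the request concrete, here is a minimal Go sketch of how replica readiness of generated workloads could be folded into a single condition on the CR. It is not the operator's actual code: the `WorkloadStatus` and `Condition` types, the `SummarizeWorkloads` function, and the `WorkloadsReady` / `ReplicasNotReady` names are all hypothetical, and the sketch assumes the operator already has the desired and ready replica counts of each statefulset at hand.

```go
package main

import (
	"fmt"
	"strings"
)

// WorkloadStatus is a hypothetical, trimmed-down view of one generated
// statefulset: its name plus the desired and ready replica counts.
type WorkloadStatus struct {
	Name            string
	DesiredReplicas int32
	ReadyReplicas   int32
}

// Condition is a hypothetical status entry that could be written to the CR.
type Condition struct {
	Type    string
	Status  string // "True" or "False"
	Reason  string
	Message string
}

// SummarizeWorkloads reports "False" as soon as any workload has fewer ready
// replicas than desired, and names the affected workloads in the message.
func SummarizeWorkloads(workloads []WorkloadStatus) Condition {
	var notReady []string
	for _, w := range workloads {
		if w.ReadyReplicas < w.DesiredReplicas {
			notReady = append(notReady,
				fmt.Sprintf("%s (%d/%d ready)", w.Name, w.ReadyReplicas, w.DesiredReplicas))
		}
	}
	if len(notReady) == 0 {
		return Condition{Type: "WorkloadsReady", Status: "True", Reason: "AllReplicasReady"}
	}
	return Condition{
		Type:    "WorkloadsReady",
		Status:  "False",
		Reason:  "ReplicasNotReady",
		Message: "not ready: " + strings.Join(notReady, ", "),
	}
}
```

With something like this on the CR, a statefulset stuck at zero ready replicas would show up directly in the status of the OS CR, without the operator having to validate the statefulset configuration itself.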

What is your host/environment?

Azure Linux, AKS, OS Operator 2.5.1

dblock commented 1 week ago

Catch All Triage - 1 2 3 4 5 6