opensearch-project / helm-charts

A community repository for Helm Charts of OpenSearch Project.
https://opensearch.org/docs/latest/opensearch/install/helm/
Apache License 2.0

[BUG] [OpenSearch-2.20.0] [ Scaling Challenges With Statefulsets ] #550

Open 1d9akash opened 3 months ago

1d9akash commented 3 months ago

Describe the bug

When deploying OpenSearch with the Helm chart using separate nodeGroup configurations for the "master" and "data" roles (2 replicas each), I have to create Persistent Volumes (PVs) manually to match the Persistent Volume Claims (PVCs) that the chart generates. The setup uses Amazon EFS as the storage backend, with a pre-existing StorageClass in the cluster. Each PV must be created with specific labels so that the PVCs for the StatefulSet pods (os-master-0, os-master-1, os-data-0, os-data-1) bind correctly. This manual step adds complexity, especially when scaling the deployment up or down. I am looking for a way to simplify scaling, either by letting all replicas share a single volume or by automating PV and PVC creation and binding during scaling operations.
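To illustrate why every added replica currently needs another hand-made PV, here is a minimal Python sketch of how Kubernetes names the claims a StatefulSet creates: `<volumeClaimTemplate name>-<StatefulSet name>-<ordinal>`. The helper name `statefulset_pvc_names` is hypothetical, and the sketch assumes the claim template is named after the StatefulSet itself; the template name the chart actually renders may differ.

```python
from typing import List


def statefulset_pvc_names(claim_template: str, statefulset: str, replicas: int) -> List[str]:
    """Return the PVC names Kubernetes creates for a StatefulSet.

    Each claim is named <volumeClaimTemplate name>-<pod name>, and each
    pod is named <StatefulSet name>-<ordinal>, with ordinals from 0.
    """
    return [f"{claim_template}-{statefulset}-{i}" for i in range(replicas)]


if __name__ == "__main__":
    # Assumption: the claim template shares the StatefulSet's name.
    for group in ("os-master", "os-data"):
        for name in statefulset_pvc_names(group, group, replicas=2):
            print(name)
```

With static provisioning, each name in this list needs a matching PV to exist before the pod can start, so scaling from 2 to 3 replicas means creating one more PV per node group by hand.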

To Reproduce

Steps to reproduce the behavior are not applicable as the issue relates to deployment and scaling infrastructure setup.

Expected behavior

The expected solution would automate volume management, removing the need to create PVs manually when scaling the OpenSearch deployment. Ideally, scaling up or down would handle PV and PVC provisioning and binding automatically, simplifying the management of StatefulSets in Kubernetes.

Chart Name

OpenSearch Helm Chart Version: 20.04

Host/Environment

Additional context

The current setup uses Amazon EFS as the persistent storage solution, with a StorageClass already defined in the Kubernetes cluster. Manually creating and labeling PVs to match PVCs is tedious and error-prone, especially in dynamic scaling scenarios. Any guidance on automating this process, or on configuring the deployment to use shared volumes (if feasible), would be greatly appreciated.
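For reference, one direction that might fit is dynamic provisioning through the AWS EFS CSI driver, where a StorageClass with `provisioningMode: efs-ap` creates an EFS access point per PVC instead of relying on pre-created PVs. The Python sketch below only builds such a manifest as JSON (which `kubectl apply -f` accepts). It assumes the EFS CSI driver is installed in the cluster; the class name `efs-dynamic`, the file system ID `fs-EXAMPLE`, and the helper `efs_storage_class` are placeholders, not values from this setup.

```python
import json


def efs_storage_class(name: str, file_system_id: str) -> dict:
    """Build a StorageClass manifest for EFS CSI dynamic provisioning."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "efs.csi.aws.com",
        "parameters": {
            # "efs-ap" asks the driver to create one access point per PVC.
            "provisioningMode": "efs-ap",
            "fileSystemId": file_system_id,
            # Permissions for the directory created behind each access point.
            "directoryPerms": "700",
        },
    }


if __name__ == "__main__":
    # Placeholder values; substitute the real file system ID.
    print(json.dumps(efs_storage_class("efs-dynamic", "fs-EXAMPLE"), indent=2))
```

If this approach applies, the chart's persistence settings for each node group would presumably reference this class by name so that new replicas get their volumes provisioned on demand; I have not verified this against the chart.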

dblock commented 3 months ago

Thanks for opening this.

[Catch All Triage - Attendees 1, 2, 3, 4, 5]