opensearch-project / opensearch-migrations

Migrate, upgrade, compare, and replicate OpenSearch clusters with ease.
https://aws.amazon.com/solutions/implementations/migration-assistant-for-amazon-opensearch-service/
Apache License 2.0

Auto-scaling Up and Down for Live Capture and Backfill #1072

Open sumobrian opened 1 month ago

Is your feature request related to a problem?

Currently, OpenSearch-Migrations does not natively support autoscaling for key components such as the proxy, the replayer, and reindex-from-snapshot (RFS). Large migrations need these components to absorb changes in load efficiently and dynamically. Autoscaling is a common requirement for enterprise workloads, and its absence may hinder community adoption and limit flexibility for developers.

What solution would you like?

We would like to introduce autoscaling support for the proxy, replayer, and RFS components in the OpenSearch-Migrations project. This feature will allow these components to scale dynamically based on load, ensuring better performance, resource utilization, and developer efficiency. One of the core tenets of this repository is to support local deployments to ensure community adoption, flexibility, and enhanced developer productivity.

To achieve this, we propose implementing this feature using Kubernetes, which naturally aligns with our existing containerized solution. Kubernetes will manage the scaling of these components in response to demand, providing a more flexible, scalable, and modern solution for users performing migrations at various scales.
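To make the proposal concrete, here is a minimal sketch of the replica calculation that the Kubernetes Horizontal Pod Autoscaler documents: `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The function name, the replica bounds, and the example metric values are assumptions for illustration; nothing here is taken from the OpenSearch-Migrations codebase, and the metric that would drive each component (CPU, replay lag, pending shards) is still an open design question.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the documented HPA scaling rule for one metric.

    All parameter names and defaults other than the 0.1 tolerance are
    hypothetical and chosen for this example.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Within tolerance of the target, keep the current replica count
    # (still clamped to the configured bounds).
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Example: 4 replayer pods averaging 90% CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
```

In a real deployment this logic would not be reimplemented; a `HorizontalPodAutoscaler` resource per component would apply it, and the sketch only shows how a chosen target value translates into pod counts.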

What alternatives have you considered?

  1. Manual Scaling: Manually adjusting resources based on load. This is inefficient and error-prone, particularly for larger-scale migrations or environments that require consistent performance. In particular, we have to provide clear guidance on how to scale reindex-from-snapshot based on the source cluster's sharding scheme and the size of the target cluster.

  2. Custom Scripts: Writing custom scripts to handle scaling of individual components. While this could work, it introduces complexity and potential maintenance overhead that could be avoided by leveraging Kubernetes' built-in autoscaling features.

Do you have any additional context?

Autoscaling will ensure that the migration tools can handle increased traffic, particularly for larger migrations where demand may fluctuate. In addition, it will remove much of the cognitive load from users, who currently have to understand how to scale up and down to perform a migration or replicate data as quickly and efficiently as possible. This feature will not only make the tool more attractive for enterprise use cases but also improve the developer experience by providing flexibility for testing and local development. Kubernetes is a natural evolution from our current containerized setup, allowing us to integrate this feature while keeping the project modern and scalable.