opensearch-project / opensearch-migrations

Migrate, upgrade, compare, and replicate OpenSearch clusters with ease.
https://aws.amazon.com/solutions/implementations/migration-assistant-for-amazon-opensearch-service/
Apache License 2.0

[FEATURE] - Transformation framework to flexibly support migration and replication paths #1073

Closed by sumobrian 1 month ago

sumobrian commented 1 month ago

Is your feature request related to a problem?

In complex migration paths, such as Elasticsearch 6.8 to OpenSearch 2.x, some features supported in older Elasticsearch versions (for example, multiple types per index) are no longer supported in OpenSearch 2.x, so data must be transformed into a compatible format during migration. We provide a transformation that splits types into separate indices, but users may want to apply their own custom transformations instead. The same applies to metadata migrations, where users repeatedly implement, test, and update workflows as they make changes. Currently, there is no easy way to support or configure these transformations. This feature will also apply to post-fork versions of Elasticsearch and to other data stores and engines.

What solution would you like?

We would like to implement a transformation framework that supports user-defined transformations across live capture, metadata migrations, and existing data migrations. This framework would allow users to perform migration hops that were previously not possible. For example, in the Elasticsearch 6.8 to OpenSearch 2.x migration path, users might want to apply custom transformations to handle the conversion of multiple types per index into an OpenSearch 2.x-compatible format. Additionally, the framework should support JSON transformation functionality, allowing users to configure metadata migrations and add them to a suite of transformations for reuse.
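
To make the request more concrete, here is a minimal sketch of what a configurable, reusable suite of JSON transformations could look like. It is only an illustration under stated assumptions, not the project's design: the registry, the `build_pipeline` helper, the step names (`rename_field`, `drop_setting`), and the config shape are all hypothetical.

```python
from typing import Any, Callable, Dict, List

JsonDoc = Dict[str, Any]
Transform = Callable[[JsonDoc], JsonDoc]

# Hypothetical registry: maps a step name used in config to a factory
# that builds the actual transformation function.
REGISTRY: Dict[str, Callable[..., Transform]] = {}


def register(name: str) -> Callable[[Callable[..., Transform]], Callable[..., Transform]]:
    """Decorator that adds a transformation factory to the registry."""
    def decorator(factory: Callable[..., Transform]) -> Callable[..., Transform]:
        REGISTRY[name] = factory
        return factory
    return decorator


@register("rename_field")
def rename_field(old: str, new: str) -> Transform:
    """Rename a top-level key, leaving the input document untouched."""
    def apply(doc: JsonDoc) -> JsonDoc:
        out = dict(doc)
        if old in out:
            out[new] = out.pop(old)
        return out
    return apply


@register("drop_setting")
def drop_setting(name: str) -> Transform:
    """Remove one key from the top-level "settings" object, if present."""
    def apply(doc: JsonDoc) -> JsonDoc:
        out = dict(doc)
        if "settings" in out:
            settings = dict(out["settings"])
            settings.pop(name, None)
            out["settings"] = settings
        return out
    return apply


def build_pipeline(config: List[Dict[str, Any]]) -> Transform:
    """Turn a list of {"type": ..., "args": {...}} steps into one function."""
    steps = [REGISTRY[step["type"]](**step.get("args", {})) for step in config]

    def run(doc: JsonDoc) -> JsonDoc:
        for step in steps:
            doc = step(doc)
        return doc
    return run


# Example config; "legacy_option", "old_name", and "new_name" are placeholders.
config = [
    {"type": "drop_setting", "args": {"name": "legacy_option"}},
    {"type": "rename_field", "args": {"old": "old_name", "new": "new_name"}},
]
pipeline = build_pipeline(config)
```

Because each step is a plain function from JSON to JSON, the same pipeline could in principle be applied to index metadata, existing documents, or captured live requests, and a user-defined step is just one more entry in the registry.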

What alternatives have you considered?

While we provide an out-of-the-box transformation for splitting types into separate indices, this may not cover all user needs. Users may want more control over the transformation process, particularly for specific use cases, or to handle metadata migration workflows more flexibly. Without a configurable transformation framework, users would have to manually implement these transformations, increasing complexity and effort.

Do you have any additional context?

One example that demonstrates the value of this framework is a migration from Elasticsearch 6.8 to OpenSearch 2.x. An Elasticsearch 6.x cluster can contain indices created in Elasticsearch 5.x, which supported multiple types per index. A user performing this migration needs to transform that data into a format compatible with OpenSearch 2.x. While we provide a transformation that splits types into separate indices, a user might prefer to create their own. The framework is also particularly valuable for metadata migrations, where users repeatedly implement, test, and update workflows. It would allow for customizable JSON transformations, which users can add to a broader suite of transformations to streamline their migration process.
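
As a rough illustration of the type-splitting case described above, the sketch below turns a multi-type index definition (mappings keyed by type name, as in Elasticsearch 5.x) into one typeless index definition per type. The function names and the `<index>_<type>` naming scheme are assumptions made for this example, not the behavior of the existing out-of-the-box transformation.

```python
import copy
from typing import Any, Dict, Tuple


def split_index_by_type(index_name: str, index_body: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Return one single-type index definition per mapping type.

    Assumes the source mappings are keyed by type name and that each
    target index is named "<index>_<type>" (a hypothetical convention).
    """
    settings = index_body.get("settings", {})
    result: Dict[str, Dict[str, Any]] = {}
    for type_name, type_mapping in index_body.get("mappings", {}).items():
        target = f"{index_name}_{type_name}"
        result[target] = {
            # Every target index gets its own copy of the source settings.
            "settings": copy.deepcopy(settings),
            # The per-type mapping body becomes the typeless mapping.
            "mappings": copy.deepcopy(type_mapping),
        }
    return result


def target_for_document(index_name: str, doc_type: str, doc_id: str) -> Tuple[str, str]:
    """Route a document to the index created for its type, keeping its ID."""
    return f"{index_name}_{doc_type}", doc_id


source = {
    "settings": {"number_of_shards": 1},
    "mappings": {
        "user": {"properties": {"name": {"type": "text"}}},
        "tweet": {"properties": {"body": {"type": "text"}}},
    },
}
targets = split_index_by_type("social", source)
```

A user who prefers a different outcome, such as merging all types into one index with an extra field recording the original type, would swap in their own function with the same input and output shape.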

sumobrian commented 1 month ago

Closing and replacing with #1090