dhiaayachi / temporal

Temporal service
https://docs.temporal.io
MIT License

Add support to create Elasticsearch rolling indexes in visibility #241

Open dhiaayachi opened 2 months ago

dhiaayachi commented 2 months ago

I would like to keep an Elasticsearch cluster performant, manageable, and affordable by having the ES visibility storage provider roll over to a new index name periodically. For example, per-month index names might be {namespace}-202301, {namespace}-202302, etc., and per-year index names might be {namespace}-2023, {namespace}-2024. A configurable, user-defined pattern that allows custom field values in target index names would also be a nice addition. There should then be a means to search across a range of indexes (with the default range being configurable, or perhaps all indexes).

This would allow searches to target indexes by date range (or by custom field, etc.), making them much more efficient. It would also allow older content to be migrated automatically to cheaper ES tiers, or archived.
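To make the proposed naming scheme concrete, here is a minimal sketch of how per-month and per-year index names could be derived, and how a date range maps to the set of indexes a search would need to target. The function names and the granularity parameter are hypothetical; nothing like this exists in Temporal today.

```python
from datetime import date


def index_name(namespace: str, day: date, granularity: str = "month") -> str:
    """Return the rolling index name that would hold data for `day`."""
    if granularity == "month":
        return f"{namespace}-{day:%Y%m}"
    if granularity == "year":
        return f"{namespace}-{day:%Y}"
    raise ValueError(f"unsupported granularity: {granularity!r}")


def indexes_for_range(namespace: str, start: date, end: date,
                      granularity: str = "month") -> list[str]:
    """List every index name touched by the inclusive range [start, end]."""
    if start > end:
        raise ValueError("start must not be after end")
    names: list[str] = []
    year, month = start.year, start.month
    # Walk the range one calendar month at a time.
    while (year, month) <= (end.year, end.month):
        name = index_name(namespace, date(year, month, 1), granularity)
        # With yearly granularity, consecutive months share one index name.
        if not names or names[-1] != name:
            names.append(name)
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return names
```

A search for November 2022 through February 2023 would then touch four monthly indexes, or two yearly ones, instead of the whole data set.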

Describe alternatives you've considered

I am not aware of any real alternative.

dhiaayachi commented 1 month ago

Thank you for your feature request. This is a great idea and it would be helpful for users to be able to periodically roll over visibility storage to new indexes in Elasticsearch. This would help to keep the cluster performant, manageable, and affordable. You could create custom Search Attributes based on a date field and query by date range, but this might be inefficient for large workloads.

We have a feature request open for this: https://github.com/temporalio/temporal/issues/4047. We appreciate your feedback and will consider adding this feature in the future.

dhiaayachi commented 1 month ago

Thank you for the feature request! While we don't currently support automatically rolling over Elasticsearch indices, you can use Dual Visibility with Elasticsearch to create and manage multiple indices, giving you the desired level of control.

Here's a possible approach:

  1. Set up a primary Elasticsearch index: This will be your initial index for storing Workflow Executions.
  2. Create additional secondary indices: Configure your Elasticsearch cluster to create new indices based on your desired roll-over criteria (e.g., monthly, yearly).
  3. Utilize Dual Visibility: Use Dual Visibility to write to both the primary and secondary indices.
  4. Control the flow of data: Use your Temporal configuration to control how data is written to the secondary index.
  5. Manage index lifecycle: Manually roll over the primary index to a new secondary index once the desired retention period is reached.

You can also use the Elasticsearch API to perform automated tasks. This would allow you to roll over indices and manage their lifecycle more effectively.
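As a rough illustration of that kind of automation, the sketch below builds request bodies for two Elasticsearch endpoints: the rollover API and an index lifecycle management (ILM) policy. It only constructs the requests and does not send them. The alias and policy names are hypothetical, and whether Temporal's visibility store can write through a rollover alias is an assumption that this thread does not confirm.

```python
import json


def rollover_request(alias: str, max_age: str = "30d") -> tuple[str, str, dict]:
    """Build (method, path, body) for a conditional rollover of `alias`."""
    body = {"conditions": {"max_age": max_age}}
    return "POST", f"/{alias}/_rollover", body


def ilm_policy_request(policy: str, rollover_age: str = "30d",
                       delete_after: str = "365d") -> tuple[str, str, dict]:
    """Build (method, path, body) for an ILM policy that rolls over, then deletes."""
    body = {
        "policy": {
            "phases": {
                # Hot phase: start a new index once the current one is old enough.
                "hot": {"actions": {"rollover": {"max_age": rollover_age}}},
                # Delete phase: drop indexes that have aged out of retention.
                "delete": {"min_age": delete_after, "actions": {"delete": {}}},
            }
        }
    }
    return "PUT", f"/_ilm/policy/{policy}", body


if __name__ == "__main__":
    # Hypothetical names, printed for inspection only.
    for method, path, body in (
        rollover_request("temporal-visibility-write"),
        ilm_policy_request("temporal-visibility-policy"),
    ):
        print(method, path)
        print(json.dumps(body, indent=2))
```

The requests could be sent with any HTTP client or pasted into the Kibana console. Moving older indexes to cheaper tiers would need additional phases in the policy.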

We appreciate your feedback and will consider adding the capability to automatically roll over indices in a future release. Please let us know if you have any other questions.

dhiaayachi commented 1 month ago

Thank you for your feature request. This is an interesting idea that could greatly improve the performance, manageability, and affordability of Elasticsearch clusters used as Visibility stores.

While this functionality is not currently available, you can use the temporal operator search-attribute create command to create custom Search Attributes that include date information in their name. This allows you to use Temporal's List API to filter Workflow Executions based on date ranges. For example, you could create custom Search Attributes called workflows-202301, workflows-202302, etc., and use the List API to retrieve Workflow Executions that match a specific date range. You can further refine the search by filtering on the built-in StartTime and CloseTime attributes in the List Filter to achieve date-based filtering.

This workaround may not be as efficient as having Temporal automatically roll indexes, but it does provide a way to filter Workflow Executions based on date. For more information on using the List API and custom Search Attributes, please refer to the Visibility documentation.
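For the date-range filtering mentioned above, a List Filter on the built-in StartTime attribute might be built as in this sketch. The helper name is hypothetical; the BETWEEN ... AND operator and the StartTime attribute come from Temporal's List Filter syntax.

```python
from datetime import datetime, timezone


def start_time_between(start: datetime, end: datetime) -> str:
    """Build a List Filter selecting executions started within [start, end]."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("use timezone-aware datetimes")
    if start > end:
        raise ValueError("start must not be after end")

    def rfc3339(moment: datetime) -> str:
        # Normalize to UTC and drop sub-second precision.
        return moment.astimezone(timezone.utc).isoformat(timespec="seconds")

    return f"StartTime BETWEEN '{rfc3339(start)}' AND '{rfc3339(end)}'"
```

The resulting string can be passed as the query of a List call, for example through the --query flag of temporal workflow list.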

dhiaayachi commented 1 month ago

Thank you for the feature request!

Currently, Temporal doesn't support rolling indexes in the Elasticsearch visibility store, but it is possible to achieve a similar effect by configuring a new Elasticsearch index for each desired time period (e.g., per month, per year) and then using the Temporal List API to query across these indexes.

As an example, you could create a dedicated index for each month using a naming convention like temporal_visibility_{namespace}-{year}{month} and then change the visibility index name in the Temporal Elasticsearch configuration (the indices.visibility setting) to point to the specific index when querying.

This would allow you to maintain and search data across a range of indexes.
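If the per-month indexes are queried directly in Elasticsearch rather than through Temporal, several of them can be addressed in one request by listing them comma-separated or by using a wildcard. The sketch below builds such search paths for the naming convention described above. The helper names are hypothetical, and this bypasses Temporal's own List API.

```python
def monthly_index(namespace: str, year: int, month: int) -> str:
    """Name of the index for one month, following the convention above."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return f"temporal_visibility_{namespace}-{year:04d}{month:02d}"


def year_wildcard(namespace: str, year: int) -> str:
    """Wildcard pattern matching every monthly index of one year."""
    return f"temporal_visibility_{namespace}-{year:04d}*"


def search_path(targets: list[str]) -> str:
    """Elasticsearch search path covering all given index names or patterns."""
    if not targets:
        raise ValueError("at least one index or pattern is required")
    return "/" + ",".join(targets) + "/_search"
```

For example, search_path([monthly_index("default", 2023, 1), monthly_index("default", 2023, 2)]) targets January and February 2023 only, while search_path([year_wildcard("default", 2023)]) covers the whole year.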

Let us know if you have any further questions.