opensearch-project / index-management

🗃 Automate periodic data operations, such as deleting indices at a certain age or performing a rollover at a certain size
https://opensearch.org/docs/latest/im-plugin/index/
Apache License 2.0

[FEATURE] Add `min_primary_shard_doc_count` rollover parameter #1124

Open Jakob3xD opened 8 months ago

Jakob3xD commented 8 months ago

Is your feature request related to a problem? Most of the time I aim for a shard size of about 50 GB. Unfortunately, this is a problem for tools like Jaeger, which push a lot of small documents: a shard reaches the Lucene hard limit on document count before it reaches that size. The result is an index that accepts no more data.

What solution would you like? I would like a `min_primary_shard_doc_count` parameter for rollover, to prevent indices from running into the Lucene hard limit. For example, an index has 3 shards and `min_primary_shard_doc_count` is set to 2_000_000_000. If one of the primary shards reaches that value, the index gets rolled over.
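A minimal sketch of the requested condition, not an existing implementation: `should_rollover` and its arguments are hypothetical names, and the per-shard document counts are assumed to be already available (for example from index stats). The constant reflects Lucene's `IndexWriter.MAX_DOCS`, which applies per shard because each shard is a single Lucene index.

```python
from typing import Sequence

# Lucene's per-index document cap (IndexWriter.MAX_DOCS = Integer.MAX_VALUE - 128).
# Each shard is one Lucene index, so the cap applies per shard.
LUCENE_MAX_DOCS = 2**31 - 1 - 128


def should_rollover(primary_doc_counts: Sequence[int],
                    min_primary_shard_doc_count: int) -> bool:
    """Hypothetical check: True once any primary shard reaches the threshold."""
    return max(primary_doc_counts, default=0) >= min_primary_shard_doc_count


# Three primary shards, threshold taken from the example above.
threshold = 2_000_000_000
print(should_rollover([1_200_000_000, 1_950_000_000, 1_100_000_000], threshold))  # False
print(should_rollover([1_200_000_000, 2_000_000_000, 1_100_000_000], threshold))  # True
```

The check looks at the largest primary shard rather than the index total, so the same threshold works for any shard count as long as it stays below `LUCENE_MAX_DOCS`.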

What alternatives have you considered? The alternative would be to create a policy with a lower min_primary_shard_size, so the index rolls over before the doc limit is reached. min_doc_count does not help here, because it counts documents across the whole index and does not scale with the number of shards.

Do you have any additional context? Personally, I would use this option in combination with min_primary_shard_size, so I never run into the issue of having a shard that is too large, and I can scale the number of shards as needed without creating a policy per index.

dblock commented 5 months ago

Catch All Triage - 1 2 3 4 5

Jakob3xD commented 2 months ago

@dblock can we move the issue to opensearch-project/OpenSearch? The request is actually a feature request for the /_rollover API, which would later need to be supported by ISM. I don't know if it needs a new triage afterwards.

I still don't understand how nobody else seems to have this issue. Or is everyone using a workaround by creating a policy for each shard count?
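To illustrate why that workaround needs one policy per shard count, here is a rough sketch. The helper name `min_doc_count_for` is hypothetical, and the calculation assumes documents are spread evenly across primary shards, which is not guaranteed.

```python
def min_doc_count_for(primary_shards: int, per_shard_doc_limit: int) -> int:
    """Index-wide min_doc_count that approximates a per-shard limit.

    Assumes documents are evenly distributed across primary shards.
    """
    return primary_shards * per_shard_doc_limit


per_shard_limit = 2_000_000_000

# Every shard count needs its own index-wide threshold, hence its own policy.
for shards in (1, 3, 5):
    print(shards, min_doc_count_for(shards, per_shard_limit))

# With a skewed distribution the index-wide total can stay below the
# threshold even though one shard is already past the per-shard limit.
counts = [2_100_000_000, 1_000_000_000, 1_000_000_000]
print(sum(counts) >= min_doc_count_for(len(counts), per_shard_limit))  # False
print(max(counts) >= per_shard_limit)                                  # True
```

The last two lines show the gap: a total-based condition does not fire for the skewed index, while a per-primary-shard condition would.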