opensearch-project / index-management

🗃 Automate periodic data operations, such as deleting indices at a certain age or performing a rollover at a certain size
https://opensearch.org/docs/latest/im-plugin/index/
Apache License 2.0

[Feature Request] Shrink action should work like the Rollover action with data stream indices #1201

Open disaster37 opened 4 months ago

disaster37 commented 4 months ago

Is your feature request related to a problem? Please describe

When we use the OpenSearch stack to ingest logs, it is normal to use data stream indices. Moreover, this use case typically calls for a hot / warm / delete architecture: we keep 24h of logs on hot nodes, then retain them for a longer period (30d) on warm nodes.

Because logs on warm nodes are rarely searched, we shrink the indices to avoid having too many shards on the warm nodes. So our ISM (Index State Management) policy has a shrink step in the warm phase. But this step does not work as expected with data stream indices. When it creates the new (shrunken) index, it breaks the lineage with the data stream: the new index is no longer part of the data stream. This also breaks all subsequent ISM steps, because the new index falls outside the current policy.
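To illustrate the lineage break described above, here is a minimal Python sketch. It assumes that backing indices are named `.ds-<data-stream>-<generation>` and that the shrink action names its target `<source>_shrunken`. The stream name `logs-app` and the helper functions are hypothetical, and nothing here calls an OpenSearch API.

```python
from fnmatch import fnmatch

# Assumed naming conventions (assumptions of this sketch, not confirmed in this issue):
#   backing index:  .ds-<data-stream>-<6-digit generation>
#   shrunken index: <source-index>_shrunken
ISM_PATTERN = "logs-*"  # index pattern taken from the ism_template of the policy


def backing_index(stream, generation):
    """Build the assumed backing index name for a data stream generation."""
    return f".ds-{stream}-{generation:06d}"


def shrunken_name(source_index):
    """Build the assumed name of the index produced by the shrink action."""
    return f"{source_index}_shrunken"


def in_data_stream(index, backing_indices):
    """An index belongs to the data stream only if it is one of its backing indices."""
    return index in backing_indices


stream = "logs-app"  # hypothetical data stream name
backing = [backing_index(stream, g) for g in (1, 2)]
target = shrunken_name(backing[0])

print(target)                           # name of the shrunken index
print(in_data_stream(target, backing))  # is it still part of the data stream?
print(fnmatch(target, ISM_PATTERN))     # does its name match the policy pattern?
```

Under these assumptions, the shrunken index is neither a backing index of the data stream nor matched by the `logs-*` pattern, which is why the remaining steps of the policy no longer apply to it.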

Describe the solution you'd like

We expect this ISM policy to work out of the box.

{
    "id": "policy-log",
    "policy": {
        "policy_id": "policy-log",
        "description": "Policy for logs index",
        "default_state": "hot",
        "states": [
            {
                "name": "hot",
                "actions": [
                    {
                        "rollover": {
                            "min_index_age": "1d",
                            "min_primary_shard_size": "50gb",
                            "copy_alias": false
                        }
                    }
                ],
                "transitions": [
                    {
                        "state_name": "warm"
                    }
                ]
            },
            {
                "name": "warm",
                "actions": [
                    {
                        "allocation": {
                            "require": {
                                "temp": "warm"
                            },
                            "include": {},
                            "exclude": {},
                            "wait_for": false
                        }
                    },
                    {
                        "index_priority": {
                            "priority": 50
                        }
                    },
                    {
                        "shrink": {
                            "percentage_of_source_shards": 0.5
                        }
                    },
                    {
                        "force_merge": {
                            "max_num_segments": 1
                        }
                    }
                ],
                "transitions": [
                    {
                        "state_name": "delete",
                        "conditions": {
                            "min_index_age": "30d"
                        }
                    }
                ]
            },
            {
                "name": "delete",
                "actions": [
                    {
                        "delete": {}
                    }
                ]
            }
        ],
        "ism_template": [
            {
                "index_patterns": [
                    "logs-*"
                ],
                "priority": 100
            }
        ]
    }
}

Intuitively, we would expect it to work like this:

It could be useful to orchestrate the steps within a phase in the best order. Maybe the best order is not the order provided by the user. Maybe it is better to start with force merge and finish with allocation... Or maybe force merge needs to run after the index has been shrunk and allocated to the right node?

Related component

Other

Describe alternatives you've considered

No response

Additional context

I think the shrink step should work like the rollover step with data stream indices. After looking at the code a little, I think the problem is not in the ISM plugin but in OpenSearch core, especially in the server/src/main/java/org/opensearch/action/admin/indices/shrink package.

andrross commented 4 months ago

[Triage - attendees 1 2 3 4 5]

Thanks for filing. This looks like a reasonable request. We will need the ability to add indexes to a data stream as a prerequisite (see opensearch-project/OpenSearch#8271).

vikasvb90 commented 4 months ago

@opensearch-project/admin Please transfer this issue to https://github.com/opensearch-project/index-management/ repo.

dblock commented 3 months ago

[Catch All Triage - 1, 2]

disaster37 commented 1 month ago

I have opened another issue on the main OpenSearch project, because the main part of the problem seems to be in OpenSearch core.

https://github.com/opensearch-project/OpenSearch/issues/16063