elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Lazy data stream rollover is not triggered when using reroute #112781

Open axw opened 6 days ago

axw commented 6 days ago

Elasticsearch Version

8.15.1

Installed Plugins

No response

Java Version

bundled

OS Version

N/A

Problem Description

Lazy rollover on a data stream is not triggered when writing a document that is rerouted to another data stream. This affects the apm-data plugin, where we perform a lazy rollover of matching data streams when installing or updating index templates. If every write to such a data stream is rerouted, the data stream never rolls over. See https://github.com/elastic/apm-server/issues/14060#issuecomment-2344837717

Should a write that leads to a reroute also trigger the lazy rollover? I think so; otherwise the default pipeline will never change.

Steps to Reproduce

  1. Create an ingest pipeline with a reroute processor, and an index template which sets it as the default ingest pipeline
PUT /_ingest/pipeline/demo-reroute
{
  "processors": [
    {
      "reroute": {"namespace": "foo"}
    }
  ]
}

PUT /_index_template/demo_1
{
  "index_patterns" : ["demo*"],
  "data_stream": {}, 
  "priority" : 1,
  "template": {
    "settings" : {
      "number_of_shards": 1,
      "index.default_pipeline": "demo-reroute"
    }
  }
}
  2. Create a data stream matching the index template
PUT /_data_stream/demo-dataset-default
  3. Send a document to the data stream; it will be rerouted
POST /demo-dataset-default/_doc
{
  "@timestamp": "2024-09-12"
}

{
  "_index": ".ds-demo-dataset-foo-2024.09.12-000001",
  "_id": "z2Ab5JEBCHevSrCVP7aG",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
  4. Create another index template with the same index pattern and a higher priority, but with no default ingest pipeline
PUT /_index_template/demo_2
{
  "index_patterns" : ["demo*"],
  "data_stream": {}, 
  "priority" : 2
}
  5. Roll over the source data stream with "lazy=true"
POST /demo-dataset-default/_rollover?lazy=true
  6. Send a document to the data stream; it will still be rerouted
POST /demo-dataset-default/_doc
{
  "@timestamp": "2024-09-12"
}

{
  "_index": ".ds-demo-dataset-foo-2024.09.12-000001",
  "_id": "x2gc5JEBfAEizTaQVStE",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_seq_no": 1,
  "_primary_term": 1
}
  7. Roll over the source data stream with "lazy=false"
POST /demo-dataset-default/_rollover?lazy=false
  8. Send a document to the data stream; it will not be rerouted
POST /demo-dataset-default/_doc
{
  "@timestamp": "2024-09-12"
}

{
  "_index": ".ds-demo-dataset-default-2024.09.12-000002",
  "_id": "1mAf5JEBCHevSrCVc7YV",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
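
To check the three indexing responses above without reading the JSON by eye, a minimal Python sketch could compare the data stream in each `_index` with the data stream that was written to. It assumes only the usual backing index naming, `.ds-<data-stream>-<yyyy.MM.dd>-<generation>`; the helper names `parse_backing_index` and `was_rerouted` are hypothetical and not part of Elasticsearch or any client library.

```python
import re

# Assumption: backing indices are named .ds-<data-stream>-<yyyy.MM.dd>-<generation>.
BACKING_INDEX = re.compile(
    r"^\.ds-(?P<stream>.+)-(?P<date>\d{4}\.\d{2}\.\d{2})-(?P<gen>\d+)$"
)


def parse_backing_index(index_name):
    """Return (data stream name, generation) for a backing index name."""
    match = BACKING_INDEX.match(index_name)
    if match is None:
        raise ValueError(f"not a data stream backing index: {index_name!r}")
    return match["stream"], int(match["gen"])


def was_rerouted(target_stream, response):
    """True if the document landed in a data stream other than the one written to."""
    stream, _ = parse_backing_index(response["_index"])
    return stream != target_stream


# The `_index` values copied from the three responses in this report.
responses = {
    "initial write": {"_index": ".ds-demo-dataset-foo-2024.09.12-000001"},
    "after lazy=true": {"_index": ".ds-demo-dataset-foo-2024.09.12-000001"},
    "after lazy=false": {"_index": ".ds-demo-dataset-default-2024.09.12-000002"},
}

for label, response in responses.items():
    stream, generation = parse_backing_index(response["_index"])
    rerouted = was_rerouted("demo-dataset-default", response)
    print(f"{label}: stream={stream} generation={generation} rerouted={rerouted}")
```

If the lazy rollover had been triggered by the rerouted write, the second response would be expected to look like the third: not rerouted, and in generation 2 of demo-dataset-default.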

Logs (if relevant)

No response

elasticsearchmachine commented 4 days ago

Pinging @elastic/es-data-management (Team:Data Management)