uken / fluent-plugin-elasticsearch

Apache License 2.0
[out_elasticsearch] Keep generating new index template when ILM enabled and using logstash style indices #790

Open scotthsieh17 opened 4 years ago

scotthsieh17 commented 4 years ago

Problem

With the following config, index rollover works correctly, but the plugin keeps generating new index templates named in the format logstash-{date}:


    enable_ilm true
    ilm_policy_id logstash_policy
    template_name logstash_template
    template_file /fluentd/etc/ilm/logstash_template.json
    template_overwrite true
    logstash_prefix logstash

If I switch to the following config, I get an ILM exception: illegal_argument_exception: index [logstash-default-20200806] is not the write index for alias [logstash]

    rollover_index true
    deflector_alias logstash

I'm not a Ruby expert, but I dug into the code at these locations:

https://github.com/uken/fluent-plugin-elasticsearch/blob/9976147cb422caa8b58d7a6118146dd00f2e44f8/lib/fluent/plugin/out_elasticsearch.rb#L247

https://github.com/uken/fluent-plugin-elasticsearch/blob/9976147cb422caa8b58d7a6118146dd00f2e44f8/lib/fluent/plugin/elasticsearch_index_template.rb#L69

https://github.com/uken/fluent-plugin-elasticsearch/blob/9976147cb422caa8b58d7a6118146dd00f2e44f8/lib/fluent/plugin/elasticsearch_index_template.rb#L119

It looks like when logstash_format true and enable_ilm true are both set, the plugin uses index_name as the template name and creates a new template. I'm not sure I'm reading it correctly, so I'm looking for help here.
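To make the suspected behavior concrete, here is a minimal Ruby sketch. It is not the plugin's code: `template_name_for` is a hypothetical helper, and the assumption (taken from the reading of the source above, not confirmed) is that the template name falls back to the date-stamped index name. Under that assumption, every new day produces a new template name, while a fixed `template_name` would keep a single template.

```ruby
require "date"

# Hypothetical helper (not part of fluent-plugin-elasticsearch): pick the
# template name from the configuration if given, else from the index name.
def template_name_for(index_name, configured_name = nil)
  configured_name || index_name
end

# Three consecutive days of logstash-style index names.
start = Date.new(2020, 8, 6)
index_names = (0..2).map { |i| "logstash-#{(start + i).strftime('%Y.%m.%d')}" }

# Naming the template after the index gives one distinct template per day.
per_index_templates = index_names.map { |name| template_name_for(name) }.uniq

# Reusing the configured name gives a single template for all days.
fixed_templates = index_names.map { |name| template_name_for(name, "logstash_template") }.uniq

puts "templates when named after the index: #{per_index_templates.size}"
puts "templates when using template_name:   #{fixed_templates.size}"
```

Whether the plugin can actually reuse the configured name depends on how it ties each template to its rollover alias, which this sketch does not model.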

...

Steps to replicate

Manually created index template: logstash_template

    {
      "index": {
        "mapping": {
          "total_fields": {
            "limit": "2000000"
          }
        },
        "refresh_interval": "60s",
        "translog": {
          "durability": "async"
        },
        "max_inner_result_window": "1000",
        "query": {
          "default_field": "*"
        },
        "max_result_window": "200000",
        "requests": {
          "cache": {
            "enable": "true"
          }
        },
        "number_of_replicas": "0",
        "lifecycle": {
          "name": "logstash_policy",
          "rollover_alias": "logstash"
        },
        "codec": "best_compression",
        "routing": {
          "allocation": {
            "require": {
              "data": "hot"
            }
          }
        },
        "number_of_shards": "10",
        "max_rescore_window": "200000",
        "max_docvalue_fields_search": "50000"
      }
    }

Use the fluent.conf shown above.

Expected Behavior or What you need to ask

Reuse the same index template when ILM is enabled and indices roll over.

...

Using Fluentd and ES plugin versions

reddare commented 4 years ago

A similar problem. With logstash-{date}, a new template is created every day.

cosmo0920 commented 4 years ago

This is because logstash indices are calculated from here: https://github.com/uken/fluent-plugin-elasticsearch/blob/master/lib/fluent/plugin/out_elasticsearch.rb#L848

As far as I know, these values cannot be derived statically from the original logstash_prefix, logstash_dateformat, and the other logstash-related parameters. They are calculated at runtime, so the current implementation is a pessimistic one.

If you have a more suitable solution, please send a patch. Thanks!!
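As an illustration of the runtime calculation mentioned above, the following Ruby sketch builds a logstash-style index name from a record's timestamp. It is an assumption-laden simplification, not the plugin's implementation: `logstash_index_name` is a hypothetical helper, the defaults for the separator and date format are assumed, and UTC is used for the date.

```ruby
# Hypothetical helper (not the plugin's API): build a logstash-style index
# name from the prefix and the record timestamp, using UTC for the date.
def logstash_index_name(prefix, epoch_seconds, separator: "-", dateformat: "%Y.%m.%d")
  "#{prefix}#{separator}#{Time.at(epoch_seconds).utc.strftime(dateformat)}"
end

# Two records one second apart, on either side of midnight UTC.
before_midnight = Time.utc(2020, 8, 6, 23, 59, 59).to_i
after_midnight  = Time.utc(2020, 8, 7, 0, 0, 0).to_i

name_a = logstash_index_name("logstash", before_midnight)
name_b = logstash_index_name("logstash", after_midnight)

puts name_a
puts name_b
```

Because the name depends on each record's timestamp, the set of target indices is only known while events are being written, which is consistent with the point that it cannot be fixed ahead of time from the configuration alone.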

cosmo0920 commented 4 years ago

Simply using <logstash_prefix>-<application_name>-* for index_patterns is not a solution, because the template also needs to know the rollover_alias association for the original indices.

scotthsieh17 commented 4 years ago

Is it possible to use the "template_name" we gave in fluentd.conf when rolling over to a new index?

cosmo0920 commented 4 years ago

> Is it possible to use the "template_name" we gave in fluentd.conf when rolling over to a new index?

No.

maxisam commented 3 years ago

Would removing rollover from the ILM policy bypass this issue? Currently the plugin still creates a rollover alias even when I disable rollover. To be honest, this basically makes the template and lifecycle settings in the config useless.

cosmo0920 commented 3 years ago

The use case of a logstash index template with ILM should be replaced with out_elasticsearch_data_stream. The ElasticsearchDataStream output plugin reference is here: https://github.com/uken/fluent-plugin-elasticsearch#configuration---elasticsearch-output-data-stream

cosmo0920 commented 3 years ago

Elasticsearch's data stream is also managed with ILM and index_templates: https://github.com/uken/fluent-plugin-elasticsearch/blob/master/lib/fluent/plugin/out_elasticsearch_data_stream.rb#L39-L41

cosmo0920 commented 3 years ago

Currently, the ElasticsearchDataStream output plugin doesn't handle user-defined index templates. It is not a mature plugin yet. Your feedback would be much appreciated.

maxisam commented 3 years ago

@cosmo0920 thanks for the quick response. Don't get me wrong, your work is awesome! And thanks for pointing me in the right direction.

cosmo0920 commented 3 years ago

I suspect that logstash-format indices with ILM enabled are a troublesome combination. Thanks for pinging this issue again. :smile: