uken / fluent-plugin-elasticsearch


Issues using data streams #869

Open snemetz opened 3 years ago

snemetz commented 3 years ago

Problem

Trying to use ES data streams and failing

Steps to replicate

Fluentd config:

<match **>
  @type elasticsearch_data_stream
  data_stream_name testing-data-stream
  # rest of settings to connect to ES
</match>

Expected Behavior or What you need to ask

Expected to send data to an ES data stream. The template already exists on ES.

1) Got an error that xpack is needed. The docs need to be updated to state that elasticsearch-xpack must be installed.

2) After elasticsearch-xpack was installed, I received this error:

[error]: config error file="fluent.conf" error_class=Fluent::ConfigError error="Failed to create data stream: <testing-data-stream> undefined method `put_index_template' for #\nDid you mean?  put_template"

I have no settings in fluentd for managing templates; I expected it to just use the template already created on the ES server.

...

Using Fluentd and ES plugin versions

kenhys commented 3 years ago

Ah, this is because the API changed.

ref. https://github.com/elastic/elasticsearch-ruby/blob/master/CHANGELOG.md#7110

elasticsearch-api should be loaded instead.

As a workaround, use elasticsearch-api and elasticsearch-xpack 7.10.x.
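Concretely, that pinning might look like this (a sketch; the exact set of installed gems varies by environment, and 7.10.1 is the version the reporter confirms below):

# remove the 7.11+ client gems (the index template APIs moved in 7.11.0,
# which breaks the plugin's call into elasticsearch-xpack)
gem uninstall elasticsearch elasticsearch-api elasticsearch-transport elasticsearch-xpack

# pin the 7.10.x line instead
gem install elasticsearch -v 7.10.1
gem install elasticsearch-xpack -v 7.10.1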

snemetz commented 3 years ago

I uninstalled all elasticsearch gems and then installed 7.10.1 versions. It is working now.

One other question: we use @type elasticsearch_dynamic. Is there a dynamic version of elasticsearch_data_stream? If not, I will be happy to create a feature request for it.

kenhys commented 3 years ago

I've sent PR https://github.com/uken/fluent-plugin-elasticsearch/pull/870

kenhys commented 3 years ago

Is there a dynamic version of elasticsearch_data_stream?

No, but elasticsearch_data_stream supports the placeholder feature. Does that not work for you?
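For reference, a minimal sketch of what the placeholder feature looks like with this output (the stream name and connection settings here are illustrative, not from the thread; a placeholder only resolves when its key is listed as a buffer chunk key):

<match **>
  @type elasticsearch_data_stream
  host elasticsearch-master
  port 9200
  # ${tag} resolves per chunk because "tag" is declared as a chunk key below
  data_stream_name logs-${tag}
  <buffer tag>
    flush_interval 30s
  </buffer>
</match>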

snemetz commented 3 years ago

Currently my upstream fluent-bit agents create an index field and set it to the ES index the log should go to. The only way I figured out to have fluentd send the log to the specified index was to use elasticsearch_dynamic.

If there is another way, I'd be happy to hear it.
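One alternative worth noting, separate from what the thread suggests: the plain elasticsearch output type has a target_index_key option that reads the destination index from a field inside each record, which matches the fluent-bit setup described above. A hedged sketch, assuming the record field is literally named index:

<match **>
  @type elasticsearch
  host elasticsearch-master
  port 9200
  # take the index name from each record's "index" field
  # (the key is stripped from the record before it is sent)
  target_index_key index
</match>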

cosmo0920 commented 3 years ago

I've released #870 as v5.0.1.

lololozhkin commented 3 years ago

Is there a dynamic version of elasticsearch_data_stream?

No, but elasticsearch_data_stream supports the placeholder feature. Does that not work for you?

Hello, which version supports placeholders in the data stream name? I am not able to use this feature in version 5.1.0. Or could you please provide an example configuration that uses placeholders this way?

My configuration is here:

<label @OUTPUT>
  <match **>
    @type                elasticsearch_data_stream
    host                 elasticsearch-master
    port                 9200
    suppress_type_name   true

    data_stream_name     fluentd.${$.kubernetes.container_name}
    data_stream_ilm_name delete-after-14-days

    <buffer _index, $.kubernetes.container_name>
      flush_thread_count 5
      @type              file
      path               /var/log/fluentd/buffer/
      timekey            10m
      chunk_limit_size   10m
      total_limit_size   10g
      flush_mode         interval
      flush_interval     1m
      overflow_action    drop_oldest_chunk
      retry_type         exponential_backoff
      retry_wait         5s
      retry_max_interval 60s
      retry_randomize    true
      retry_forever      true
    </buffer>
  </match>
</label>

Using this configuration, fluentd creates a data stream literally named fluentd.${$.kubernetes.container_name}, which is not what I want.

martonorova commented 2 years ago

Hi @lololozhkin,

were you perhaps able to resolve your issue? I am struggling with the same thing

lololozhkin commented 2 years ago

were you perhaps able to resolve your issue? I am struggling with the same thing

Hi @martonorova, the problem was the version. Updating to a version higher than 5.1.0 should resolve your problem!
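In practice the upgrade is a one-liner (version number illustrative; any release above 5.1.0 per the comment above):

# fluent-gem ships with the official fluentd images; plain gem or td-agent-gem
# works the same way depending on how fluentd was installed
fluent-gem install fluent-plugin-elasticsearch -v 5.1.4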

marcin-brzozowski commented 2 years ago

Hi @lololozhkin, I'm also trying to make this work using the sample that you provided. I'm still observing the same behaviour despite running version 5.1.4 of the plugin, meaning fluentd creates a data stream named fluentd.${$.kubernetes.container_name}.

I'm using kibana / elasticsearch version 7.14.0.

I wonder if you could post some more information about how you made it work (full fluentd config, fluentd version, plugin version etc.)? Thank you!

Environment:

I'm running fluentd from this docker image: fluent/fluentd-kubernetes-daemonset:v1.14.3-debian-elasticsearch7-1.0 inside a Kubernetes cluster. When I open a shell in my container and check the plugin version, it says:

root@fluentd-2bnnh:/fluentd# gem list |grep -i fluent-plugin-elasticsearch
fluent-plugin-elasticsearch (5.1.4)

And this is the part of the config map that I'm using in my deployment:

fileConfigs:
  01_sources.conf: |-
    <source>
      @type tail
      @id in_tail_container_logs
      @label @KUBERNETES
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head true
      <parse>
        @type cri
      </parse>
      emit_unmatched_lines true
    </source>

  02_filters.conf: |-
    <label @KUBERNETES>
      <match kubernetes.var.log.containers.fluentd**>
        @type relabel
        @label @FLUENT_LOG
      </match>

      # <match kubernetes.var.log.containers.**_kube-system_**>
      #   @type null
      #   @id ignore_kube_system_logs
      # </match>

      <filter kubernetes.**>
        @type kubernetes_metadata
        @id filter_kube_metadata
        skip_labels false
        skip_container_metadata false
        skip_namespace_metadata false
        skip_master_url false
      </filter>

      <match **>
        @type relabel
        @label @DISPATCH
      </match>
    </label>

  03_dispatch.conf: |-
    <label @DISPATCH>
      <filter **>
        @type prometheus
        <metric>
          name fluentd_input_status_num_records_total
          type counter
          desc The total number of incoming records
          <labels>
            tag ${tag}
            hostname ${hostname}
          </labels>
        </metric>
      </filter>

      <match **>
        @type relabel
        @label @OUTPUT
      </match>
    </label>

  04_outputs.conf: |-
    <label @OUTPUT>
      <match **>
        @type                elasticsearch_data_stream
        host                 elasticsearch-master
        port                 9200
        suppress_type_name   true

        data_stream_name     fluentd.${$.kubernetes.container_name}
        data_stream_ilm_name foo
      </match>
    </label>

I've also pulled one of the log messages out to stdout, to make sure that the kubernetes labels are there, and indeed they are:

{
    "stream": "stdout",
    "logtag": "F",
    "message": "I removed this msg...",
    "time": "2022-01-18T12:59:48.0980824Z",
    "docker": {
        "container_id": "60d7f5efa369b90aa5d816f20db2099fa1aa11b65ee448ad649fd56992b80fc1"
    },
    "kubernetes": {
        "container_name": "kibana",
        "namespace_name": "default",
        "pod_name": "kibana-kibana-79855ccc65-qxx57",
        "container_image": "docker.elastic.co/kibana/kibana:7.14.0",
        "container_image_id": "docker.elastic.co/kibana/kibana@sha256:a1c80a2b22f6c9a93a089c8b983078d482e6dad5e693c64e84b491afd0e90f53",
        "pod_id": "6c2c78ed-3bf0-49be-9760-d135b65038ac",
        "pod_ip": "10.42.0.10",
        "host": "k3d-hello-server-0",
        "labels": {
            "app": "kibana",
            "pod-template-hash": "79855ccc65",
            "release": "kibana"
        },
        "master_url": "https://10.43.0.1:443/api",
        "namespace_id": "ecb2a711-800c-4a72-bfbe-a97bb0ebf936",
        "namespace_labels": {
            "kubernetes_io/metadata_name": "default"
        }
    }
}
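The thread stops here, but one detail in the 04_outputs.conf block above is worth flagging (standard fluentd placeholder behaviour, offered as a likely cause rather than a confirmed fix): a record-based placeholder such as ${$.kubernetes.container_name} is only substituted when that key is declared as a buffer chunk key, and this match block has no <buffer> section at all. The earlier working configuration in this thread did include one. A minimal sketch of the missing piece:

<label @OUTPUT>
  <match **>
    @type                elasticsearch_data_stream
    host                 elasticsearch-master
    port                 9200
    suppress_type_name   true

    data_stream_name     fluentd.${$.kubernetes.container_name}
    data_stream_ilm_name foo

    # declaring the record key as a chunk key lets the placeholder resolve
    <buffer $.kubernetes.container_name>
      flush_interval 1m
    </buffer>
  </match>
</label>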