fabric8io / fluent-plugin-kubernetes_metadata_filter

Enrich your fluentd events with Kubernetes metadata
Apache License 2.0

Metadata is not being attached to log output #296

Closed: rstoermer closed this issue 1 year ago

rstoermer commented 3 years ago

With the following ConfigMap, logs are correctly sent as JSON to Elasticsearch, but a kubernetes field is never attached.

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: {{ .Values.namespace }}
data:
  fluent.conf: |-
    ################################################################
    <label @FLUENT_LOG>
    <match fluent.**>
        @type null
        @id ignore_fluent_logs
    </match>
    </label>

    # This source gets all logs from local docker host
    @include in.k8.container.orchestrator.conf
    @include out.elastic.conf
  in.k8.container.orchestrator.conf: |-
    <source>
        @type tail
        path /var/log/containers/*orchestrator*.log
        pos_file /var/log/k8.container.orchestrator.pos
        tag kubernetes.container.orchestrator
        read_from_head true
        <parse>
            @type kubernetes
            @type "#{ENV['FLUENT_CONTAINER_TAIL_PARSER_TYPE'] || 'json'}"
        </parse>
    </source>

    <filter kubernetes.container.orchestrator>
        @type parser
        reserve_data true
        key_name log
        remove_key_name_field true
        keep_time_key true
        format json
    </filter>

    <filter kubernetes.container.orchestrator>
        @type record_transformer
        enable_ruby
        renew_time_key time
        <record>
            time ${record['time'] / 1000}
        </record>
    </filter>

  out.elastic.conf: |-
    <filter kubernetes.**>
        @type kubernetes_metadata
    </filter>

    <match **>
        @type elasticsearch
        host "#{ENV['FLUENT_ELASTICSEARCH_HOST'] || 'elasticsearch.elastic-kibana'}"
        port "#{ENV['FLUENT_ELASTICSEARCH_PORT'] || '9200'}"
        type_name fluentd
        include_tag_key true
        tag_key @log_name
        logstash_format true
        logstash_dateformat %Y.%m
    </match>

As soon as I add the metadata plugin to my config, though, the following output is generated periodically:

2021-06-15 14:56:24 +0000 [info]: #0 stats - namespace_cache_size: 12, pod_cache_size: 88, pod_cache_watch_misses: 17, pod_cache_watch_ignored: 4, pod_cache_watch_delete_ignored: 4, pod_cache_watch_updates: 12, pod_cache_host_updates: 88, namespace_cache_host_updates: 12

jcantrill commented 3 years ago

With the following ConfigMap, logs are correctly sent as JSON to Elasticsearch, but a kubernetes field is never attached.

in.k8.container.orchestrator.conf: |-

    @type tail
    path /var/log/containers/*orchestrator*.log

Do you have a sample of the full file path? That is the information used to query the API server; if the file name does not match this regex, it is not possible to get the metadata: https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/blob/master/lib/fluent/plugin/filter_kubernetes_metadata.rb#L56
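
Roughly, and only as an illustration of the linked pattern rather than a copy of it, the tag needs to carry the container log file's name:

# illustration only: the shape of a tag the default regexp can parse
#
#   <prefix>.var.log.containers.<pod_name>_<namespace>_<container_name>-<container_id>.log
#
# pod_name and namespace pulled from that name are what the filter uses to
# query the API server; a fixed tag carries neither, so nothing can be looked up.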

As soon as I add the metadata plugin to my config, though, the following output is generated periodically:

2021-06-15 14:56:24 +0000 [info]: #0 stats - namespace_cache_size: 12, pod_cache_size: 88, pod_cache_watch_misses: 17, pod_cache_watch_ignored: 4, pod_cache_watch_delete_ignored: 4, pod_cache_watch_updates: 12, pod_cache_host_updates: 88, namespace_cache_host_updates: 12

This is the stats info from the metadata plugin; you can modify the interval as advised in https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/issues/236

rstoermer commented 3 years ago

Do you have a sample of the full file path? That is the information used to query the API server; if the file name does not match this regex, it is not possible to get the metadata: https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/blob/master/lib/fluent/plugin/filter_kubernetes_metadata.rb#L56

Sure thing, the full path is /var/log/containers/orchestrator-deployment-569b4956f9-dqrrp_nextor-dev_orchestrator-d24cd3e56ce0579c82e8fbf29b2308001bdc0805fe4cddf339ba97dfa5491892.log. The running pod, as shown by kubectl get pods, is orchestrator-deployment-569b4956f9-dqrrp.

Agasper commented 3 years ago

Just wanted to post a similar issue; I spent two days figuring it out. If the source tag is constant, i.e. in your case "tag kubernetes.container.orchestrator", it doesn't work, but if I change it to "tag kubernetes.json.*" it works.
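
A minimal sketch of that change applied to the source block from the first post, everything else assumed unchanged: with a wildcard, in_tail substitutes the tailed file's path (with "/" replaced by ".") for the "*", so the pod name and namespace from the filename end up in the tag.

<source>
  @type tail
  path /var/log/containers/*orchestrator*.log
  pos_file /var/log/k8.container.orchestrator.pos
  # "*" is expanded by in_tail to the tailed file's path with "/" replaced by ".",
  # e.g. kubernetes.var.log.containers.<pod>_<namespace>_<container>-<id>.log,
  # which is the shape the kubernetes_metadata filter's default regexp expects
  tag kubernetes.*
  read_from_head true
  <parse>
    @type "#{ENV['FLUENT_CONTAINER_TAIL_PARSER_TYPE'] || 'json'}"
  </parse>
</source>

# downstream sections keyed on the old fixed tag then need a wildcard match too
<filter kubernetes.**>
  @type parser
  reserve_data true
  key_name log
  remove_key_name_field true
  keep_time_key true
  format json
</filter>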

joenzx commented 3 years ago

I ran into the same problem: kubernetes_metadata only takes effect when the tag is a path-derived wildcard such as kube.*. With a constant tag it does not:

kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-config
  namespace: fluentd
data:
  system.conf: |-
    <system>
      root_dir /tmp/fluentd-buffers/
    </system>
  containers.input.conf: |-
    <source>
      @id etcd.log
      @type tail
      path /var/log/containers/etcd*.log
      pos_file /home/fluend_ok/etcd.log.pos
      read_from_head true
      tag kube-etcd
      <parse>
        @type json
        time_type string
        time_format %Y-%m-%dT%H:%M:%S.%NZ
        keep_time_key true
      </parse>
    </source>
    <filter **>
      @id filter_kubernetes_metadata
      @type kubernetes_metadata
    </filter>

Output with the constant tag (no kubernetes field is added):

2021-07-23 06:46:39.497063660 +0800 kube-etcd: {"log":"2021-07-23 06:46:39.496902 I | mvcc: finished scheduled compaction at 17755366 (took 15.249023ms)\n","stream":"stderr","time":"2021-07-23T06:46:39.49706366Z"}

With the tag derived from the file path (tag kube.*) it works:

kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-config
  namespace: fluentd
data:
  system.conf: |-
    <system>
      root_dir /tmp/fluentd-buffers/
    </system>
  containers.input.conf: |-
    <source>
      @id etcd.log
      @type tail
      path /var/log/containers/etcd*.log
      pos_file /home/fluend_ok/etcd.log.pos
      read_from_head true
      tag kube.*
      <parse>
        @type json
        time_type string
        time_format %Y-%m-%dT%H:%M:%S.%NZ
        keep_time_key true
      </parse>
    </source>
    <filter **>
      @id filter_kubernetes_metadata
      @type kubernetes_metadata
    </filter>
    <match **>
      @type stdout
    </match>

Output with the wildcard tag (kubernetes field attached):

2021-07-14 01:51:18.308188239 +0800 kube.var.log.containers.etcd-dev-server-1_kube-system_etcd-a83594dd4d26644d6d02506e2b816226795f7367495faefca899f4d9cad602c6.log: {"log":"2021-07-14 01:51:18.308021 I | mvcc: store.index: compact 15731708\n","stream":"stderr","time":"2021-07-14T01:51:18.308188239Z","docker":{},"kubernetes":{"container_name":"etcd","namespace_name":"kube-system","pod_name":"etcd-dev-server-1","container_image":"etcd:3.4.3-0","pod_ip":"10.10.0.10","host":"dev-server-1","labels":{"component":"etcd","tier":"control-plane"},"namespace_labels":{"field_cattle_io/projectId":"p"}}}

Why does the tag affect kubernetes_metadata? @jcantrill

jcantrill commented 3 years ago

Why does the tag affect kubernetes_metadata?

The plugin cannot magically determine pod information from an arbitrary tag. In the Kubernetes case the tag is the file name as found on the node, and that file name includes the namespace, pod name, and container. The pod name and namespace are used to query the API server for the pod's labels; without that information there is no conceivable way to retrieve the metadata.
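
Concretely, for the path reported earlier in this issue, the split below is only an illustration of that filename layout, not output from the plugin:

# /var/log/containers/orchestrator-deployment-569b4956f9-dqrrp_nextor-dev_orchestrator-d24cd3e56ce0579c82e8fbf29b2308001bdc0805fe4cddf339ba97dfa5491892.log
#
#   pod name       orchestrator-deployment-569b4956f9-dqrrp   -> used for the API server lookup
#   namespace      nextor-dev                                 -> used for the API server lookup
#   container      orchestrator
#   container id   d24cd3e56ce0579c82e8fbf29b2308001bdc0805fe4cddf339ba97dfa5491892
#
# The fixed tag "kubernetes.container.orchestrator" contains none of these,
# so the filter has nothing to extract.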

jcantrill commented 3 years ago

I finally see the issue you are describing. This is a configuration error that probably needs to be addressed in either documentation or a configuration option. The plugin has no way of getting the parameters needed to query the API server unless it is told how, and currently that is only possible by regexing the info out of the tag, which must contain the namespace and pod name at a minimum. I encourage you to open a PR with a solution for your use case.
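
For anyone whose tag cannot simply carry the file path: the plugin also exposes a tag_to_kubernetes_name_regexp parameter (worth checking against the README for your version) to describe how those fields appear in your tag. A rough sketch, with the pattern itself only illustrative; whatever pattern is used has to produce the named captures the plugin expects (pod_name, namespace, container_name, docker_id):

<filter kubernetes.**>
  @type kubernetes_metadata
  # illustrative pattern only: adjust it to your tag layout, keeping the named captures
  tag_to_kubernetes_name_regexp 'var\.log\.containers\.(?<pod_name>[^_]+)_(?<namespace>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$'
</filter>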

jcantrill commented 1 year ago

Closing, given the plugin relies on a regex of the tag to get the info for querying the API server.