fabric8io / fluent-plugin-kubernetes_metadata_filter

Enrich your fluentd events with Kubernetes metadata
Apache License 2.0
350 stars 166 forks source link

fluent-plugin-kubernetes_metadata_filter, a plugin for Fluentd

Circle CI Code Climate Test Coverage Ruby Style Guide Ruby Style Guide

The Kubernetes metadata plugin filter enriches container log records with pod and namespace metadata.

This plugin derives basic metadata about the container that emitted a given log record using the source of the log record. Records from kubernetes containers encode metadata about the container in the file name. The initial metadata derived from the source is used to lookup additional metadata about the container's associated pod and namespace (e.g. UUIDs, labels, annotations) when the kubernetes_url is configured. If the plugin cannot authoritatively determine the namespace of the container emitting a log record, it will use an 'orphan' namespace ID in the metadata. This behaviors supports multi-tenant systems that rely on the authenticity of the namespace for proper log isolation.

Requirements

fluent-plugin-kubernetes_metadata_filter fluentd ruby
>= 2.10.0 >= v1.10.0 >= 2.6
>= 2.5.0 >= v1.10.0 >= 2.5
>= 2.0.0 >= v0.14.20 >= 2.1
< 2.0.0 >= v0.12.0 >= 1.9

NOTE: For v0.12 version, you should use 1.x.y version. Please send patch into v0.12 branch if you encountered 1.x version's bug.

NOTE: This documentation is for fluent-plugin-kubernetes_metadata_filter-plugin-elasticsearch 2.x or later. For 1.x documentation, please see v0.12 branch.

Installation

gem install fluent-plugin-kubernetes_metadata_filter

Configuration

Configuration options for fluent.conf are:

Reading from a JSON formatted log files with in_tail and wildcard filenames while respecting the CRI-o log format with the same config you need the fluent-plugin "multi-format-parser":

fluent-gem install fluent-plugin-multi-format-parser

The config block could look like this:

<source>
  @type tail
  path /var/log/containers/*.log
  pos_file fluentd-docker.pos
  read_from_head true
  tag kubernetes.*
  <parse>
    @type multi_format
    <pattern>
      format json
      time_key time
      time_type string
      time_format "%Y-%m-%dT%H:%M:%S.%NZ"
      keep_time_key false
    </pattern>
    <pattern>
      format regexp
      expression /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/
      time_format '%Y-%m-%dT%H:%M:%S.%N%:z'
      keep_time_key false
    </pattern>
  </parse>
</source>

<filter kubernetes.var.log.containers.**.log>
  @type kubernetes_metadata
</filter>

<match **>
  @type stdout
</match>

Environment variables for Kubernetes

If the name of the Kubernetes node the plugin is running on is set as an environment variable with the name K8S_NODE_NAME, it will reduce cache misses and needless calls to the Kubernetes API.

In the Kubernetes container definition, this is easily accomplished by:

env:
- name: K8S_NODE_NAME
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName

Example input/output

Kubernetes creates symlinks to Docker log files in /var/log/containers/*.log. Docker logs in JSON format.

Assuming following inputs are coming from a log file named /var/log/containers/fabric8-console-controller-98rqc_default_fabric8-console-container-df14e0d5ae4c07284fa636d739c8fc2e6b52bc344658de7d3f08c36a2e804115.log:

{
  "log": "2015/05/05 19:54:41 \n",
  "stream": "stderr",
  "time": "2015-05-05T19:54:41.240447294Z"
}

Then output becomes as belows

{
  "log": "2015/05/05 19:54:41 \n",
  "stream": "stderr",
  "docker": {
    "container_id": "df14e0d5ae4c07284fa636d739c8fc2e6b52bc344658de7d3f08c36a2e804115",
  }
  "kubernetes": {
    "host": "jimmi-redhat.localnet",
    "pod_name":"fabric8-console-controller-98rqc",
    "pod_id": "c76927af-f563-11e4-b32d-54ee7527188d",
    "pod_ip": "172.17.0.8",
    "container_name": "fabric8-console-container",
    "namespace_name": "default",
    "namespace_id": "23437884-8e08-4d95-850b-e94378c9b2fd",
    "namespace_annotations": {
      "fabric8.io/git-commit": "5e1116f63df0bac2a80bdae2ebdc563577bbdf3c"
    },
    "namespace_labels": {
      "product_version": "v1.0.0"
    },
    "labels": {
      "component": "fabric8Console"
    }
  }
}

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Test it (GEM_HOME=vendor bundle install; GEM_HOME=vendor bundle exec rake test)
  5. Push to the branch (git push origin my-new-feature)
  6. Create new Pull Request

Copyright

Copyright (c) 2015 jimmidyson