fluent / fluentd-kubernetes-daemonset

Fluentd daemonset for Kubernetes and its Docker image
Apache License 2.0

Support containerd log format #412

Open byrnedo opened 4 years ago

byrnedo commented 4 years ago

Hi, I'm running k3s with containerd instead of Docker. The log format is different from Docker's. AFAIK it would just involve changing the @type json to a regex for the container logs; see https://github.com/rancher/k3s/issues/356#issuecomment-485080611. Would anyone be up for doing this? Maybe with some kind of env var to switch on the containerd support, e.g. CONTAINER_RUNTIME=docker as the default, with containerd as an alternative.
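For context, a sketch of the difference (sample lines, not from a real cluster): Docker's json-file driver writes one JSON object per line, while containerd/CRI writes a plain-text line of timestamp, stream, tag, and message:

    {"log":"hello world\n","stream":"stdout","time":"2020-08-11T16:07:28.606198265Z"}   # Docker json-file
    2020-08-11T18:07:28.606198265+02:00 stdout F hello world                            # containerd/CRI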

arthurdarcet commented 4 years ago

you can add the env variable FLUENT_CONTAINER_TAIL_PARSER_TYPE with the value /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/ - it'll make this daemonset work with the containerd logs
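In the daemonset manifest, that would look something like this (a minimal sketch; the surrounding container spec is assumed):

    env:
    - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
      value: /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/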

strigazi commented 4 years ago

The above regex worked for me, thanks!

Could we make it work for both containerd and docker without setting the type?

faust64 commented 4 years ago

Hi,

It may look like it works, though, having dealt with OpenShift a lot lately: you're missing something. Eventually, you'll see log messages being split into several records.

I've had to patch the /fluentd/etc/kubernetes.conf file.

We could indeed set FLUENT_CONTAINER_TAIL_PARSER_TYPE to /^(?<time>.+) (?<stream>stdout|stderr) (?<logtag>[FP]) (?<log>.+)$/.

However we also need to add the following:

    <filter kubernetes.**>
      @type concat
      key log
      partial_key logtag
      partial_value P
      separator ""
    </filter>

Note that I'm setting a logtag field from the F and P values, which @arthurdarcet drops with a [^ ]*. We actually need those for re-constructing multi-line messages (P means you have a partial log, while F marks the final part of a message).
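To illustrate with hypothetical lines: a message the runtime splits arrives as one or more P records followed by an F record, and the concat filter above stitches the log fields back into a single event:

    2020-08-11T18:07:23.187045754+02:00 stdout P first chunk of a long line, 
    2020-08-11T18:07:23.187050001+02:00 stdout P middle chunk, 
    2020-08-11T18:07:23.187055002+02:00 stdout F final chunk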

vfolk30 commented 4 years ago

I have enabled Rancher logging with fluentd for containerd (https://rancher.com/docs/rancher/v2.x/en/cluster-admin/tools/logging/), but am still getting the issue. Below are the env variables I have set in the daemonset.

env:

output:

2020-08-11 16:07:29 +0000 [warn]: #0 pattern not matched: "2020-08-11T18:07:28.606198265+02:00 stdout F 2020-08-11 16:07:28 +0000 [warn]: #0 pattern not matched: \"2020-08-11T18:07:27.620512318+02:00 stdout F 2020-08-11 16:07:27 +0000 [warn]: #0 pattern not matched: \\\"2020-08-11T18:07:26.541424158+02:00 stdout F 2020-08-11 16:07:26 +0000 [warn]: #0 pattern not matched: \\\\\\\"2020-08-11T18:07:25.531461018+02:00 stdout F 2020-08-11 16:07:25 +0000 [warn]: #0 pattern not matched: \\\\\\\\\\\\\\\"2020-08-11T18:07:24.528268248+02:00 stdout F 2020-08-11 16:07:24 +0000 [warn]: #0 pattern not matched: \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"2020-08-11T18:07:23.524149263+02:00 stdout F 2020-08-11 16:07:23 +0000 [warn]: #0 pattern not matched: \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"2020-08-11T18:07:23.187045754+02:00 stdout F 2020-08-11 16:07:23.186 [INFO][57] int_dataplane.go 976: Finished applying updates to dataplane. msecToApply=1.434144\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"\\\\\\\\\\\\\\\"\\\\\\\"\\\"\""

arren-ru commented 4 years ago

@arthurdarcet @faust64 How is the regex string supposed to work in FLUENT_CONTAINER_TAIL_PARSER_TYPE if that variable is translated to the @type value in the parser configuration? kubernetes.conf inside the container contains:

…
<source>
  @type tail
  …
  <parse>
    @type "#{ENV['FLUENT_CONTAINER_TAIL_PARSER_TYPE'] || 'json'}"
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>
…

Allowed @types: https://docs.fluentd.org/configuration/parse-section#type

DarkBlaez commented 4 years ago

Maybe this should just be addressed with a flag. The issue has been present for a long time and impacts other vendors that build value-added products around this. Word to the wise: Docker is not the only front-end to containers, and container evolution continues. Addressing this properly now, instead of through sloppy regex workarounds and manipulation, would be a good thing. Better to get in front of the issue than lag behind.

DB

repeatedly commented 4 years ago

We can put an additional plugin into the plugins directory, e.g. https://github.com/fluent/fluentd-kubernetes-daemonset/tree/master/docker-image/v1.11/debian-elasticsearch7/plugins. So if anyone provides a containerd log format parser, we can configure it via FLUENT_CONTAINER_TAIL_PARSER_TYPE.

DarkBlaez commented 4 years ago

That would work, as I am willing to write a custom parser for this to contribute and save others the same issues. Re-phrased: that is perhaps the best option, an additional plugin for this specific use case.

faust64 commented 4 years ago

@arren-ru: you are right, my mistake. FLUENT_CONTAINER_TAIL_PARSER_TYPE should be set to regexp, and then you'd set an expression with your actual regexp.

Either way, that's not something you can currently configure using only environment variables. You're looking for something like this: https://github.com/faust64/kube-magic/blob/master/custom/roles/logging/templates/fluentd.j2#L31-L51

arren-ru commented 4 years ago

@faust64 I solved this by overriding kubernetes.conf with a configmap mounted in place of the original configuration, with modified content. This gives a basic working solution:

      <source>
        @type tail
        @id in_tail_container_logs
        path /var/log/containers/*.log
        pos_file /var/log/fluentd-containers.log.pos
        tag "#{ENV['FLUENT_CONTAINER_TAIL_TAG'] || 'kubernetes.*'}"
        exclude_path "#{ENV['FLUENT_CONTAINER_TAIL_EXCLUDE_PATH'] || use_default}"
        read_from_head true
        <parse>
          @type regexp
          expression /^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<flags>[^ ]+) (?<message>.*)$/
          time_format %Y-%m-%dT%H:%M:%S.%N%:z
        </parse>
      </source>

      <filter kubernetes.**>
        @type kubernetes_metadata
        @id filter_kube_metadata
        kubernetes_url "#{'https://' + ENV.fetch('KUBERNETES_SERVICE_HOST') + ':' + ENV.fetch('KUBERNETES_SERVICE_PORT') + '/api'}"
      </filter>

i300543 commented 3 years ago

The kind of solutions presented here will cause the json log to be parsed as a string, and no fields defined in the json itself will be recognized as Elasticsearch fields, correct?

arren-ru commented 3 years ago

> The kind of solutions presented here will cause the json log to be parsed as a string, and no fields defined in the json itself will be recognized as Elasticsearch fields, correct?

Not sure I understood you, but CRI logs are represented as string lines; these are not Docker logs. So if you want to parse json further, you may want to add a pipelined parser or filter, as sketched below.
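For example, a minimal sketch of such a pipelined filter (untested; it assumes the CRI regexp above already put the message into a log field, and it relies on the multi_format parser plugin shipped in these images to fall back to keeping the record as-is when the payload is not json):

    <filter kubernetes.**>
      @type parser
      key_name log
      reserve_data true
      remove_key_name_field true
      <parse>
        @type multi_format
        <pattern>
          format json
        </pattern>
        <pattern>
          format none
        </pattern>
      </parse>
    </filter>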

mickdewald commented 3 years ago

I got an issue where my logfile was filled with backslashes. I am using containerd instead of Docker. I solved it by putting in the following configuration:

- name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
  value: /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/

m-usmanayub commented 3 years ago

> I got an issue where my logfile was filled with backslashes. I am using containerd instead of Docker. I solved it by putting in the following configuration:
>
> - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
>   value: /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/

This did not work for me on 1.20.1 hosted on VMs. Still the same errors full of backslashes.

vipinjn24 commented 3 years ago

I am using containerd as the CRI for Kubernetes and used the FLUENT_CONTAINER_TAIL_PARSER_TYPE env var. The logs are now somewhat readable, but the time format is incorrect, so an error is shown for that.

Is there any solution to this problem, or can we change the time format via an env var?

vipinjn24 commented 3 years ago

Ok, got it on how to fix this one.

First, we know that we need to change the logging format, as containerd does not use the json format but a regular text format. So we add the environment variable below to the daemonset:

- name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
  value: /^(?<time>.+) (?<stream>stdout|stderr) (?<logtag>[FP]) (?<log>.*)$/

When we do this, it still shows an error with the time format. To solve this, we extract the kubernetes.conf file from a running fluentd container, copy its contents into a config map, and mount that over the kubernetes.conf location, i.e. /fluentd/etc/kubernetes.conf.
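A sketch of the extraction and ConfigMap creation (pod name and namespace are placeholders; adjust to your deployment):

    kubectl exec -n kube-system <fluentd-pod> -- cat /fluentd/etc/kubernetes.conf > kubernetes.conf
    kubectl create configmap fluentd-config -n kube-system --from-file=kubernetes.conf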

  volumeMounts:
  - name: fluentd-config
    mountPath: /fluentd/etc/kubernetes.conf
    subPath: kubernetes.conf
volumes:
- name: fluentd-config
  configMap:
    name: fluentd-config
    items:
    - key: kubernetes.conf
      path: kubernetes.conf

So, to fix the error, we update the following value inside the source:

<source>
      @type tail
      @id in_tail_container_logs
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag "#{ENV['FLUENT_CONTAINER_TAIL_TAG'] || 'kubernetes.*'}"
      exclude_path "#{ENV['FLUENT_CONTAINER_TAIL_EXCLUDE_PATH'] || use_default}"
      read_from_head true
      <parse>
        @type "#{ENV['FLUENT_CONTAINER_TAIL_PARSER_TYPE'] || 'json'}"
        time_format %Y-%m-%dT%H:%M:%S.%N%:z
      </parse>
</source>

That is, change time_format %Y-%m-%dT%H:%M:%S.%NZ to time_format %Y-%m-%dT%H:%M:%S.%N%:z.
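The change matters because the two runtimes emit different timestamp suffixes; for example (sample values):

    2020-08-11T16:07:28.606198265Z          # Docker json-file: literal Z, matched by %NZ
    2020-08-11T18:07:28.606198265+02:00     # containerd/CRI: numeric offset, matched by %N%:z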

Now deploy the daemonset; it will work.

cosmo0920 commented 3 years ago

I've published new images that include the parser: https://github.com/fluent/fluentd-kubernetes-daemonset/pull/521, https://github.com/fluent/fluentd-kubernetes-daemonset/commit/2736b688e405c9da4675eb4e23ade916d009ec53

With FLUENT_CONTAINER_TAIL_PARSER_TYPE, we can now specify the cri type parser for parsing CRI-format logs.

ref: https://github.com/fluent/fluentd-kubernetes-daemonset#use-cri-parser-for-containerdcri-o-logs
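If I read the linked README section correctly, enabling it should then just be a matter of pointing the env var at the new parser:

    env:
    - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
      value: cri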

hari819 commented 3 years ago

We are facing this issue with slashes "\\"; we use the v1.12-debian-elasticsearch7-1 version of the daemonset and are currently testing the workarounds mentioned in this issue.

We would like to know if there will be a newer version of the daemonset that fixes the issue, or whether we need to use the workarounds permanently.

Thanks,

optimus-kart commented 3 years ago

With BDRK-3386 is this issue fixed?

faust64 commented 3 years ago

From what I can see, there's still no way to concatenate partial logs coming from containerd or cri-o, nor to pass a regular expression when FLUENT_CONTAINER_TAIL_PARSER_TYPE is set to regexp.

containerd and cri-o require something such as the following, reconstructing logs split into multiple lines (partials):

    <filter kubernetes.**>
      @type concat
      key log
      partial_key logtag
      partial_value P
      separator ""
    </filter>

The filter above relies on a logtag field, defined as follows:

      <parse>
         @type regexp
         expression /^(?<logtime>.+) (?<stream>stdout|stderr) (?<logtag>[FP]) (?<log>.*)$/i
         time_key logtime
         time_format %Y-%m-%dT%H:%M:%S.%N%Z
      </parse>

I'm not sure how to make those filter blocks and regexps conditional, nor that we can come up with a configuration that suits both containerd/cri-o and Docker. You would also need to change some input sources (systemd units, the Docker log file). I've given up and written my own ConfigMap, based on the configuration shipped in this image, fixing the few bits I need.

huangzixun123 commented 2 years ago

> you can add the env variable FLUENT_CONTAINER_TAIL_PARSER_TYPE with the value /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/ - it'll make this daemonset work with the containerd logs

I was stuck on this question all day until I saw your answer! Love this answer and the author!

ethanhallb commented 2 years ago

As per the discussion and this change, make sure to turn off greedy parsing for the timestamp, e.g. change

^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$

to

^(?<time>.+?) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$

With greedy parsing, there's a chance of runaway logging (log errors caused by scraping log errors). Context:

https://github.com/fluent/fluent-bit/pull/5078 https://github.com/fluent/fluent-bit/commit/cf239c2194551bb31ebf42ae075eb847fc326ec6
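A hypothetical line shows why this matters: when the log payload itself contains a timestamp followed by stdout (e.g. fluentd re-ingesting its own warnings, as in the recursive output earlier in this thread), the greedy (?<time>.+) backtracks to the last possible match, while the lazy (?<time>.+?) stops at the first:

    # input line:
    2020-08-11T18:07:28Z stdout F 2020-08-11T18:07:27Z stdout F inner message

    # greedy (?<time>.+):  time = "2020-08-11T18:07:28Z stdout F 2020-08-11T18:07:27Z", an unparseable
    #                      time, so a new warning is logged, scraped, and the cycle repeats
    # lazy (?<time>.+?):   time = "2020-08-11T18:07:28Z", log = "2020-08-11T18:07:27Z stdout F inner message"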

nhnam6 commented 1 year ago

> Ok, got it on how to fix this one. […] Now deploy the daemonset; it will work.

Cool, it works well.

helloxk617 commented 1 year ago

> Ok, got it on how to fix this one. […] Now deploy the daemonset; it will work.

It works for me! You are so gorgeous, @vipinjn24.

maitza commented 1 year ago

I have a separate file outside kubernetes.conf, named tail_container_parse.conf, containing:

<parse>
  @type "#{ENV['FLUENT_CONTAINER_TAIL_PARSER_TYPE'] || 'json'}"
  time_format "#{ENV['FLUENT_CONTAINER_TAIL_PARSER_TIME_FORMAT'] || '%Y-%m-%dT%H:%M:%S.%NZ'}"
</parse>

Just using the env FLUENT_CONTAINER_TAIL_PARSER_TIME_FORMAT in the daemonset with the above time_format fixed the problem for me.

vipinjn24 commented 1 year ago

> I have a separate file outside kubernetes.conf, named tail_container_parse.conf, containing:
>
> <parse>
>   @type "#{ENV['FLUENT_CONTAINER_TAIL_PARSER_TYPE'] || 'json'}"
>   time_format "#{ENV['FLUENT_CONTAINER_TAIL_PARSER_TIME_FORMAT'] || '%Y-%m-%dT%H:%M:%S.%NZ'}"
> </parse>
>
> Just using the env FLUENT_CONTAINER_TAIL_PARSER_TIME_FORMAT in the daemonset with the above time_format fixed the problem for me.

Hmm, let me look at this one.

QkiZMR commented 1 year ago

After reading the whole thread and experimenting with different settings posted here, I managed to get fluentd working with OKD4.

- name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
  value: /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
- name: FLUENT_CONTAINER_TAIL_PARSER_TIME_FORMAT
  value: '%Y-%m-%dT%H:%M:%S.%N%:z'

I set these two env vars and it works without overwriting any config files in the container.

faust64 commented 1 year ago

For the record, as this is now the 7th answer suggesting this... At some point, I gave the following sample: https://github.com/fluent/fluentd-kubernetes-daemonset/issues/412#issuecomment-912985090. This is still valid.

With something like /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/, your [^ ]* drops a character that may be a P (partial) or an F (final). If you do not concat partial lines up until the next final one, you will eventually have some logs broken down into several records. At which point: good luck finding anything in Kibana/Elasticsearch.

kfirfer commented 1 year ago

Hi

I don't know why the logs are not being parsed as json first; currently all my logs in elastic, for example, are under the "log" field. This is my containers.input.conf configuration:

    <source>
      @id fluentd-containers.log
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/containers.log.pos
      tag raw.kubernetes.*
      read_from_head true
      <parse>
        @type multi_format
        <pattern>
          format json
          time_key time
          time_format %Y-%m-%dT%H:%M:%S.%NZ
        </pattern>
        <pattern>
          format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
          time_format %Y-%m-%dT%H:%M:%S.%N%:z
        </pattern>
      </parse>
    </source>

    # Detect exceptions in the log output and forward them as one log entry.
    <match raw.kubernetes.**>
      @id raw.kubernetes
      @type detect_exceptions
      remove_tag_prefix raw
      message log
      stream stream
      multiline_flush_interval 5
      max_bytes 500000
      max_lines 1000
    </match>

    # Concatenate multi-line logs
    <filter **>
      @id filter_concat
      @type concat
      key log
      use_first_timestamp true
      multiline_end_regexp /\n$/
      separator ""
      timeout_label @NORMAL
      flush_interval 5
    </filter>

    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      @id filter_kubernetes_metadata
      @type kubernetes_metadata
      skip_labels true
    </filter>

    # Fixes json fields in Elasticsearch
    <filter kubernetes.**>
      @id filter_parser
      @type parser
      key_name log
      reserve_time true
      reserve_data true
      remove_key_name_field true
      <parse>
        @type multi_format
        <pattern>
          format json
        </pattern>
        <pattern>
          format none
        </pattern>
      </parse> 
    </filter>

kfirfer commented 1 year ago

> I don't know why the logs are not being parsed as json first; currently all my logs in elastic, for example, are under the "log" field. This is my containers.input.conf configuration: […]

Nevermind, I succeeded with this config:

    <source>
      @id fluentd-containers.log
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/containers.log.pos
      tag raw.kubernetes.*
      #read_from_head true
      <parse>
        @type multi_format
        <pattern>
          format json
          time_key time
          time_format %Y-%m-%dT%H:%M:%S.%NZ
        </pattern>
        <pattern>
          format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
          time_format %Y-%m-%dT%H:%M:%S.%N%:z
        </pattern>
      </parse>
    </source>

    # Detect exceptions in the log output and forward them as one log entry.
    <match raw.kubernetes.**>
      @id raw.kubernetes
      @type detect_exceptions
      remove_tag_prefix raw
      message log
      stream stream
      multiline_flush_interval 5
      max_bytes 500000
      max_lines 1000
    </match>

    ## Concatenate multi-line logs
    #<filter **>
    #  @id filter_concat
    #  @type concat
    #  key log
    #  use_first_timestamp true
    #  multiline_end_regexp /\n$/
    #  separator ""
    #  timeout_label @NORMAL
    #  flush_interval 5
    #</filter>

    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      @id filter_kubernetes_metadata
      @type kubernetes_metadata
      skip_labels true
    </filter>

    # Fixes json fields in Elasticsearch
    <filter kubernetes.**>
      @id filter_parser
      @type parser
      key_name log
      reserve_time true
      reserve_data true
      remove_key_name_field true
      <parse>
        @type multi_format
        <pattern>
          format json
        </pattern>
        <pattern>
          format none
        </pattern>
      </parse>
    </filter>

rahulpandit21 commented 11 months ago

We are getting an issue with the cri parser of fluent-bit after upgrading EKS to 1.24.

With the parser below, the log: prefix is missing when it forwards logs to Splunk.

[PARSER]
    Name        cri
    Format      regex
    Regex       ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<log>.*)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z