fluent / fluentd-kubernetes-daemonset

Fluentd daemonset for Kubernetes and its Docker image
Apache License 2.0
1.26k stars 980 forks

fluent can not parse multiline correctly #46

Closed aeoluswing closed 3 years ago

aeoluswing commented 7 years ago

Image version: fluent/fluentd-kubernetes-daemonset:v0.12-debian
Docker version: 1.12.5
Plugins:

# gem list|grep fluent-plugin
fluent-plugin-kubernetes_metadata_filter (0.27.0)
fluent-plugin-multiline-parser (0.1.1)
fluent-plugin-record-reformer (0.9.0)
fluent-plugin-redis (0.2.3)
fluent-plugin-secure-forward (0.4.5)
fluent-plugin-systemd (0.0.8)

After deploying this image via a Kubernetes DaemonSet, I found the parser cannot deal with multiline logs.

A Java stack trace log looks as follows:

2017-08-10 15:12:11.994 [com.pdcss.sfjc.db.DBPool:55] ERROR com.pdcss.sfjc.db.DBPool - Network error IOException: Connection timed out: connect
  java.sql.SQLException: Network error IOException: Connection timed out: connect
   at net.sourceforge.jtds.jdbc.ConnectionJDBC2.<init>(ConnectionJDBC2.java:421)
   at net.sourceforge.jtds.jdbc.ConnectionJDBC3.<init>(ConnectionJDBC3.java:50)
   at net.sourceforge.jtds.jdbc.Driver.connect(Driver.java:185)

The filter conf looks like this:

<filter container.error>
      @type parser
      format multiline
      format_firstline /\d{4}-\d{1,2}-\d{1,2}/
      format1 /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}\.\d{0,3}) \[(?<thread>.*)\] (?<level>[^\s]+) (?<classmathod>[^\s]+) - (?<message>[\s\S]*)/
      key_name all
</filter>

docker log:

2017-08-23 15:29:00 +0800 [warn]: pattern not match with data '  java.sql.SQLException: Network error IOException: Connection timed out: connect'
2017-08-23 15:29:00 +0800 [warn]: pattern not match with data '   at net.sourceforge.jtds.jdbc.ConnectionJDBC2.<init>(ConnectionJDBC2.java:421)'
2017-08-23 15:29:00 +0800 [warn]: pattern not match with data '   at net.sourceforge.jtds.jdbc.ConnectionJDBC3.<init>(ConnectionJDBC3.java:50)'
2017-08-23 15:29:00 +0800 [warn]: pattern not match with data '   at net.sourceforge.jtds.jdbc.Driver.connect(Driver.java:185)'
2017-08-10 15:28:11 +0800 container.error: {"thread":"com.pdcss.sfjc.db.DBPool:55","level":"ERROR","classmathod":"com.pdcss.sfjc.db.DBPool","message":"Network error IOException: Connection timed out: connect"}

As the log above shows, the filter can parse a single-line log correctly, but fails when parsing multiline content such as a stack trace. I guess the format param is wrong with respect to the line break (\n), but I have no idea how to fix this config. By the way, as the fluentd multiline parser documentation mentions, format (?<message>.*) also does not work. Is there any solution to this issue, please?

lubing1101 commented 7 years ago

I haven't met this problem before.

Stelminator commented 6 years ago

https://github.com/GoogleCloudPlatform/fluent-plugin-detect-exceptions ?

PiotrKlimczak commented 6 years ago

I got stuck on exactly the same issue. I have been trying for almost a whole day now. Unfortunately I am unfamiliar with Ruby :( Is there any solution to this?

I suspect that multiline doesn't work at all as a filter. The only filter example I found is on the author's wiki page: https://github.com/quick-sort/fluent-plugin-multiline-parser. All other examples use it as a source, which would make sense from a design perspective, since otherwise the implementation would be very complex, requiring for example a unique key to correlate log lines into one record during the filtering phase. But I might be wrong, as I have only just started digging deeper into fluentd.

PiotrKlimczak commented 6 years ago

I finally got it working by redirecting application logs to separate files using rsyslog, then having fluentd parse those files with the tail plugin + multiline, while ignoring the duplicates coming from the systemd journal. I haven't found any other solution than that.
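
A minimal sketch of the fluentd side of that setup (the path, pos_file and tag are hypothetical; the regex reuses the Java log format from the original post):

<source>
  @type tail
  # hypothetical per-application log file written by rsyslog
  path /var/log/app/myapp.log
  pos_file /var/log/es-myapp.log.pos
  tag app.myapp
  format multiline
  format_firstline /^\d{4}-\d{1,2}-\d{1,2}/
  format1 /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}\.\d{0,3}) \[(?<thread>.*)\] (?<level>[^\s]+) (?<class>[^\s]+) - (?<message>[\s\S]*)/
  time_format %Y-%m-%d %H:%M:%S.%L
  multiline_flush_interval 5s
</source>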

Ritwik28 commented 6 years ago

Can you provide the whole config file?

monotek commented 6 years ago

Try: https://github.com/monotek/fluent-plugin-detect-exceptions

utkarshmani1997 commented 6 years ago

Hi, I am facing the same issue. Here is the config file; I am using the same configuration given here:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-es-config-v0.1.4
  namespace: logging
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  system.conf: |-
    <system>
      root_dir /tmp/fluentd-buffers/
    </system>
  containers.input.conf: |-
    # This configuration file for Fluentd / td-agent is used
    # to watch changes to Docker log files. The kubelet creates symlinks that
    # capture the pod name, namespace, container name & Docker container ID
    # to the docker logs for pods in the /var/log/containers directory on the host.
    # If running this fluentd configuration in a Docker container, the /var/log
    # directory should be mounted in the container.
    #
    # These logs are then submitted to Elasticsearch which assumes the
    # installation of the fluent-plugin-elasticsearch & the
    # fluent-plugin-kubernetes_metadata_filter plugins.
    # See https://github.com/uken/fluent-plugin-elasticsearch &
    # https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter for
    # more information about the plugins.
    #
    # Example
    # =======
    # A line in the Docker log file might look like this JSON:
    #
    # {"log":"2014/09/25 21:15:03 Got request with path wombat\n",
    #  "stream":"stderr",
    #   "time":"2014-09-25T21:15:03.499185026Z"}
    #
    # The time_format specification below makes sure we properly
    # parse the time format produced by Docker. This will be
    # submitted to Elasticsearch and should appear like:
    # $ curl 'http://elasticsearch-logging:9200/_search?pretty'
    # ...
    # {
    #      "_index" : "logstash-2014.09.25",
    #      "_type" : "fluentd",
    #      "_id" : "VBrbor2QTuGpsQyTCdfzqA",
    #      "_score" : 1.0,
    #      "_source":{"log":"2014/09/25 22:45:50 Got request with path wombat\n",
    #                 "stream":"stderr","tag":"docker.container.all",
    #                 "@timestamp":"2014-09-25T22:45:50+00:00"}
    #    },
    # ...
    #
    # The Kubernetes fluentd plugin is used to write the Kubernetes metadata to the log
    # record & add labels to the log record if properly configured. This enables users
    # to filter & search logs on any metadata.
    # For example a Docker container's logs might be in the directory:
    #
    #  /var/lib/docker/containers/997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b
    #
    # and in the file:
    #
    #  997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b-json.log
    #
    # where 997599971ee6... is the Docker ID of the running container.
    # The Kubernetes kubelet makes a symbolic link to this file on the host machine
    # in the /var/log/containers directory which includes the pod name and the Kubernetes
    # container name:
    #
    #    synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
    #    ->
    #    /var/lib/docker/containers/997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b/997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b-json.log
    #
    # The /var/log directory on the host is mapped to the /var/log directory in the container
    # running this instance of Fluentd and we end up collecting the file:
    #
    #   /var/log/containers/synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
    #
    # This results in the tag:
    #
    #  var.log.containers.synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
    #
    # The Kubernetes fluentd plugin is used to extract the namespace, pod name & container name
    # which are added to the log message as a kubernetes field object & the Docker container ID
    # is also added under the docker field object.
    # The final tag is:
    #
    #   kubernetes.var.log.containers.synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
    #
    # And the final log record look like:
    #
    # {
    #   "log":"2014/09/25 21:15:03 Got request with path wombat\n",
    #   "stream":"stderr",
    #   "time":"2014-09-25T21:15:03.499185026Z",
    #   "kubernetes": {
    #     "namespace": "default",
    #     "pod_name": "synthetic-logger-0.25lps-pod",
    #     "container_name": "synth-lgr"
    #   },
    #   "docker": {
    #     "container_id": "997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b"
    #   }
    # }
    #
    # This makes it easier for users to search for logs by pod name or by
    # the name of the Kubernetes container regardless of how many times the
    # Kubernetes pod has been restarted (resulting in a several Docker container IDs).
    # Json Log Example:
    # {"log":"[info:2016-02-16T16:04:05.930-08:00] Some log text here\n","stream":"stdout","time":"2016-02-17T00:04:05.931087621Z"}
    # CRI Log Example:
    # 2016-02-17T00:04:05.931087621Z stdout F [info:2016-02-16T16:04:05.930-08:00] Some log text here
    <source>
      @id fluentd-containers.log
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag raw.kubernetes.*
      format json
      read_from_head true
    </source>
    # Detect exceptions in the log output and forward them as one log entry.
    <match raw.kubernetes.**>
      @id raw.kubernetes
      @type detect_exceptions
      remove_tag_prefix raw
      message log
      stream stream
      multiline_flush_interval 5
      max_bytes 500000
      max_lines 1000
    </match>
  system.input.conf: |-
    # Example:
    # 2015-12-21 23:17:22,066 [salt.state       ][INFO    ] Completed state [net.ipv4.ip_forward] at time 23:17:22.066081
    <source>
      @id minion
      @type tail
      format /^(?<time>[^ ]* [^ ,]*)[^\[]*\[[^\]]*\]\[(?<severity>[^ \]]*) *\] (?<message>.*)$/
      time_format %Y-%m-%d %H:%M:%S
      path /var/log/salt/minion
      pos_file /var/log/salt.pos
      tag salt
    </source>
    # Example:
    # Dec 21 23:17:22 gke-foo-1-1-4b5cbd14-node-4eoj startupscript: Finished running startup script /var/run/google.startup.script
    <source>
      @id startupscript.log
      @type tail
      format syslog
      path /var/log/startupscript.log
      pos_file /var/log/es-startupscript.log.pos
      tag startupscript
    </source>
    # Examples:
    # time="2016-02-04T06:51:03.053580605Z" level=info msg="GET /containers/json"
    # time="2016-02-04T07:53:57.505612354Z" level=error msg="HTTP Error" err="No such image: -f" statusCode=404
    # TODO(random-liu): Remove this after cri container runtime rolls out.
    <source>
      @id docker.log
      @type tail
      format /^time="(?<time>[^)]*)" level=(?<severity>[^ ]*) msg="(?<message>[^"]*)"( err="(?<error>[^"]*)")?( statusCode=(?<status_code>\d+))?/
      path /var/log/docker.log
      pos_file /var/log/es-docker.log.pos
      tag docker
    </source>
    # Example:
    # 2016/02/04 06:52:38 filePurge: successfully removed file /var/etcd/data/member/wal/00000000000006d0-00000000010a23d1.wal
    <source>
      @id etcd.log
      @type tail
      # Not parsing this, because it doesn't have anything particularly useful to
      # parse out of it (like severities).
      format none
      path /var/log/etcd.log
      pos_file /var/log/es-etcd.log.pos
      tag etcd
    </source>
    # Multi-line parsing is required for all the kube logs because very large log
    # statements, such as those that include entire object bodies, get split into
    # multiple lines by glog.
    # Example:
    # I0204 07:32:30.020537    3368 server.go:1048] POST /stats/container/: (13.972191ms) 200 [[Go-http-client/1.1] 10.244.1.3:40537]
    <source>
      @id kubelet.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kubelet.log
      pos_file /var/log/es-kubelet.log.pos
      tag kubelet
    </source>
    # Example:
    # I1118 21:26:53.975789       6 proxier.go:1096] Port "nodePort for kube-system/default-http-backend:http" (:31429/tcp) was open before and is still needed
    <source>
      @id kube-proxy.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-proxy.log
      pos_file /var/log/es-kube-proxy.log.pos
      tag kube-proxy
    </source>
    # Example:
    # I0204 07:00:19.604280       5 handlers.go:131] GET /api/v1/nodes: (1.624207ms) 200 [[kube-controller-manager/v1.1.3 (linux/amd64) kubernetes/6a81b50] 127.0.0.1:38266]
    <source>
      @id kube-apiserver.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-apiserver.log
      pos_file /var/log/es-kube-apiserver.log.pos
      tag kube-apiserver
    </source>
    # Example:
    # I0204 06:55:31.872680       5 servicecontroller.go:277] LB already exists and doesn't need update for service kube-system/kube-ui
    <source>
      @id kube-controller-manager.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-controller-manager.log
      pos_file /var/log/es-kube-controller-manager.log.pos
      tag kube-controller-manager
    </source>
    # Example:
    # W0204 06:49:18.239674       7 reflector.go:245] pkg/scheduler/factory/factory.go:193: watch of *api.Service ended with: 401: The event in requested index is outdated and cleared (the requested history has been cleared [2578313/2577886]) [2579312]
    <source>
      @id kube-scheduler.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-scheduler.log
      pos_file /var/log/es-kube-scheduler.log.pos
      tag kube-scheduler
    </source>
    # Example:
    # I1104 10:36:20.242766       5 rescheduler.go:73] Running Rescheduler
    <source>
      @id rescheduler.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/rescheduler.log
      pos_file /var/log/es-rescheduler.log.pos
      tag rescheduler
    </source>
    # Example:
    # I0603 15:31:05.793605       6 cluster_manager.go:230] Reading config from path /etc/gce.conf
    <source>
      @id glbc.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/glbc.log
      pos_file /var/log/es-glbc.log.pos
      tag glbc
    </source>
    # Example:
    # I0603 15:31:05.793605       6 cluster_manager.go:230] Reading config from path /etc/gce.conf
    <source>
      @id cluster-autoscaler.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/cluster-autoscaler.log
      pos_file /var/log/es-cluster-autoscaler.log.pos
      tag cluster-autoscaler
    </source>
    # Logs from systemd-journal for interesting services.
    # TODO(random-liu): Remove this after cri container runtime rolls out.
    <source>
      @id journald-docker
      @type systemd
      filters [{ "_SYSTEMD_UNIT": "docker.service" }]
      <storage>
        @type local
        persistent true
      </storage>
      read_from_head true
      tag docker
    </source>
    <source>
      @id journald-container-runtime
      @type systemd
      filters [{ "_SYSTEMD_UNIT": "{{ container_runtime }}.service" }]
      <storage>
        @type local
        persistent true
      </storage>
      read_from_head true
      tag container-runtime
    </source>
    <source>
      @id journald-kubelet
      @type systemd
      filters [{ "_SYSTEMD_UNIT": "kubelet.service" }]
      <storage>
        @type local
        persistent true
      </storage>
      read_from_head true
      tag kubelet
    </source>
    <source>
      @id journald-node-problem-detector
      @type systemd
      filters [{ "_SYSTEMD_UNIT": "node-problem-detector.service" }]
      <storage>
        @type local
        persistent true
      </storage>
      read_from_head true
      tag node-problem-detector
    </source>
  forward.input.conf: |-
    # Takes the messages sent over TCP
    <source>
      @type forward
    </source>
  monitoring.conf: |-
    # Prometheus Exporter Plugin
    # input plugin that exports metrics
    <source>
      @type prometheus
    </source>
    <source>
      @type monitor_agent
    </source>
    # input plugin that collects metrics from MonitorAgent
    <source>
      @type prometheus_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>
    # input plugin that collects metrics for output plugin
    <source>
      @type prometheus_output_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>
    # input plugin that collects metrics for in_tail plugin
    <source>
      @type prometheus_tail_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>
  output.conf: |-
    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>
    <match **>
      @id elasticsearch
      @type elasticsearch
      @log_level info
      include_tag_key true
      host elasticsearch-logging
      port 9200
      logstash_format true
      <buffer>
        @type file
        path /var/log/fluentd-buffers/kubernetes.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size 2M
        queue_limit_length 8
        overflow_action block
      </buffer>
    </match>  
  elasticsearch-template-es5x.json: |
    {
      "template" : "logstash-*",
      "version" : 50001,
      "settings" : {
        "index.refresh_interval" : "5s",
        "number_of_shards": 1
      },
      "mappings" : {
        "_default_" : {
          "_all" : {"enabled" : true, "norms" : false},
          "dynamic_templates" : [ {
            "message_field" : {
              "path_match" : "message",
              "match_mapping_type" : "string",
              "mapping" : {
                "type" : "text",
                "norms" : false
              }
            }
          }, {
            "string_fields" : {
              "match" : "*",
              "match_mapping_type" : "string",
              "mapping" : {
                "type" : "text", "norms" : false,
                "fields" : {
                  "keyword" : { "type": "keyword" }
                }
              }
            }
          } ],
          "properties" : {
            "@timestamp": { "type": "date", "include_in_all": false },
            "@version": { "type": "keyword", "include_in_all": false },
            "geoip"  : {
              "dynamic": true,
              "properties" : {
                "ip": { "type": "ip" },
                "location" : { "type" : "geo_point" },
                "latitude" : { "type" : "half_float" },
                "longitude" : { "type" : "half_float" }
              }
            }
          }
        }
      }
    }
  extra.conf: |
    # Example filter that adds an extra field "cluster_name" to all log
    # messages:
    # <filter **>
    #   @type record_transformer
    #   <record>
    #     cluster_name "your_cluster_name"
    #   </record>
    # </filter>

xuanyuanaosheng commented 6 years ago

@aeoluswing Have you found a solution to the problem? Can you provide the config file?

PiotrKlimczak commented 6 years ago

So OCP 3.6/3.7 was using Docker with the journal logging driver. From 3.9 it uses the json-file logging driver.

To get it working in 3.9, for example, you need to:

  1. Use a fixed version of https://github.com/quick-sort/fluent-plugin-multiline-parser/blob/master/lib/fluent/plugin/out_parser.rb, as the original has bugs and will not work.
  2. Change the default in_tail sources as below:
    <source>
    @type tail
    @label @INGRESS
    path "/var/log/containers/*.log"
    pos_file "/var/log/containers/es-containers.log.pos"
    exclude_path ["/var/log/containers/*.gz", "/var/log/containers/*.zip", "/var/log/containers/YOUR_APP_PREFIX-*.log", "/var/log/containers/fuse-*.log"]
    time_format %Y-%m-%dT%H:%M:%S.%N%Z
    tag kubernetes.other.*
    format json
    keep_time_key true
    read_from_head "true"
    </source>
    <source>
    @type tail
    @label @INGRESS
    path /var/log/containers/YOUR_APP_PREFIX-*
    pos_file /var/log/containers/_YOUR_APP_PREFIX-position.pos
    exclude_path ["/var/log/containers/*.gz", "/var/log/containers/*.zip", "/var/log/containers/*.pos"]
    tag kubernetes.YOUR_APP_PREFIX.*
    format json
    read_from_head "true"
    </source>
  3. In the fluentd config map, change the fluent.conf key by adding the following inside the @INGRESS label:
    <filter kubernetes.YOUR_APP_PREFIX.**>
    @type multiline_parser
    format multiline
    format_firstline /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} /
    format1 /^(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})[ ]*(?<log4j.level>[^ ]*)[ ]*\[(?<log4j.thread>.+?(?=\] \[))\] \[(?<log4j.logger>[^\]]*)\] (?<message>.*)/
    key_name log
    filter_key_name log
    time_format %Y-%m-%d %H:%M:%S,%L
    keep_time_key true
    </filter>
  4. You might also need to adjust the time and other record fields, so add the following below the above filter in the same file:
    <filter kubernetes.YOUR_APP_PREFIX.**>
    @type record_transformer
    enable_ruby
    <record>
      time ${Time.parse(record["time"]).to_datetime.rfc3339(9)}
      stream stdout
    </record>
    </filter>

What is important is that the custom filter linked in step 1 can join lines at the filter level instead of the source level, which changes a lot. It can also extract continuation lines from the "log" field of the json-file format; with the standard in_tail multiline format you would get the JSON wrapper mixed into the joined lines. In general, the multiline support bundled with fluentd is so limited that it is effectively useless and requires a lot of workarounds to get working. I also got the original approach working on 3.6 with the journal log driver, but that solution was complex: it required rsyslog changes to write separate per-container log files (since the original multiline support only works as an in_tail source, not as a filter) and then a lot of customisation in the fluentd.conf file. Not worth trying, as that solution is too complex and clunky. Just use the custom multiline parser filter from step 1.
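
As a rough illustration of the end result (field names come from the step 3 regex; the exact output shape may vary with plugin options such as keep_time_key, and the values here are made up): a log4j event whose continuation line arrived in a separate json-file record ends up as one record along the lines of

{"time":"2018-01-01 10:00:00,123","log4j.level":"ERROR","log4j.thread":"main","log4j.logger":"com.example.App","message":"something failed\njava.sql.SQLException: boom"}

with the json-file wrapper stripped away.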

RC-2 commented 6 years ago

I am facing the same problem, and as per my investigation so far, the problem is related to the way the logs are written to the log files. For example:
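
(Illustrative reconstruction using the stack trace from the original post; the wrapper field values are made up.) With the Docker json-file driver, every physical line of the stack trace is written as its own JSON record:

{"log":"2017-08-10 15:12:11.994 [com.pdcss.sfjc.db.DBPool:55] ERROR com.pdcss.sfjc.db.DBPool - Network error IOException: Connection timed out: connect\n","stream":"stderr","time":"2017-08-10T07:12:11.994Z"}
{"log":"  java.sql.SQLException: Network error IOException: Connection timed out: connect\n","stream":"stderr","time":"2017-08-10T07:12:11.995Z"}
{"log":"   at net.sourceforge.jtds.jdbc.ConnectionJDBC2.<init>(ConnectionJDBC2.java:421)\n","stream":"stderr","time":"2017-08-10T07:12:11.995Z"}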

Therefore the rules will not find the correct match.

Let me know if the above is not correct. If it is correct, you first need to extract the message from the JSON log field and then perform the multiline matching on it.

I am working based on the above assumption, but so far I haven't found a solution.

chroche commented 6 years ago

I seem to have it working with the following configuration (using td-agent v0.12.40 on Debian 9 Stretch), basically using the multiline parser as an output directive (match) rather than a filter. The messages come from Docker through the fluentd logging driver:

{
    "log-driver": "fluentd",
    "log-opts": {
      "fluentd-address": "unix:///var/run/td-agent/td-agent.sock",
      "tag": "api"
    }
}

and look like this:

INFO  [2018-07-14 15:02:00,712] [TRACE-ID:] o.e.j.s.h.ContextHandler: Started i.d.j.MutableServletContextHandler@5d4ff150{/,null,AVAILABLE
    extra line
    more lines
WARN  [2018-07-14 15:02:00,725] [TRACE-ID:] i.d.s.AdminEnvironment:
    blah blah

The td-agent config is as follows:

<source>
  @type unix
  path /var/run/td-agent/td-agent.sock
</source>

<match api.**>
  @type parser
  format multiline
  key_name log
  format_firstline /^[A-Z]/
  format1 /^(?<crit>\w*)\s*\[(?<time>[^\]]*)\]\s*\[TRACE-ID:(?<traceid>[^\]]*)\]\s*(?<message>.*)/
  remove_prefix api
</match>

<match **>
  @type kinesis_firehose
  region us-east-1
  delivery_stream_name logging-firehose-stream
  include_tag_key true
  include_time_key true
</match>

With this, the lines are properly concatenated and uploaded to Firehose and then Elasticsearch. The only remaining issue I have is that the lines are concatenated into one big string, which doesn't display nicely in ES. I'm using a slightly patched version of out_parser.rb to preserve the newline characters between consecutive lines.

RC-2 commented 6 years ago

I solved my problem by building another fluentd container with fluent-plugin-concat installed, and using this new container instead of the default one provided here.

jaredready commented 6 years ago

@raf-d Would you mind sharing the configuration you ended up with? I've been going down the same path as well.

ifeldshteyn commented 5 years ago

@raf-d Would you mind sharing the configuration you ended up with? I've been going down the same path as well.

I know this is a year old, but this is what we ended up doing. It works perfectly, and you can join multiple multiline start formats with a | symbol, e.g. A|B|C (a sketch follows the config below). Now we have full Java stack traces in a single event.

<filter **>
  @type concat
  key log
  multiline_start_regexp /\w+ \d+.\d+.\d+ \d+:\d+:\d+: /
  stream_identity_key tag
</filter>
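
A hedged sketch of the alternation idea (both start patterns are illustrative, combining the format above with a log4j-style timestamp from earlier in this thread):

<filter **>
  @type concat
  key log
  # joins lines whose first line starts with either of two hypothetical formats
  multiline_start_regexp /(?:\w+ \d+.\d+.\d+ \d+:\d+:\d+: )|(?:\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})/
  stream_identity_key tag
</filter>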

Note: Rancher is going down the path of adding official concat support -> https://github.com/rancher/fluentd/pull/15

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 30 days

github-actions[bot] commented 3 years ago

This issue was automatically closed because of stale in 30 days