fluent / fluent-plugin-grok-parser

Fluentd's Grok parser

Multiline_grok question #79

Closed antonionappi88 closed 5 years ago

antonionappi88 commented 5 years ago

Hello, this is more a question about how the plugin works than a bug report.

I'm trying to parse some logs and send them to Elasticsearch using fluentd, but I have some issues with grok parsing.

I have two types of logs:

A single-line log:

15:16:22,544 INFO  [org.jboss.weld.deployer] (MSC service thread 1-1) WFLYWELD0003: Processing weld deployment jaxb-core-2.2.11.jar

A multi-line log:

15:16:22,538 INFO  [org.jboss.as.ejb3.deployment.processors.EjbJndiBindingsDeploymentUnitProcessor] (MSC service thread 1-1) JNDI bindings for session bean named PartsEJB in deployment unit subdeployment "eam-light-backendejb-1.0.0-SNAPSHOT.jar" of deployment "eam-light-backend.ear" are as follows:

        java:global/eam-light-backend/eam-light-backendejb-1.0.0-SNAPSHOT/PartsEJB
        java:app/eam-light-backendejb-1.0.0-SNAPSHOT/PartsEJB
        java:module/PartsEJB

Below is my fluentd configuration:

<source>
  @type tail
  path /opt/jboss/wildfly/standalone/log/server.log
  pos_file /var/log/td-agent/server.log.pos
  path_key source
  tag server.log
  <parse>
    @type multiline_grok
    grok_success_key groksuccess
    grok_failure_key grokfailure
    <grok>
      pattern %{TIME:jboss_timestamp} %{LOGLEVEL:log_level} %{GREEDYDATA:message}
      multiline_start_regexp /^[0-9]
    </grok>
  </parse>
</source>

<filter server.log>
  @type record_transformer
  <record>
    application "#{ENV['POD_NAMESPACE']}"
    jboss_timestamp ${record["jboss_timestamp"]}
    source ${record["source"]}
    log_level ${record["log_level"]
    k8s_cluster "#{ENV['KUBERNETES_CLUSTER']}"
    technology wildfly
  </record>
</filter>

<match server.log>
  @type elasticsearch
  include_timestamp true
  host host
  port port
  user port
  password port
  index_name my-index
  scheme https
  ssl_verify false
  ssl_version TLSv1_2
  <buffer>
    flush_mode immediate
  </buffer>
</match>

I tested the grok pattern with the Grok debugger and it works. However, in the fluentd logs I see:

{"jboss_timestamp":"15:16:22,544","log_level":"${record[\"log_level\"]","message":" [org.jboss.weld.deployer] (MSC service thread 1-1) WFLYWELD0003: Processing weld deployment jaxb-core-2.2.11.jar","source":"/opt/jboss/wildfly/standalone/log/server.log","application":"default","k8s_cluster":"k8s-ims-dev-a","technology":"wildfly","@timestamp":"2019-10-23T17:16:22.551045310+02:00"}

The multi-line logs are also being split and parsed as separate events.

Do you know what I'm doing wrong? Thanks!

Cheers, Antonio

ganmacs commented 5 years ago

I didn't check in detail and I'm not sure it will work, but multiline_start_regexp /^[0-9] (quoted from your config) is missing the trailing /.
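For reference, here is a quick Ruby sketch (using shortened, hypothetical log lines) of what the corrected /^[0-9]/ start regexp matches. This is how the multiline parser decides where a new record begins: lines starting with a digit (the timestamp) open a new record, while indented continuation lines are appended to the current one.

```ruby
# A sketch of the corrected multiline start regexp, with the trailing / added.
start_re = /^[0-9]/

# Hypothetical shortened versions of the log lines from the question.
first_line   = "15:16:22,538 INFO  [org.jboss...] JNDI bindings are as follows:"
continuation = "        java:module/PartsEJB"

puts start_re.match?(first_line)    # starts with a digit: begins a new record
puts start_re.match?(continuation)  # indented: appended to the previous record
```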

Also, multiline_start_regexp should be located under the parse section, not the grok section. See https://github.com/fluent/fluent-plugin-grok-parser/tree/2c9901c1a3dc473b28ee3c04cd4780f6b5aeede4#multiline-support
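Putting both corrections together, the parse section from the config above would become something like this (a sketch based on the linked README, not tested against the full setup):

```
<parse>
  @type multiline_grok
  grok_success_key groksuccess
  grok_failure_key grokfailure
  multiline_start_regexp /^[0-9]/
  <grok>
    pattern %{TIME:jboss_timestamp} %{LOGLEVEL:log_level} %{GREEDYDATA:message}
  </grok>
</parse>
```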

antonionappi88 commented 5 years ago

Hi @ganmacs, you're right: moving the multiline_start_regexp fixed it. I was also making a mistake in the record_transformer filter: I was missing a } for log_level.

Thanks! Antonio