humio / humio-helm-charts

Helm Charts for Humio Components
Apache License 2.0

Json and json-for-action parser not working #134

Open giovannicandido opened 3 years ago

giovannicandido commented 3 years ago

Logs are shipped from Kubernetes and processed with the json parser or the json-for-action parser. The JSON produced by the containers is fine; I have tested it with the json parser in the console input.

I think the problem is that the rawstring has a timestamp prepended to every log line. For example:


The rawstring:

2021-11-09T16:27:39.510951151-05:00 stdout F {"@timestamp":"2021-11-09T21:27:39.505Z","@version":"1","message":"HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@d89aadb (This connection has been closed.). Possibly consider using a shorter maxLifetime value.","logger_name":"com.zaxxer.hikari.pool.PoolBase","thread_name":"http-nio-8080-exec-5","level":"WARN","level_value":30000}

Each JSON payload is preceded by a "<timestamp> stdout F" prefix.
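To illustrate why a plain JSON parser rejects these lines, here is a minimal Python sketch. It is not how Humio or Fluent Bit parse logs internally; the `try_parse` helper is hypothetical and the sample line is a shortened version of the rawstring above.

```python
import json

# Shortened version of the rawstring above (most JSON fields dropped for brevity).
raw = (
    '2021-11-09T16:27:39.510951151-05:00 stdout F '
    '{"@timestamp":"2021-11-09T21:27:39.505Z","level":"WARN","level_value":30000}'
)

def try_parse(text):
    """Hypothetical helper: return the decoded object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

# The full line is not valid JSON because of the "<timestamp> stdout F" prefix.
whole_line = try_parse(raw)

# Dropping the three space-separated prefix fields leaves the JSON payload.
payload = try_parse(raw.split(" ", 3)[3])

print(whole_line)
print(payload["level"] if payload else None)
```

The payload itself decodes once the prefix is gone, which matches the observation that the same JSON works when pasted into the console input.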

The pods are labeled with humio-parser.

giovannicandido commented 3 years ago

I found an answer in https://github.com/microsoft/fluentbit-containerd-cri-o-json-log. The problem is that my container runtime is containerd, which requires a different parser than the default docker parser.

To fix the issue with the humio helm chart, the following values are needed:

humio-fluentbit:
  parserConfig: |-
    [PARSER]
        Name apache
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z
    [PARSER]
        Name apache2
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z
    [PARSER]
        Name apache_error
        Format regex
        Regex  ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$
    [PARSER]
        Name nginx
        Format regex
        Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z
    [PARSER]
        Name json
        Format json
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z
    [PARSER]
        Name docker
        Format json
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On
    [PARSER]
        Name syslog
        Format regex
        Regex ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key time
        Time_Format %b %d %H:%M:%S
    [PARSER]
        Name cri
        Format regex
        Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<log>.*)$
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z
  inputConfig: |-
    [INPUT]
      Name             tail
      Path             /var/log/containers/*.log
      Parser           cri
      Tag              kube.*
      Refresh_Interval 5
      Mem_Buf_Limit    5MB
      Skip_Long_Lines  On

This adds the cri parser and overrides the parser in the input config. containerd and CRI-O will become much more popular with the deprecation of Docker for Kubernetes. When I have some time, I will open a merge request that adds an option to the chart to make this setup easier for users.
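As a rough check of what the cri Regex captures, it can be ported to Python. This is only a sketch: Fluent Bit uses Ruby-style `(?<name>...)` groups, which Python spells `(?P<name>...)`, and the `parse_cri_line` helper and the abbreviated sample line are made up for illustration.

```python
import json
import re

# Python port of the cri parser Regex from the values above.
CRI_LINE = re.compile(
    r"^(?P<time>[^ ]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) (?P<log>.*)$"
)

def parse_cri_line(line):
    """Hypothetical helper: split a CRI log line into time, stream, logtag and log."""
    match = CRI_LINE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    try:
        # Decode the payload when it is JSON; otherwise keep the raw string.
        record["log"] = json.loads(record["log"])
    except json.JSONDecodeError:
        pass
    return record

# Abbreviated form of the log line from the first comment.
sample = (
    '2021-11-09T16:27:39.510951151-05:00 stdout F '
    '{"message":"This connection has been closed.","level":"WARN"}'
)

record = parse_cri_line(sample)
print(record["time"], record["stream"], record["logtag"])
print(record["log"]["level"])
```

The `log` group holds everything after the third space, so spaces inside the JSON message do not affect the split.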

HenrikDK commented 2 years ago

Any update on this from the humio team? We're facing the same issue on our AKS clusters.