aws / aws-for-fluent-bit

The source of the amazon/aws-for-fluent-bit container image
Apache License 2.0

Log content not parsed when using Kubernetes annotations to suggest a parser #298

Open ph-l opened 2 years ago

ph-l commented 2 years ago

Describe the question/issue

Using the built-in AWS Fluent Bit with EKS on Fargate, the parser defined in Kubernetes pod annotations does not process logs.

Configuration

Fluent Bit config

kind: ConfigMap
apiVersion: v1
metadata:
    name: aws-logging
    namespace: aws-observability
data:
    # Can test regex parser here: https://rubular.com/

    flb_log_cw: "true" #ships fluent-bit process logs to CloudWatch

    filters.conf: |
        [FILTER]
            Name                kubernetes
            Match               kube.*
            Merge_Log           On
            Buffer_Size         0
            Kube_Meta_Cache_TTL 300s
            K8S-Logging.Parser  On
            K8S-Logging.Exclude On
            Merge_Parser        catchall

    parsers.conf: |
        [PARSER]
            Name            myparser
            Format          regex
            Regex           ^(?<time>[^ ]+) (?<mystream>stdout|stderr) (?<logtag>P|F) (?<log>.*)$
            Time_Key        time
            Time_Format     %Y-%m-%dT%H:%M:%S.%L%z

        [PARSER]
            Name            catchall
            Format          Regex
            Regex           ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>P|F) (?<log>.*)$
            Time_Key        time
            Time_Format     %Y-%m-%dT%H:%M:%S.%L%z

    output.conf: |
        [OUTPUT]
            Name                cloudwatch_logs
            Match               kube.var.logs.containers.*
            region              ap-southeast-1
            log_group_name      fluent-bit-cloudwatch
            log_stream_prefix   fargate-
            auto_create_group   true

Pod

apiVersion: v1
kind: Pod
metadata:
    name: apache-logs
    labels:
      app: apache-logs
    annotations:
      fluentbit.io/parser: myparser
spec:
    containers:
    - name: apache
      image: edsiper/apache_logs

Fluent Bit Log Output

{ "log": "File /var/log/containers/*.log not present, sleeping 1" }
{ "log": "\u001b[1mFluent Bit v1.7.9\u001b[0m" }
{ "log": "* \u001b[1m\u001b[93mCopyright (C) 2019-2021 The Fluent Bit Authors\u001b[0m" }
{ "log": "* \u001b[1m\u001b[93mCopyright (C) 2015-2018 Treasure Data\u001b[0m" }
{ "log": "* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd" }
{ "log": "* https://fluentbit.io" }
{ "log": "[2022/02/11 08:01:21] [ info] [engine] started (pid=689)" }
{ "log": "[2022/02/11 08:01:21] [ info] [storage] version=1.1.1, initializing..." }
{ "log": "[2022/02/11 08:01:21] [ info] [storage] in-memory" }
{ "log": "[2022/02/11 08:01:21] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128" }
{ "log": "[2022/02/11 08:01:21] [ info] [filter:kubernetes:kubernetes.0] https=1 host=xxx.xxx.ap-southeast-1.eks.amazonaws.com port=443" }
{ "log": "[2022/02/11 08:01:21] [ warn] [filter:kubernetes:kubernetes.0] cannot open /var/run/secrets/kubernetes.io/serviceaccount/namespace" }
{ "log": "[2022/02/11 08:01:21] [ info] [filter:kubernetes:kubernetes.0] Fluent Bit not running in a POD" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter log_group_name = 'nonprd-cluster-fluent-bit-logs'\"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter default_log_group_name = 'fluentbit-default'\"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter log_stream_prefix = ''\"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter log_stream_name = 'from-fluent-bit-xxx-875db96b6-xtbgn-dev'\"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter default_log_stream_name = '/fluentbit-default'\"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter region = 'ap-southeast-1'\"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter log_key = ''\"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter role_arn = ''\"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter auto_create_group = 'true'\"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter new_log_group_tags = ''\"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter log_retention_days = '0'\"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter endpoint = ''\"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter sts_endpoint = ''\"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter credentials_endpoint = \"" }
{ "log": "time=\"2022-02-11T08:01:21Z\" level=info msg=\"[cloudwatch 0] plugin parameter log_format = ''\"" }
{ "log": "[2022/02/11 08:01:21] [ info] [sp] stream processor started" }
{ "log": "[2022/02/11 08:01:22] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1198127 watch_fd=1 name=/var/log/containers/xxx-875db96b6-xtbgn_dev_envoy-436c1bf758adf1c1266717adfee10f2a29dc57fa8abd5e6693d3c7dce423767c.log" }
{ "log": "[2022/02/11 08:01:22] [ info] [input:tail:tail.1] inotify_fs_add(): inode=1197902 watch_fd=1 name=/var/log/fluent-bit.log" }
{ "log": "[2022/02/11 08:01:26] [ info] [output:cloudwatch_logs:cloudwatch_logs.2] Creating log group fluent-bit-cloudwatch" }
{ "log": "[2022/02/11 08:01:26] [ info] [output:cloudwatch_logs:cloudwatch_logs.2] Log Group fluent-bit-cloudwatch already exists" }
{ "log": "[2022/02/11 08:01:26] [ info] [output:cloudwatch_logs:cloudwatch_logs.2] Creating log stream platform-kube.var.log.containers.xxx-875db96b6-xtbgn_dev_envoy-436c1bf758adf1c1266717adfee10f2a29dc57fa8abd5e6693d3c7dce423767c.log in log group fluent-bit-cloudwatch" }
{ "log": "[2022/02/11 08:01:26] [ info] [output:cloudwatch_logs:cloudwatch_logs.2] Created log stream platform-kube.var.log.containers.xxx-875db96b6-xtbgn_dev_envoy-436c1bf758adf1c1266717adfee10f2a29dc57fa8abd5e6693d3c7dce423767c.log" }
{ "log": "time=\"2022-02-11T08:01:27Z\" level=info msg=\"[cloudwatch 0] Log group nonprd-cluster-fluent-bit-logs already exists\\n\"" }
{ "log": "time=\"2022-02-11T08:01:27Z\" level=info msg=\"[cloudwatch 0] Created log stream from-fluent-bit-xxx-875db96b6-xtbgn-dev in group nonprd-cluster-fluent-bit-logs\"" }
{ "log": "[2022/02/11 08:01:41] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1329144 watch_fd=2 name=/var/log/containers/xxx-875db96b6-xtbgn_dev_xxx-e64c97ef1ad99ba4cd5ac118150eb3ac4e2f0339a7fe7f06c4c500a92ca65bfd.log" }
{ "log": "[2022/02/11 08:02:01] [ info] [output:cloudwatch_logs:cloudwatch_logs.2] Creating log stream platform-kube.var.log.containers.xxx-875db96b6-xtbgn_dev_xxx-e64c97ef1ad99ba4cd5ac118150eb3ac4e2f0339a7fe7f06c4c500a92ca65bfd.log in log group fluent-bit-cloudwatch" }
{ "log": "[2022/02/11 08:02:01] [ info] [output:cloudwatch_logs:cloudwatch_logs.2] Created log stream platform-kube.var.log.containers.xxx-875db96b6-xtbgn_dev_xxx-e64c97ef1ad99ba4cd5ac118150eb3ac4e2f0339a7fe7f06c4c500a92ca65bfd.log" }
{ "log": "[2022/02/11 08:02:21] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1330324 watch_fd=3 name=/var/log/containers/xxx-875db96b6-xtbgn_dev_fluent-bit-f3305d87dd370df40e2b04614a84612350925fa8fbd6a454e00be67beab00295.log" }
{ "log": "[2022/02/11 08:02:26] [ info] [output:cloudwatch_logs:cloudwatch_logs.2] Creating log stream platform-kube.var.log.containers.xxx-875db96b6-xtbgn_dev_fluent-bit-f3305d87dd370df40e2b04614a84612350925fa8fbd6a454e00be67beab00295.log in log group fluent-bit-cloudwatch" }
{ "log": "[2022/02/11 08:02:26] [ info] [output:cloudwatch_logs:cloudwatch_logs.2] Created log stream platform-kube.var.log.containers.xxx-875db96b6-xtbgn_dev_fluent-bit-f3305d87dd370df40e2b04614a84612350925fa8fbd6a454e00be67beab00295.log" }
{ "log": "[2022/02/11 08:08:34] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=1330324 watch_fd=3" }
{ "log": "[2022/02/11 08:08:34] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=1329144 watch_fd=2" }
{ "log": "[2022/02/11 08:08:34] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=1198127 watch_fd=1" }

Fluent Bit Version Info

Based on Fluent Bit process logs, it's Fluent Bit 1.7.9. I can't tell which version of AWS for Fluent Bit is used in Fargate.

Cluster Details

EKS on Fargate

Application Details

Steps to reproduce issue

I get the following (redacted) log record with my configuration. It appears that the parser suggested by the pod annotation isn't used: I expect to see the "mystream" key (from the annotation parser "myparser") instead of the "stream" key (from the Merge_Parser parser "catchall").

{
    "stream": "stdout",
    "logtag": "F",
    "log": "2022-02-11 16:08:20.887 message",
    "kubernetes": {
        "pod_name": "xxx-875db96b6-xtbgn",
        "namespace_name": "dev",
        "annotations": {
            "fluentbit.io/parser": "myparser"
        },
        "host": "fargate-ip-xx-xx-xx-xx.ap-southeast-1.compute.internal",
        "container_name": "xxx"
    }
}

If I specify an undefined parser in the pod annotations, I get a warning message in Fluent Bit's logs:

"log": "[2022/02/11 07:45:32] [ warn] [filter:kubernetes:kubernetes.0] annotation parser 'invalid_parser' not found (ns='dev' pod_name='xxx-76d59ddfc6-x4nrg', container_name='xxx')"

If Merge_Parser is not set, the log content isn't processed at all, even with the annotation parser defined:

{
    "log": "2022-02-11T09:01:18.09884086Z stdout F 2022-02-11 16:08:20.887 message",
    "kubernetes": {
        "pod_name": "xxx-875db96b6-xtbgn",
        "namespace_name": "dev",
        "annotations": {
            "fluentbit.io/parser": "myparser"
        },
        "host": "fargate-ip-xx-xx-xx-xx.ap-southeast-1.compute.internal",
        "container_name": "xxx"
    }
}
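For what it's worth, the "myparser" regex from the ConfigMap above does match the raw "log" value in this record. A minimal offline check in Python (using Python's re module as a stand-in for Onigmo, the engine Fluent Bit actually uses; the two dialects agree for this simple pattern):

```python
import re

# The myparser regex from parsers.conf above, with named groups in
# Python's (?P<name>...) syntax.
MYPARSER = re.compile(
    r"^(?P<time>[^ ]+) (?P<mystream>stdout|stderr) (?P<logtag>P|F) (?P<log>.*)$"
)

# The unparsed "log" value from the record above.
raw = "2022-02-11T09:01:18.09884086Z stdout F 2022-02-11 16:08:20.887 message"

m = MYPARSER.match(raw)
print(m.groupdict())
# -> time, mystream=stdout, logtag=F, log='2022-02-11 16:08:20.887 message'
```

So the regex itself is not the problem; the annotation parser is simply never applied.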
PettitWesley commented 2 years ago

Thank you for this report. I believe this is an issue in EKS Fargate, not Fluent Bit itself. I will check on this.

PettitWesley commented 2 years ago

EKS Fargate team is aware of the issue. Unfortunately I can't currently give any more information publicly.

PettitWesley commented 2 years ago

https://github.com/aws/containers-roadmap/issues/1625

elite-tech commented 1 year ago

Hello. Are there any updates on this issue?

PettitWesley commented 1 year ago

@elite-tech does the fix mentioned here work for you: https://github.com/aws/containers-roadmap/issues/1625#issuecomment-1231023420

elite-tech commented 1 year ago

Unfortunately no @PettitWesley. The issue I have is related to the Kubernetes filter ignoring parsers set with annotations as defined here.

If I set the annotation "fluentbit.io/parser: cri_ping", the "cri_ping" parser is ignored; the parser defined in the filter definition, "cri", is still used. Below is the filter definition.

    [FILTER]
        Name kubernetes
        Match kube.*
        Merge_Log On
        Merge_Parser cri
        K8S-Logging.Parser On
        K8S-Logging.Exclude On
PettitWesley commented 1 year ago

@elite-tech does this only happen on EKS Fargate or with Fluent Bit in general?

Can you share your full config map and parser definition along with the pod annotation configuration.

elite-tech commented 1 year ago

This happens with EKS Fargate. I have not tried using Fluent Bit directly.

Here is the config map:

kind: ConfigMap
apiVersion: v1
metadata:
  name: aws-logging
  namespace: aws-observability
data:
  flb_log_cw: "true"  #ships fluent-bit process logs to CloudWatch

  output.conf: |
    [OUTPUT]
        Name kinesis_firehose
        Match *
        region us-east-1
        delivery_stream test-I20q76WS

  parsers.conf: |
    [PARSER]
        Name cri
        Format Regex
        Regex ^(?<time>[^ ]+) (stdout|stderr) (P|F) (?<log>.*)$
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z
    [PARSER]
        Name cri_ping
        Format Regex
        Regex ^(?<time>[^ ]+) (stdout|stderr) (P|F) (?<sourcefile>/.+):  (?<log>.*)$
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z

  filters.conf: |
    [FILTER]
        Name kubernetes
        Match kube.*
        Merge_Log On
        Merge_Parser cri
        K8S-Logging.Parser On
        K8S-Logging.Exclude On

Below are the annotations on the Pod

apiVersion: v1
kind: Pod
metadata:
  annotations:
    CapacityProvisioned: 2vCPU 5GB
    Logging: LoggingEnabled
    checksum/config: 7ae2cfd76cd1410753d86645a7fff95455fr402fa43acf715603aadb1688c9de
    fluentbit.io/exclude-log-config-updater: "true"
    fluentbit.io/parser: cri_ping
    kubectl.kubernetes.io/restartedAt: "2023-01-10T13:17:49-06:00"
    kubernetes.io/psp: eks.privileged
PettitWesley commented 1 year ago

@elite-tech thank you. If I get some free time I can try to repro this myself. In the meantime, can you try to repro this on a k8s cluster that's not Fargate (a local cluster, for example)? I'm wondering if the issue is that your cri_ping parser does not actually match the logs. It looks very similar to the normal cri parser, which I think should already be applied to the logs before your annotation parser is. The cri built-in multiline parser should be applied by the tail input, though I don't actually see it in the docs; I will check on this: https://docs.aws.amazon.com/eks/latest/userguide/fargate-logging.html

EDIT/Update: I have confirmed from EKS Fargate team that the real tail input has this:

  multiline.parser  docker, cri

We will get the docs updated. So this means that cri parsing logic will already be applied to your logs before they reach the kubernetes filter.
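Put another way, the managed input on Fargate behaves roughly like the following tail input. This is a sketch for illustration only: the Path value is an assumption based on the file names in the logs above, and the real managed config is not user-editable.

```
[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    multiline.parser  docker, cri
```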

Looking at the code, it appears to use either the annotation parser or the merge_log parser, not both: https://github.com/fluent/fluent-bit/blob/master/plugins/filter_kubernetes/kubernetes.c#L130

I would check the regex in that parser against the actual log content using this site recommended by Fluent community: https://rubular.com/

Do the logs match that regex?
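That regex check can also be scripted. A minimal sketch using Python's re module as a stand-in for Fluent Bit's Onigmo engine (the dialects agree for this pattern); the sample log lines are hypothetical:

```python
import re

# The cri_ping regex from the ConfigMap above: it only matches when the
# message body starts with an absolute path followed by ":  " (colon,
# two spaces).
CRI_PING = re.compile(
    r"^(?P<time>[^ ]+) (stdout|stderr) (P|F) (?P<sourcefile>/.+):  (?P<log>.*)$"
)

# Hypothetical sample lines in CRI format.
samples = [
    "2023-01-10T13:17:49.123Z stdout F /var/www/ping.log:  pong",
    "2023-01-10T13:17:49.123Z stdout F plain message with no path",
]

for line in samples:
    m = CRI_PING.match(line)
    print(m.groupdict() if m else "no match")
# The first line matches (sourcefile='/var/www/ping.log', log='pong');
# the second does not, so those records would keep their raw content.
```

If real log lines land in the second bucket, the annotation parser failing silently would look exactly like the parser being ignored.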

fitchtech commented 1 year ago

@ph-l @PettitWesley I think the issue occurs when the parser name contains an underscore (_). When I set the annotation fluentbit.io/parser: java_multiline, I get [filter:kubernetes:kubernetes.0] annotation parser 'java_multiline' not found

This happens even though a parser named java_multiline exists in parsers.conf. However, if I set the annotation to a parser whose name has no underscore, it works, e.g. fluentbit.io/parser: json

A dash (-) in the name doesn't seem to cause a problem, but I get an error for any parser with an underscore in its name when using the Kubernetes annotation.

ph-l commented 1 year ago

@fitchtech Can you share your Fluent Bit config and your pod manifest? My original example wasn't using underscores. I tried it again today and it's still not working for me.