aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[EKS] [request]: Want to match log records by k8s annotation: `rewrite_tag` filter doesn't work after `kubernetes` filter in Fargate embedded Fluent Bit logging #1697

Open 10hin opened 2 years ago

10hin commented 2 years ago


Tell us about your request I want to use Kubernetes Pod annotations to select the destination log group in CloudWatch Logs.

Which service(s) is this request for? Fargate, EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

I want to change the log routing destination based on Pod annotations.

For example, I annotate the web Pods with app=web, and the Node.js application Pods with app=node-app.

Then I tried the following configuration.
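As a minimal sketch of the annotation described above, a Pod carrying app=web might look like the following. The Pod name, container name, and image are hypothetical placeholders, not taken from the original report; only the annotation key and value come from the issue.

```yaml
# Hypothetical Pod manifest; only the "app: web" annotation is from the issue.
apiVersion: v1
kind: Pod
metadata:
  name: web-example          # placeholder name
  annotations:
    app: web                 # value that the rewrite_tag rule is meant to read
spec:
  containers:
    - name: web              # placeholder container name
      image: nginx           # placeholder image
```

Note that this is an annotation, not a label, so the routing rule has to read it from the annotations map of the Kubernetes metadata.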

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-logging
  namespace: aws-observability
data:
  flb_log_cw: "true"

  output.conf: |
    [OUTPUT]
        Name              cloudwatch_logs
        Match             kube.*
        region            ap-northeast-1
        log_group_name    fluent-bit-cloudwatch
        log_stream_prefix from-fluent-bit-on-fargate-
        auto_create_group true
    [OUTPUT]
        Name              cloudwatch_logs
        Match             log.to.app-log-web.group.*
        region            ap-northeast-1
        log_group_name    app-log-web
        log_stream_prefix from-fluent-bit-on-fargate-
        auto_create_group true
    [OUTPUT]
        Name              cloudwatch_logs
        Match             log.to.app-log-node-app.group.*
        region            ap-northeast-1
        log_group_name    app-log-node-app
        log_stream_prefix from-fluent-bit-on-fargate-
        auto_create_group true

  parsers.conf: |
    [PARSER]
        Name crio
        Format Regex
        Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>P|F) (?<log>.*)$
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z

  filters.conf: |
    [FILTER]
        Name     parser
        Match    kube.*
        Key_name log
        Parser   crio
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Merge_Log           On
        Buffer_Size         0
        Kube_Meta_Cache_TTL 300s
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On
    [FILTER]
        Name  rewrite_tag
        Match kube.*
        Rule $stream ^(std.+)$ log.to.app-log-$kubernetes['annotations']['app'].group.$TAG false

But the above configuration, like every other configuration I tried with rewrite_tag, does not work.

In fact, if I comment out the last [FILTER] section, Fargate logging works well, and the logs of Fluent Bit itself are sent to the <cluster name>-fluent-bit-logs log group. But when I add that [FILTER] section, the Fluent Bit log group does not get a log stream for the restarted Pod.

Are you currently working around this issue? I am trying to use a Fluent Bit sidecar.

Additional context

If I use a similar configuration on a managed node group with Fluent Bit running as a DaemonSet, it works as expected.

Attachments Nothing special.

keperry commented 2 years ago

I just want to point out that if you put rewrite_tag before the kubernetes filter, it does work, but you need a different Rule (field name and regex) in rewrite_tag, since none of your parsed fields will be there yet.

In the example above, you would have to change the rule to something like:

 [FILTER]
        Name  rewrite_tag
        Match kube.*
        Rule $log \"stream\":\s"(std.+?[^\\])\" log.to.app-log-$kubernetes['annotations']['app'].group.$TAG false

This appears to work on a log block like the following:

{
    "level": "info",
    "@timestamp": "2022-06-14T14:37:50.832Z",
    "caller": "awesomepackage/awesomesource.go:42",
    "message": "This is a cool message",
    "applicationName": "my awesome app",
    "stream": "stdthis-isamazing",
    "dataCenter": "exciting-ds",
    "podIp": "9.9.9.9",
    "version": "v0.0.0",
    "env": "TEST"
}

Note: Please TEST the regex for your own scenario - this is just an example. If you need to use a space in your regex, use \s. I hope this saves the next person the hours it took to figure this out.

I feel like something broke this because I tested it a few months ago and I was able to get the rewrite_tag working after the kubernetes filter.

10hin commented 2 years ago

@keperry Thank you for your comment.

In my case, I want to match log records by Kubernetes annotation value, so rewrite_tag has to come after the kubernetes filter: the annotation value is only available once that filter has fetched it from the Kubernetes API.
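To illustrate why the ordering matters, here is a rough sketch of what a record might look like once the kubernetes filter has enriched it, written as YAML for readability. All values are made up, and only a subset of the metadata fields is shown; the exact fields depend on the Fluent Bit version and filter options.

```yaml
# Illustrative record shape after the parser and kubernetes filters (hypothetical values).
stream: stdout               # extracted by the crio parser
logtag: F
log: "GET / 200"
kubernetes:                  # added by the kubernetes filter
  pod_name: web-example
  namespace_name: default
  annotations:
    app: web                 # read by $kubernetes['annotations']['app']
# With the rule from the issue, the new tag would then be:
#   log.to.app-log-web.group.<original tag>
```

Before the kubernetes filter runs, the kubernetes map is absent from the record, so a rule that references $kubernetes['annotations']['app'] has nothing to substitute.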

But your comment made me realize the point of my issue. Thanks.

rimaulana commented 1 year ago

@10hin I tested your configuration with a small modification and it works fine in my case. Here is the config that works for me:

kind: ConfigMap
apiVersion: v1
metadata:
  name: aws-logging
  namespace: aws-observability
data:
  flb_log_cw: "true"  # Set to true to ship Fluent Bit process logs to CloudWatch.
  filters.conf: |
    [FILTER]
        Name parser
        Match *
        Key_name log
        Parser crio
    [FILTER]
        Name kubernetes
        Match kube.*
        Merge_Log On
        Keep_Log Off
        Buffer_Size 0
        Kube_Meta_Cache_TTL 300s
    [FILTER]
        Name          rewrite_tag
        Match         kube.*
        Rule          $stream ^(std.+)$ log.to.app-log-$kubernetes['annotations']['app'].group false
  output.conf: |
    [OUTPUT]
        Name                cloudwatch_logs
        Match               log.to.app-log-nginx.group
        region              us-east-2
        log_group_name      eks-autoscaling-nginx
        log_stream_prefix   from-fluent-bit-
        log_retention_days  60
        auto_create_group   true
  parsers.conf: |
    [PARSER]
        Name crio
        Format Regex
        Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>P|F) (?<log>.*)$
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z

I removed the appended $TAG because it is not used and wastes memory. However, even with $TAG appended and the output Match set to log.to.app-log-nginx.group.*, it still works for me.
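For reference, the $TAG variant mentioned here could be sketched as the following fragment of the same ConfigMap data section. It only recombines values already shown in this thread; the parser and kubernetes filters and parsers.conf are assumed to stay unchanged.

```yaml
  filters.conf: |
    # (parser and kubernetes filters as in the config above)
    [FILTER]
        Name   rewrite_tag
        Match  kube.*
        # Rule format: KEY REGEX NEW_TAG KEEP; "false" drops the record under its old tag
        Rule   $stream ^(std.+)$ log.to.app-log-$kubernetes['annotations']['app'].group.$TAG false
  output.conf: |
    [OUTPUT]
        Name               cloudwatch_logs
        # Trailing wildcard covers the original tag appended by $TAG
        Match              log.to.app-log-nginx.group.*
        region             us-east-2
        log_group_name     eks-autoscaling-nginx
        log_stream_prefix  from-fluent-bit-
        auto_create_group  true
```

Without $TAG in the rule, the rewritten tag is exactly log.to.app-log-nginx.group, so the output Match needs no wildcard.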

Youssef-Beltagy commented 1 year ago

Can you please explain why it works?