aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[EKS][Fargate]: Kubernetes filter support in fluentbit logging #1203

Closed: itsprdsw closed this issue 3 years ago

itsprdsw commented 3 years ago


Tell us about your request: Usage of the Parser filter to enrich logs in Fargate logging

Which service(s) is this request for? Fargate on EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? We are collecting logs from Fargate pods to CloudWatch Logs using the Fluent Bit log router provided by AWS EKS. The documentation says that Parser is a supported filter in filters.conf (and it is indeed possible to define parsers), but we are not able to send logs enriched by this filter to CloudWatch.

Fargate only validates the Filter, Output, and Parser specified in the Fluent Conf. Any sections provided other than Filter, Output, and Parser are ignored. Fargate validates against the following supported filters: grep, kubernetes, parser, record_modifier, rewrite_tag, throttle, nest, and modify.
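
As an illustration of the validation rule above (not from the original report), a minimal filters.conf fragment using another of the supported filters, record_modifier, to stamp every record with a static key; the key name cluster_name and the value my-eks-cluster are placeholders:

  filters.conf: |
    [FILTER]
        Name record_modifier
        Match *
        Record cluster_name my-eks-cluster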

We are trying to use the following ConfigMap to send logs to CloudWatch (the parser is the same as the Container Runtime Interface (CRI) parser provided by Fluent Bit):

kind: ConfigMap
apiVersion: v1
metadata:
  name: aws-logging
  namespace: aws-observability
data:
  output.conf: |
    [OUTPUT]
        Name cloudwatch_logs
        Match *
        region eu-west-1
        log_group_name fluent-bit-cloudwatch
        log_stream_name test2
        auto_create_group On

  filters.conf: |
    [FILTER]
        Name grep
        Match *
        Regex log INFO

    [FILTER]
        Name parser
        Match *
        Key_name log
        Parser testparser
        Reserve_Data On
        Preserve_Key On    

  parsers.conf: |
    [PARSER]
        Name testparser
        Format regex
        Regex (?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z

With this configuration, logs are not sent to CloudWatch.

However, if the [FILTER] section with the parser is removed (only grep filter is kept), logs like the following one are received by CloudWatch:

{
    "log": "2020-12-16T18:20:25.270211594Z stdout F 2020-12-16 18:20:25.2688|INFO|Microsoft.Hosting.Lifetime||Now listening on: http://[::]:80|"
}

If we try the same configuration in a Docker container running Fluent Bit with Docker Compose (as recommended in the Fluent Bit documentation for testing a pipeline locally):

docker-compose.yaml:

version: "3.7"

services:
  fluent-bit:
    image: fluent/fluent-bit:1.6.8
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - ./parsersTest.conf:/fluent-bit/etc/parsersTest.conf

fluent-bit.conf:

[SERVICE]
    Log_level debug
    Parsers_File /fluent-bit/etc/parsersTest.conf

[INPUT]
    Name dummy
    Dummy {"log": "2020-12-16T18:20:25.270211594Z stdout F 2020-12-16 18:20:25.2688|INFO|Microsoft.Hosting.Lifetime||Now listening on: http://[::]:80|"}

[FILTER]
    Name grep
    Match *
    Regex log INFO

[FILTER]
    Name parser
    Match *
    Key_name log
    Parser testparser
    Reserve_Data On
    Preserve_Key On 

[OUTPUT]
    Name stdout
    Match *

parsersTest.conf:

[PARSER]
    Name testparser
    Format regex
    Regex (?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)
    Time_Key time
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z 

The expected output is achieved:

fluent-bit_1  | [0] dummy.0: [1608142825.270211594, {"stream"=>"stdout", "logtag"=>"F", "message"=>"2020-12-16 18:20:25.2688|INFO|Microsoft.Hosting.Lifetime||Now listening on: http://[::]:80|", "log"=>"2020-12-16T18:20:25.270211594Z stdout F 2020-12-16 18:20:25.2688|INFO|Microsoft.Hosting.Lifetime||Now listening on: http://[::]:80|"}]

So, it seems that something is not working with the parser filter in Fargate logging. Either a fix is needed here or, if the problem is a misconfiguration on our side, the documentation should explain how to configure the parser filter (or state clearly that the parser filter is not supported).

Are you currently working around this issue? Currently, we are using sidecar containers with Fluentd to send logs to CloudWatch, but getting rid of them is the main reason we want the built-in Fargate logging feature.
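
For context, a rough sketch of what such a Fluentd sidecar setup can look like (not taken from this thread; all names and images are hypothetical, and the Fluentd image would need a CloudWatch output plugin such as fluent-plugin-cloudwatch-logs installed):

apiVersion: v1
kind: Pod
metadata:
  name: app-with-logging-sidecar        # hypothetical name
spec:
  containers:
    - name: app
      image: my-app:latest              # hypothetical app image writing logs to /var/log/app
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
    - name: fluentd
      image: fluent/fluentd             # pick a suitable tag; must include the CloudWatch output plugin
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
          readOnly: true
        - name: fluentd-config
          mountPath: /fluentd/etc       # default config location for the fluent/fluentd image
  volumes:
    - name: app-logs
      emptyDir: {}
    - name: fluentd-config
      configMap:
        name: fluentd-sidecar-config    # hypothetical ConfigMap holding fluent.conf with a CloudWatch output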

Additional context None

visit1985 commented 3 years ago

@mikestef9 This should be classified as a bug, as the docs state that parser is a supported filter.

mikestef9 commented 3 years ago

Hey @visit1985 this has been resolved for new clusters created as of last week. The fix is rolling out to existing clusters over the next few weeks.

itsprdsw commented 3 years ago

Hi @mikestef9, would this fix be available by performing a Kubernetes cluster version upgrade (e.g. from 1.16 to 1.17), or do we just need to wait for the fix to be rolled out to our cluster?

Thanks for the fix 😃

mikestef9 commented 3 years ago

Yes, upgrading a minor version, i.e. 1.16 to 1.17, will pick up the fix. If you are already on 1.18, we will be releasing 1.19 support in the near future, and upgrading to 1.19 will also pick up the fix.

visit1985 commented 3 years ago

@mikestef9 thanks!

The following config gives a good result now.

kind: ConfigMap
apiVersion: v1
metadata:
  name: aws-logging
  namespace: aws-observability
data:
  parsers.conf: |
    [PARSER]
        Name regex
        Format regex
        Regex ^(?<time>[^ ]+) (?<stream>[^ ]+) (?<logtag>[^ ]+) (?<message>.+)$
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z
        Time_Keep On
        Decode_Field_As json message
  filters.conf: |
    [FILTER]
        Name parser
        Match *
        Key_Name log
        Parser regex
        Preserve_Key On
        Reserve_Data On
  output.conf: |
    [OUTPUT]
        Name cloudwatch
        Match *
        region eu-central-1
        log_group_name fluent-bit-cloudwatch
        log_stream_prefix fargate-
        auto_create_group On

I'm just wondering if we can prettify the log-stream name somehow. It would be great to set it to fargate.<container-name>

lvthillo commented 3 years ago

@visit1985 You are probably looking for https://github.com/aws/containers-roadmap/issues/1197 to solve your last issue. I'm facing exactly the same issues (although we are outputting to Kinesis rather than CloudWatch, but that is unrelated). I will create a new cluster tomorrow to test whether this fix also solves our problems!

wunzeco commented 3 years ago

@mikestef9 I have a newly created EKS Fargate cluster running Kubernetes v1.20, and I'm hitting a similar issue. My aim is to send JSON-formatted container logs to CloudWatch. Unfortunately, Fluent Bit for EKS Fargate appears to wrap container logs in a string with the format below.

"<timestamp> stdout F <container log>"

Hence, CloudWatch doesn't recognise logs from my containers as JSON-formatted logs but sees them as strings, e.g.:

2021-06-04T15:01:37.510999495Z stdout F {"@timestamp":"2021-06-04T15:01:37.508Z","@version":"1","message":"169.254.175.250 - - [2021-06-04T15:01:37.508Z] \"GET /ocean-integration/actuator/health HTTP/1.1\" 200 60","method":"GET","protocol":"HTTP/1.1","status":200,"requestedUrl":"GET /ocean-integration/actuator/health HTTP/1.1","uri":"/ocean-integration/actuator/health","remoteAddress":"169.254.175.250","contentLength":60,"requestTime":8,"requestHeaders":{"accept":"*/*","connection":"close","host":"10.1.2.30:8080","user-agent":"kube-probe/1.20+"},"responseHeaders":{"transfer-encoding":"chunked","connection":"close","date":"Fri, 04 Jun 2021 15:01:37 GMT","content-type":"application/vnd.spring-boot.actuator.v3+json"}}

My parser config is pretty simple:

kind: ConfigMap
apiVersion: v1
metadata:
  name: aws-logging
  namespace: aws-observability
data:
  parsers.conf: |
    [PARSER]
        Name docker
        Format json
        Time_Key time
  output.conf: |
    [OUTPUT]
        Name cloudwatch_logs
        Match   *
        region eu-west-1
        log_group_name fluent-bit-cloudwatch
        log_stream_prefix from-fluent-bit-
        auto_create_group true
        log_key log

I wonder if you can point me in the right direction please.

visit1985 commented 3 years ago

@wunzeco, you need to use a regex parser and the Decode_Field_As parameter as shown in my comment above.
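
For reference, the relevant pieces of the config linked above are the regex parser that splits off the CRI prefix and decodes the remaining message as JSON, plus the parser filter applied to the log key:

[PARSER]
    Name regex
    Format regex
    Regex ^(?<time>[^ ]+) (?<stream>[^ ]+) (?<logtag>[^ ]+) (?<message>.+)$
    Time_Key time
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
    Time_Keep On
    Decode_Field_As json message

[FILTER]
    Name parser
    Match *
    Key_Name log
    Parser regex
    Preserve_Key On
    Reserve_Data On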

wunzeco commented 3 years ago

@visit1985 UPDATED: Many thanks for pointing me in the right direction. I can confirm that the config you suggested here (https://github.com/aws/containers-roadmap/issues/1203#issuecomment-768415703) worked as expected.

Interestingly, I noticed a difference in the output plugin in your config, i.e. cloudwatch instead of cloudwatch_logs (which is what I had used). So out of curiosity, I tried my previous config described above (https://github.com/aws/containers-roadmap/issues/1203#issuecomment-857733202) but replaced the cloudwatch_logs plugin with cloudwatch. ~This worked too!~ Nope, it didn't work; logs were not being parsed correctly.

Next, I tried the working config (https://github.com/aws/containers-roadmap/issues/1203#issuecomment-768415703) with the cloudwatch_logs plugin. Nope, cloudwatch_logs didn't cut it!

I have no idea why the cloudwatch_logs plugin behaves differently when it is meant to be equivalent to, and more performant than, the cloudwatch plugin.

flomsk commented 3 years ago

Was anyone able to reach AWS Elasticsearch from an EKS Fargate pod? With my current setup I can't find any indices in ES. I have added this policy to the Fargate pod execution role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "es:*",
            "Resource": "*"
        }
    ]
}

And I have the following ConfigMap:

---
kind: ConfigMap
apiVersion: v1
metadata:
  name: aws-logging
  namespace: aws-observability
data:
  output.conf: |
    [OUTPUT]
        Name es
        Match *
        Host  vpc-***.us-east-1.es.amazonaws.com
        Port  443
        Index my_index
        AWS_Auth On
        AWS_Region <us-east-1>

In the pod annotations I also see that logging is enabled:

Annotations:          CapacityProvisioned: 1vCPU 2GB
                      Logging: LoggingEnabled

What am I missing?

Ankit05012019 commented 3 years ago

I am using EKS on AWS Fargate. I configured logging for the containers running on Fargate using the built-in log router, exactly as described in the documentation:

kind: Namespace
apiVersion: v1
metadata:
  name: aws-observability
  labels:
    aws-observability: enabled

---
kind: ConfigMap
apiVersion: v1
metadata:
  name: aws-logging
  namespace: aws-observability
data:
  output.conf: |
    [OUTPUT]
        Name cloudwatch_logs
        Match   *
        region us-east-2
        log_group_name stage-server-app-cloudwatch
        log_stream_prefix from-fluent-bit-
        auto_create_group true

The configuration works fine initially: I see the log group created and logs being pushed to CloudWatch. After a few hours it stops sending logs to CloudWatch. I can see the container logs being generated, but no new events appear on the CloudWatch log stream. This happens every time; once I restart the pods, new log streams are created and logging again stops within a few hours. What am I missing here?

andreiseceavsp commented 3 years ago

I'm having the same issue as @Ankit05012019. I also opened #1450 for this.

nectarine commented 3 years ago

Same issue as @flomsk. I wish I could get the Fluent Bit logs from the Fargate node 😢

Ankit05012019 commented 3 years ago

Hey @andreiseceavsp, I was able to get around this issue by changing the Fluent Bit output plugin. Earlier I was using the cloudwatch_logs plugin and have now switched to the cloudwatch plugin. Basically the configuration is now the same as mentioned in this post: https://github.com/aws/containers-roadmap/issues/1203#issuecomment-768415703

hope that helps!!
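
Concretely, this amounts to the following (a sketch based on the ConfigMap above, with only the output plugin name changed from cloudwatch_logs to cloudwatch; untested):

  output.conf: |
    [OUTPUT]
        Name cloudwatch
        Match *
        region us-east-2
        log_group_name stage-server-app-cloudwatch
        log_stream_prefix from-fluent-bit-
        auto_create_group true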

andreiseceavsp commented 3 years ago

I switched to the cloudwatch plugin today. It sometimes takes a few days for the issue to reproduce. I appreciate the suggestion; I'll come back with feedback.

nectarine commented 3 years ago

@flomsk If you use fine-grained access control for your ES domain, make sure you add the pod execution role as a backend role for proper access (for testing purposes, all_access could be a good start).

rghose commented 3 years ago

We're facing similar issues on AWS EKS with Kubernetes 1.20 :(

vaibhavkhunger commented 3 years ago

Amazon EKS on AWS Fargate now Supports the Fluent Bit Kubernetes Filter: https://aws.amazon.com/about-aws/whats-new/2021/11/amazon-eks-aws-fargate-supports-fluent-bit-kubernetes-filter/

You can find the technical documentation here: https://docs.aws.amazon.com/eks/latest/userguide/fargate-logging.html#fargate-logging-kubernetes-filter
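
With that support in place, a minimal sketch of how the Kubernetes filter might be added to the aws-logging ConfigMap (untested; the templated log_stream_name relies on the variable syntax of the cloudwatch output plugin, and the region and log group name are placeholders):

  filters.conf: |
    [FILTER]
        Name kubernetes
        Match *
        Merge_Log On
        Keep_Log Off
        Buffer_Size 0
  output.conf: |
    [OUTPUT]
        Name cloudwatch
        Match *
        region eu-west-1
        log_group_name fluent-bit-cloudwatch
        log_stream_name $(kubernetes['namespace_name'])/$(kubernetes['pod_name'])
        auto_create_group true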

bellondr commented 2 years ago

Hi all, is there any best-practices config example? A log like this: "log": "2022-10-31T05:22:51.368004337Z stderr F 2022/10/31 05:22:51 [notice] 1#1: signal 3 (SIGQUIT) received, shutting down" makes no sense on its own; we always filter logs by pod name or by deployment name.

matheus-dati commented 1 month ago

@bellondr did you manage to figure anything out about this? I'm still waiting on a solution, since the logs come through without metadata.