aws / aws-for-fluent-bit

The source of the amazon/aws-for-fluent-bit container image
Apache License 2.0

Multiline log guidance #100

Closed HaroonSaid closed 1 year ago

HaroonSaid commented 4 years ago

We have the following configuration

{
      "essential": true,
      "name": "log_router",
      "firelensConfiguration": {
        "type": "fluentbit",
        "options": {
          "enable-ecs-log-metadata": "true"
        }
      },
      "memoryReservation": 50,
      "image": "906394416424.dkr.ecr.${AWS_REGION}.amazonaws.com/aws-for-fluent-bit:latest"
    },

We want multiline logs for stack traces, etc.
How should I configure Fluent Bit?

PettitWesley commented 3 years ago

So I might have made some progress on this...

Key background information: FireLens uses the Fluentd Docker log driver to send your stdout logs to an auto-generated forward input in Fluent Bit: https://aws.amazon.com/blogs/containers/under-the-hood-firelens-for-amazon-ecs-tasks/

Thus, you can very easily simulate FireLens locally just by manually configuring the input and log driver. I did that, using the multiline example in the docs here: https://docs.fluentbit.io/manual/pipeline/filters/multiline-stacktrace
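
For anyone who wants to reproduce this, a minimal local simulation might look like the following (a sketch: the port and image name are placeholders, and FireLens itself wires the log driver to the input over a socket rather than a fixed TCP port):

[INPUT]
    Name   forward
    Listen 0.0.0.0
    Port   24224

[OUTPUT]
    Name  stdout
    Match *

Then point any container at it with the Fluentd log driver:

docker run --log-driver=fluentd --log-opt fluentd-address=127.0.0.1:24224 your-app-image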

All of the details are here: https://github.com/fluent/fluent-bit/issues/4173

Basically, I can't get the multiline example working with the forward input... the example I have created is simple and the upstream maintainers can use it to repro my findings without requiring any access to AWS. We will make sure they take a look at this.

marksumm commented 3 years ago

@PettitWesley Thanks for looking at it.

Given the previous successes when using a second Fluent Bit instance in place of the fluentd Docker log driver, it seems that there is some inconsistency between the behaviour of the log driver and the Fluent Bit forward output.

PettitWesley commented 3 years ago

We seem to have found the root cause: https://github.com/fluent/fluent-bit/issues/4173#issuecomment-939576784

I will have a chat with the lead maintainer on this. However, it really looks like the multiline filter was not architected in a generic way that would actually work with all input plugins... so right now it is hard to say if the fix will be quick.

PettitWesley commented 3 years ago

Hey folks, some good news and some bad news.

First the bad news. I talked to the lead maintainer. Unfortunately, the new core multiline filter is not suitable for all use cases. I am a bit dissatisfied with this; I think we all were expecting that the new multiline filter would be equivalent to the generic Fluentd multiline filter, and that it would work with ECS FireLens.

A warning has been added to the documentation that it does not solve all use cases: https://docs.fluentbit.io/manual/pipeline/filters/multiline-stacktrace

That warning definitely needs to be improved... so that users know which inputs the filter works with.

Essentially, the problem is that from a user experience POV, the generic filter is what folks want. But from an engineering standpoint, in order to be performant and work in all use cases, multiline parsing needs to happen on the input side. Only the input has full context on whether or not a set of logs all should be parsed together, and only in the inputs can we nicely buffer all those logs together.

This means that the long-term or ideal solution for ECS FireLens customers will be:

  1. Implement multiline support in the Forward input (and others)
  2. Make it easier to set input parameters in FireLens. Right now, this is available, but via a hacky workaround: https://aws.amazon.com/blogs/containers/how-to-set-fluentd-and-fluent-bit-input-parameters-in-firelens/

Short Term Solution/The good news: Multiline Parsing in the AWS Outputs

When an output gets records, they come in a chunk. This chunk should be a set of records that were collected in order together, which means we can do multiline parsing on them. This isn't perfect: we won't be able to concatenate multiline logs across chunks. But it's pretty good, and I think it will unblock a lot of folks. So I went ahead and implemented it. I will get this into the upstream soon, but for now you can use my prototype. Please give me feedback on it.

Multiline support has been added to the AWS output plugins.

Multiline Output API

The API is exactly the same as the new multiline API in tail or in the filter:

$ ./bin/fluent-bit  -o cloudwatch_logs -h
...
multiline.parser       specify one or multiple multiline parsers: docker, cri, go, java,
                       etc.
                       > default: default, type: multiple comma delimited strings

multiline.key_content  specify the key name that holds the content to process.
                       > default: default, type: string

Here's an example output configuration which worked for me:

[OUTPUT]
    name                  cloudwatch_logs
    match                 *
    region                us-west-2
    auto_create_group     true
    log_group_name        multiline-test
    log_stream_prefix     test-
    multiline.key_content log
    multiline.parser      go, multiline-regex-test
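
Here, go is one of Fluent Bit's built-in multiline parsers, while multiline-regex-test is a custom [MULTILINE_PARSER] that must be defined in a parsers file; a sketch of such a definition (the rules are illustrative placeholders, not from the original comment):

[MULTILINE_PARSER]
    name          multiline-regex-test
    type          regex
    flush_timeout 1000
    rule      "start_state"  "/^[^\s].*$/"  "cont"
    rule      "cont"         "/^\s+.*$/"    "cont"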

Warnings

Using the prototype

You can see my code here: https://github.com/PettitWesley/fluent-bit/tree/aws_multiline_outputs_2_21_0

And anyone can pull the following image, which is built off of our latest release: https://github.com/aws/aws-for-fluent-bit/releases/tag/v2.21.0

144718711470.dkr.ecr.us-west-2.amazonaws.com/aws_output_multiline:2.21.0
ecs-cli pull --region us-west-2 --registry-id 144718711470  144718711470.dkr.ecr.us-west-2.amazonaws.com/aws_output_multiline:2.21.0

Please try it out and give me feedback.

PettitWesley commented 3 years ago

Hey folks, unfortunately the first version of that image contained a quite egregious memory leak. I've now fixed it and pushed an update to my branch and to the image. The new image SHA is: sha256:cf66c644d75d1897a664544d593e57ef444ee29d0797847454cfe19d1398aeec

tlamr commented 3 years ago

@PettitWesley I'm not sure whether we need such a general solution. For us (and, I believe, for anyone wanting a cloud-native app) it comes down to printing logs as JSON to stdout. The only thing we need is that, if CRI splits a single log line into multiple lines, there is a process that puts the parts back together (see https://github.com/aws/aws-for-fluent-bit/issues/228). I believe this must be much easier to implement, since it only requires checking, in the context of JSON, for the end of parsable JSON data, and my understanding is that it would fix most of the complaints people have.

I understand that anyone outputting something other than JSON has something they need to fix themselves, and the world needs to move on to logs as JSON everywhere :)

Would it be possible to have just the cri multiline parser with some option to try to read whole well-formatted JSON data?

PettitWesley commented 3 years ago

@tlamr multiple fixes are needed to support all multiline log use cases. The new feature in my comments here will help some folks. For your use case, you need the tail multiline bug to be fixed. AWS is tracking that bug and has prioritized it as something we plan to work with the community to fix (hopefully) by the end of the year.

gautam-nutalapati commented 3 years ago

I am trying to send ECS logs to AWS OpenSearch via Copilot, in case the information below helps:

I have two filters, one for multi-line parsing and another for extracting fields. Logs are forwarded to OpenSearch. Field extraction works, but multi-line parsing seems to fail. If I use the exact same configuration on a file input stream (instead of running on an ECS container), multi-line parsing works as well.

custom-parsers.conf:

[MULTILINE_PARSER]
    name          multiline-regex-test
    type          regex
    flush_timeout 1000
    rule      "start_state"   "/^(INFO|DEBUG|ERROR|WARN|TRACE)(.*)$/"  "cont"
    rule      "cont"          "/^\s(.*)$/"                     "cont"

[PARSER]
    Name dummy_test
    Format regex
    Regex ^(?<loglevel>[(INFO |DEBUG|ERROR|WARN |TRACE)]*) \[(?<timestamp>[0-9]{4}\-[0-9]{1,2}\-[0-9]{1,2} [0-9]{1,2}\:[0-9]{1,2}\:[0-9]{1,2},[0-9]{1,3})\] \[(?<trace>[^ ]*)\] (?<log>(.|\s)*)$

fluent-bit.conf in ECS fluent-bit container:

[SERVICE]
    Parsers_File /fluent-bit/parsers/custom-parsers.conf
    Flush 1
    Grace 30

[FILTER]
    Name                  multiline
    Match                 *
    multiline.key_content log
    multiline.parser      multiline-regex-test

[FILTER]
    Name parser
    Match *
    Key_Name log
    Parser dummy_test

fluent-bit.conf to read file:

[SERVICE]
    Parsers_File /fluent-bit/parsers/custom-parsers.conf

[INPUT]
    name                  tail
    path                  /tmp/test-log.log
    read_from_head        true

[FILTER]
    Name                  multiline
    Match                 *
    multiline.key_content log
    multiline.parser      multiline-regex-test

[FILTER]
    Name parser
    Match *
    Key_Name log
    Parser dummy_test

[OUTPUT]
    Name stdout
    Match *

test-log.log

INFO  [2021-10-29 20:27:26,207] [trace-id-1] com.some.class: Some message 123
ERROR [2021-10-29 20:27:26,209] [trace-id-1] com.some.class: com.some.class.health(): Exception in check, Reason: A
ERROR [2021-10-29 20:27:26,210] [trace-id-1] com.some.class.health.AppHealthCheck: 
    com.some.class.health.AppHealthCheck.performHealthChecks(AppHealthCheck.java:58)
    com.some.class.health.BaseAppHealthCheck.healthCheckAspect(BaseAppHealthCheck.java:20)

parsed output of file:

[0] tail.0: [1635543540.958407180, {"loglevel"=>"INFO ", "timestamp"=>"2021-10-29 20:27:26,207", "trace"=>"trace-id-1", "log"=>"com.some.class: Some message 123"}]
[1] tail.0: [1635543540.958428419, {"loglevel"=>"ERROR", "timestamp"=>"2021-10-29 20:27:26,209", "trace"=>"trace-id-1", "log"=>"com.some.class: com.some.class.health(): Exception in check, Reason: A"}]
[2] tail.0: [1635543540.958428638, {"loglevel"=>"ERROR", "timestamp"=>"2021-10-29 20:27:26,210", "trace"=>"trace-id-1", "log"=>"com.some.class.health.AppHealthCheck:
    com.some.class.health.AppHealthCheck.performHealthChecks(AppHealthCheck.java:58)
    com.some.class.health.BaseAppHealthCheck.healthCheckAspect(BaseAppHealthCheck.java:20)"}]
PettitWesley commented 3 years ago

@gautam-nutalapati The Multiline filter is entirely broken for most use cases. It only really works with tail, and if you use tail input you should use the multiline API in that plugin instead of the filter. Please see my last few comments in this issue, or check here: https://github.com/fluent/fluent-bit/issues/4173#issuecomment-939576784
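
For the tail case, a minimal sketch of that tail-native multiline API (the path is a placeholder; docker and cri are built-in parsers):

[INPUT]
    Name             tail
    Path             /var/log/containers/*.log
    multiline.parser docker, cri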

I have discussed a plan/design to fix the multiline filter with the lead Fluent Bit maintainer and I am actively working on this. I expect to have a pre-release version for folks to test within 2 weeks.

PettitWesley commented 3 years ago

Hey folks, good news. I came up with a better plan than the output multiline settings in one of my previous comments.

I figured out how to fix the multiline filter. Basically, it needs to function more like an input that buffers logs, processes them, and then emits them into the pipeline. It now works more like the rewrite_tag filter, where it buffers and then re-emits records at the beginning of Fluent Bit's log pipeline. Unlike rewrite_tag though, it will re-emit records with the same tag, and then will intelligently not process them again when they pass through the filter a second time. That part might need to be changed: if you have multiple filters, the multiline records will pass through each of them twice, once as split records and once as concatenated multilines.

I have discussed this design with the lead Fluent Bit maintainer and he says he will accept this contribution.

Right now, it's not quite ready for merge, but I do have a working prototype to share. Please test it out and give me feedback.

Filter Prototype Code

Code can be found here: https://github.com/PettitWesley/fluent-bit/tree/multiline-proto-2_21_1

Filter Prototype Image

I built an image with my changes on top of AWS for Fluent Bit 2.21.1:

144718711470.dkr.ecr.us-west-2.amazonaws.com/multiline-filter-prototype:2.21.1

Filter Configuration Options

There is a new flush_ms for the buffering of messages, and there are some new config options for the emitter.

$ ./bin/fluent-bit -F multiline -h
Fluent Bit v1.8.9
* Copyright (C) 2019-2021 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

HELP
multiline filter plugin

DESCRIPTION
Concatenate multiline messages

OPTIONS
debug_flush            enable debugging for concatenation flush to stdout
                       > default: false, type: boolean

flush_ms               Flush time for pending multiline records
                       > default: 2000, type: integer

multiline.parser       specify one or multiple multiline parsers: docker, cri, go, java,
                       etc.
                       > default: default, type: multiple comma delimited strings

multiline.key_content  specify the key name that holds the content to process.
                       > default: default, type: string

emitter_name
                       > default: default, type: string

emitter_storage.type
                       > default: memory, type: string

emitter_mem_buf_limit  set a memory buffer limit to restrict memory usage of emitter
                       > default: 10M, type: size
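
For reference, a minimal filter block exercising these options might look like this (a sketch; the parser choice is a placeholder):

[FILTER]
    name                   multiline
    match                  *
    multiline.key_content  log
    multiline.parser       java
    flush_ms               2000
    emitter_storage.type   memory
    emitter_mem_buf_limit  10M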
kkamath-bandwidth commented 3 years ago

I read the thread, and just want to confirm that ECS logs > ~16K characters shipped to 3rd parties (like DataDog/SumoLogic) are expected to consistently fail parsing, and there is no configuration or firelens image I can build to rectify this, correct?

PS: The multilines are handled fine. Also the same logs with >16K characters parse fine locally with fluent-bit.

PettitWesley commented 3 years ago

Hey folks, I have posted the design document for the re-design of the generic filter upstream: https://github.com/fluent/fluent-bit/issues/4309

@kkamath-bandwidth Yes, right now there is no way with Fluent Bit to concatenate logs split at 16KB by the runtime in ECS FireLens. This is part of the design shared above.

karthikeayan commented 3 years ago

@PettitWesley I tried your docker image, and the multiline parser is working. I see three different log entries; I hope this is the expected behaviour.

I hope these 3 entries are from a single Java stack trace.

1.

16:18:10.831 [HikariPool-1 housekeeper] DEBUG com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Pool stats (total=15, active=0, idle=15, waiting=0)

2.

    at org.apache.catalina.session.StandardSession.getAttribute(StandardSession.java:1178) [catalina.jar:8.5.16]
    at org.apache.catalina.session.StandardSessionFacade.getAttribute(StandardSessionFacade.java:103) ~[catalina.jar:8.5.16]
    at com.mycompany.mymodule.listeners.MySessionListener.attributeRemoved(MySessionListener.java:104) ~[myapp-4.0.jar:na]
    at org.apache.catalina.session.StandardSession.removeAttributeInternal(StandardSession.java:1839) [catalina.jar:8.5.16]
    at org.apache.catalina.session.StandardSession.expire(StandardSession.java:879) [catalina.jar:8.5.16]
    at org.apache.catalina.session.StandardSession.isValid(StandardSession.java:682) [catalina.jar:8.5.16]
    at org.apache.catalina.session.ManagerBase.processExpires(ManagerBase.java:552) [catalina.jar:8.5.16]
    at org.apache.catalina.session.ManagerBase.backgroundProcess(ManagerBase.java:537) [catalina.jar:8.5.16]
    at org.apache.catalina.core.StandardContext.backgroundProcess(StandardContext.java:5572) [catalina.jar:8.5.16]
    at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1388) [catalina.jar:8.5.16]
    at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1392) [catalina.jar:8.5.16]
    at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1392) [catalina.jar:8.5.16]
    at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1360) [catalina.jar:8.5.16]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_265]

3.

java.lang.IllegalStateException: getAttribute: Session already invalidated

My multiline-parser configuration:

[MULTILINE_PARSER]
    name          multiline-regex-test
    type          regex
    flush_timeout 1000
    rule          "start_state"   "/^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{3}|^\d{2}:\d{2}:\d{2}.\d{3}|^\d{2}-[a-zA-Z]{3}-\d{4} \d{2}:\d{2}:\d{2}.\d{3})(.*)$/"  "cont"
    rule          "cont"          "/^\s(.*)$/" "cont"
PettitWesley commented 3 years ago

@karthikeayan Given that log #2 has multiple lines in one event, it looks like the parser is doing something... I think something must be wrong with your multiline regexes though, and that's why the start of the stacktrace isn't also concatenated in there too.

Unfortunately, debugging these can often be hard. Use Rubular to test regexes: https://rubular.com/

karthikeayan commented 3 years ago

@PettitWesley Thank you. Unfortunately, we noticed some logs were missing when using the patch image. Until we figure out what exactly the issue is, we will be using the stable FireLens image.

andrewhoplife commented 3 years ago

I am seeking support for how to implement multiline for this project, which is implemented in CDK. I have the Helm chart defined, but I can find no documentation anywhere else from Amazon or the web for this issue, and none of the autofilled options in VS Code seem logical. Thanks in advance!

    new HelmChart(this, 'CloudwatchFluentBitChart', {
      cluster: this.cluster,
      release: 'aws-for-fluent-bit',
      chart: 'aws-for-fluent-bit',
      repository: 'https://aws.github.io/eks-charts',
      namespace: 'eks',
      values: {
        kinesis: {
          enabled: false
        },
        elasticsearch: {
          enabled: false
        },
        firehose: {
          enabled: false
        },
        cloudWatch: {
          logRetentionDays: RetentionDays.TWO_WEEKS,
          logGroupName: "/aws/eks/logs/$(kubernetes['container_name'])",
          logKey: 'log'
        }
      }
    }).node.addDependency(namespaceEks);
PettitWesley commented 3 years ago

@andrewhoplife For EKS, you want the multiline support in tail: https://docs.fluentbit.io/manual/pipeline/inputs/tail#multiline-support

This example looks correct to me; you want to append additional custom parsers to the list: https://github.com/fluent/helm-charts/blob/main/charts/fluent-bit/values.yaml#L214
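
As a rough sketch of what appending a custom parser could look like in the chart values (an assumption: this uses the fluent/helm-charts config.customParsers value from the link above; the parser name and rules are placeholders):

config:
  customParsers: |
    [MULTILINE_PARSER]
        name          my-app-multiline
        type          regex
        flush_timeout 1000
        rule      "start_state"  "/^\d{4}-\d{2}-\d{2}.*/"  "cont"
        rule      "cont"         "/^\s+.*/"                "cont"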

To get specific help with helm or CDK, this is not the right repo. Please open an issue on the CDK or AWS helm chart repos.

jentcain commented 2 years ago

@PettitWesley I tried your docker image 144718711470.dkr.ecr.us-west-2.amazonaws.com/multiline-filter-prototype:2.21.1 and was able to see multiline logs in a single log event in CloudWatch, where previously there were multiple. See the message below, which used to appear as 5 events. Note there is no line delimiter appearing in CloudWatch.

WARNING: An illegal reflective access operation has occurredWARNING: Illegal reflective access by com.google.inject.internal.cglib.core.$ReflectUtils$1 to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)WARNING: Please consider reporting this to the maintainers of com.google.inject.internal.cglib.core.$ReflectUtils$1WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operationsWARNING: All illegal access operations will be denied in a future release

damianomagro commented 2 years ago

@PettitWesley I tried your docker image too, and the multiline filter is now working (we are using the datadog output). How long will we have to wait for a stable release?

PettitWesley commented 2 years ago

Buffered Mode is now released: https://github.com/aws/aws-for-fluent-bit/releases/tag/v2.22.0

(Buffered mode is the default... and it basically just means that now the filter "works as expected").

For some reason the docs were only merged into the 1.9 pre-release page, but it's in our latest release: https://docs.fluentbit.io/manual/v/1.9-pre/pipeline/filters/multiline-stacktrace

magg commented 2 years ago

@PettitWesley can you share how to configure buffered mode in Fluent Bit with multiline support? I know the AWS image already has it preconfigured. According to this thread and your link I should enable buffered mode, but the official Fluent Bit documentation does not mention the valid values for buffer: is it on/off or true/false?

[MULTILINE_PARSER]
    name          multiline-regex-test
    type          regex
    flush_timeout 1000
    Time_Key time
    Time_Format %d-%m-%y %H:%M:%S.%L
    #
    # Regex rules for multiline parsing
    # ---------------------------------
    #
    # configuration hints:
    #
    #  - first state always has the name: start_state
    #  - every field in the rule must be inside double quotes
    #
    # rules |   state name  | regex pattern                  | next state
    # ------|---------------|--------------------------------------------
    rule      "start_state"   "/^\[(?<time>[^\]]*)\] \[(?<source>[^\]]*)\] \[(?<level>[^\]]*)\] \[(?<uuid>[^\]]*)\] (?<log>.+)$/"  "cont"
    rule      "cont"          "/^\s+at.*/"                     "cont"
[SERVICE]
    HTTP_Server  On
    Parsers_File fluent-bit-parsers.conf
    Flush 5
[INPUT]
    Name forward
[FILTER]
    Match *
    Name multiline
    multiline.key_content log
    multiline.parser      java, multiline-regex-test
    buffer on

Do you think this configuration will work?

7nwwrkdnakht3 commented 2 years ago

I have a very strange issue.

  1. I run an ECS container with supervisord as the root user. supervisord calls Fluent Bit, which ships logs to S3 and Logz.io. Having tested the multiline configuration against stdout locally, it works fine. However, Fluent Bit does not work as the initial command.
  2. Logging into ECS and executing the same command, without altering any configuration files, makes multiline work.
PettitWesley commented 2 years ago

@magg

the valid values for buffer: is it on/off or true/false?

on/off and true/false (case-insensitive) are all supported for Fluent Bit bool config values.

Your configuration looks correct to me. It looks like my doc changes only got deployed for the 1.9 pre-release: https://docs.fluentbit.io/manual/v/1.9-pre/pipeline/filters/multiline-stacktrace

@7nwwrkdnakht3 Can you please open a separate issue to troubleshoot your problem, and provide us with more details, like the FB config?

magg commented 2 years ago

hi @PettitWesley I have successfully tested the multiline parser with both images (the AWS image and the Fluent Bit one) on ECS containers in Fargate, sending logs to OpenSearch.

The downside is that when I used a regular parser, I was able to match the fields I declared in my regex (i.e. time, source, level, uuid) and see them in OpenSearch.

This regular parser uses the same regex as the multiline parser in my comment above:

[PARSER]
    Name main_parser
    Format regex
    Regex ^\[(?<time>[^\]]*)\] \[(?<source>[^\]]*)\] \[(?<level>[^\]]*)\] \[(?<uuid>[^\]]*)\] (?<log>.+)$
    Time_Key time
    Time_Format %d-%m-%y %H:%M:%S.%L

results into

{
  "_index": "okay-logs-2022.03.06",
  "_type": "_doc",
  "_id": "Pw2JXX8BrnHC10AfjuOW",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2022-03-06T04:42:49.699Z",
    "source": "multi-tenant-gateway",
    "level": "INFO",
    "uuid": "4a12f88f-2f85-44aa-aaa6-81998551da82",
    "log": "[http-nio-9080-exec-2] c.p.g.s.t.i.AsyncTenantCallbackService - Posting message to Tenant#10000."
  },
  "fields": {
    "@timestamp": [
      "2022-03-06T04:42:49.699Z"
    ]
  },
  "sort": [
    1646541769699
  ]
}

but now with multiline parsers I get this in OpenSearch:

{
  "_index": "okay-logs-2022.03.11",
  "_type": "_doc",
  "_id": "Ww1TeX8BrnHC10AfP-zR",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2022-03-11T14:12:49.000Z",
    "level": "INFO",
    "logger": "com.protectoria.gateway.service.tenant.impl.AsyncTenantCallbackService",
    "thread": "http-nio-9080-exec-6",
    "message": "[11-03-22 14:12:49.703] [multi-tenant-gateway] [INFO] [bb973f29-3954-4d94-b7b4-d477f046c3a9] [http-nio-9080-exec-6] c.p.g.s.t.i.AsyncTenantCallbackService - Posting message to Tenant#10000.\n"
  },
  "fields": {
    "@timestamp": [
      "2022-03-11T14:12:49.000Z"
    ]
  },
  "sort": [
    1647007969000
  ]
}

Do you know whether matching is not enabled for multiline parsers, or why my configuration can't match the fields I put in my regex?

PettitWesley commented 2 years ago

@magg I think your use case is parsing the log to split the data out into separate fields? That's a use case for the "normal" parsers and the normal parser filter. This is not a use case for multiline: multiline is for joining multiple events into one event.

Also, you can use both. You can have two filters: concatenate events together with multiline first, and then split out fields from the log using the normal parser, or the reverse.

magg commented 2 years ago

@PettitWesley yes, I get the idea. I'm able to see the full Java stack trace thanks to the multiline parser.

But all my log lines, including the first line of the stack trace, have the format [timestamp][service][level][thread][class], and I wanted to extract those into fields for a better experience in OpenSearch. I want both options.

Do you have any example of using both filters? I cannot find one anywhere :(

trallnag commented 2 years ago

So from what I understand, it is currently not possible to use the tail input plugin with multiline.parser cri at the same time as the multiline filter plugin?

I'm using AWS EKS and I want to merge Java stack trace records into a single log record in CloudWatch, but I'm failing to do so.

[INPUT]
  name              tail
  tag               application.*
  path              /var/log/containers/*.log
  multiline.parser  cri
  skip_long_lines   on
  refresh_interval  10
  rotate_wait       30
  db                /var/fluent-bit/state/input-kube.db
  db.locking        true
  storage.type      filesystem
  mem_buf_limit     32MB

[FILTER]
  name                 kubernetes
  match                application.*
  kube_tag_prefix      application.var.log.containers.
  merge_log            on
  keep_log             off
  k8s-logging.parser   on
  k8s-logging.exclude  on
  labels               on
  annotations          off
  use_kubelet          true
  buffer_size          0
PettitWesley commented 2 years ago

@magg

Using two filters is just a matter of having two filter definitions, like:


[FILTER]
     Name multiline
     Match *
     ....

[FILTER]
     Name parser
     Match *
     ....
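
Filled in with the definitions from earlier in this thread (a sketch reusing magg's multiline-regex-test and main_parser), that could look like:

[FILTER]
    Name                  multiline
    Match                 *
    multiline.key_content log
    multiline.parser      java, multiline-regex-test

[FILTER]
    Name     parser
    Match    *
    Key_Name log
    Parser   main_parser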
PettitWesley commented 2 years ago

@trallnag

So from what I understand, it is currently not possible to use the tail input plugin with multiline.parser cri at the same time as the multiline filter plugin?

You should be able to use both the tail multiline functionality and also the filter. You shouldn't use the same multiline parser with each; use a different parser for each, and it should work.

trallnag commented 2 years ago

Yes! It works! Thanks a lot @PettitWesley; no idea what I did wrong last time I tried it. I will add my config here, maybe it can help someone in the future.

Logs before getting it to work:

{"log": "com.netflix.discovery.shared.transport.TransportException: Cannot execute request on any known server"}
{"log": "\tat com.netflix.discovery.shared.transport.decorator.RetryableEurekaHttpClient.execute(RetryableEurekaHttpClient.java:112) ~[eureka-client-1.9.17.jar!/:1.9.17]"}

Now I add the multiline filter to my config:

[INPUT]
    name              tail
    tag               kube.*
    path              /var/log/containers/*.log
    multiline.parser  cri
    skip_long_lines   on
    refresh_interval  10
    rotate_wait       30
    db                /var/fluent-bit/state/input-kube.db
    db.locking        true
    storage.type      filesystem
    mem_buf_limit     32MB
    buffer_max_size   128k

[FILTER]
    name multiline
    match kube.*
    multiline.parser java
    multiline.key_content log

And it merges the stack traces. Here is a screenshot from CloudWatch:

[screenshot: merged stack trace shown as a single CloudWatch event]

Relevant line from fluent-bit logs:

[2022/03/31 21:59:02] [ info] [filter:multiline:multiline.0] created new multiline stream for tail.0_kube.var.log.containers[...]

Edit: As a "micro-optimization", you can also set buffer off on the multiline filter, as it is not necessary for the tail input.

PettitWesley commented 2 years ago

Note please see: https://github.com/fluent/fluent-bit/issues/5235

If you have more than one multiline filter definition and they match the same records, it can cause all sorts of trouble. I am trying to figure out how to fix this.

tatsuo48 commented 2 years ago

I want to combine docker logs.

https://docs.fluentbit.io/manual/v/1.9-pre/pipeline/filters/multiline-stacktrace

If you wish to concatenate messages read from a log file, it is highly recommended to use the multiline support in the Tail plugin itself. This is because performing concatenation while reading the log file is more performant. Concatenating messages originally split by Docker or CRI container engines, is supported in the Tail plugin.

Because of this caution, I want to write multiline.parser docker in INPUT, but my INPUT is forward. Is there any effort to make multiline.parser docker available in the forward INPUT?

PettitWesley commented 2 years ago

@tatsuo48 that statement only applies to tail, because tail gets the logs all at once in chunks read from the file and so it's most efficient to do the multiline processing there. For the forward plugin, based on my understanding of the underlying code, there shouldn't really be much of a difference between implementing buffering and multiline concat directly in forward vs. in my filter. The way filters actually work, they run in the same context as the input; they're sort of like extensions attached to an input, if that makes sense/helps. So please just use the filter :)
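
For reference, a minimal sketch of that forward-plus-filter setup (the port and parser choice are placeholders; docker and java are built-in multiline parsers):

[INPUT]
    Name   forward
    Listen 0.0.0.0
    Port   24224

[FILTER]
    name                  multiline
    match                 *
    multiline.key_content log
    multiline.parser      docker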

tatsuo48 commented 2 years ago

@PettitWesley

that statement only applies to tail, because tail gets the logs all at once in chunks read from the file and so it's most efficient to do the multiline processing there.

I understand what you are saying. I will try it with the FILTER. Thank you!

James96315 commented 2 years ago

(quoting @trallnag's tail + multiline filter config above)

If you set "Path_Key" in the input, can you still collect logs?

trallnag commented 2 years ago

Btw I'm having issues with multiline atm:


@James96315, I don't see why it shouldn't work. But I haven't tried it.

PettitWesley commented 2 years ago

@James96315 yea Path_Key should work

James96315 commented 2 years ago

multiline.parser java
multiline.key_content log

I use "amazon/aws-for-fluent-bit:2.24.0", if I add "path_key" in "[INPUT]", it can't work. I got some info from fluent bit log:

2022-05-25T14:04:37.249061747Z stderr F [2022/05/25 14:04:37] [debug] [input:tail:tail.0] scan_blog add(): dismissed: /var/log/containers/app-spring-boot-demo-6d5dbf7b55-zm5jm_spring-boot-ns_app-spring-boot-demo-4178b6d80bd2adaeee57038e894318cbb7ba81971a75bd3cccd861eca856baa6.log, inode 31464022
2022-05-25T14:04:37.249067397Z stderr F [2022/05/25 14:04:37] [debug] [input:tail:tail.0] 0 new files found on path '/var/log/containers/app-spring-boot-demo*'
2022-05-25T14:04:40.544133604Z stderr F [2022/05/25 14:04:40] [debug] [input:tail:tail.0] inode=31464022 events: IN_MODIFY
2022-05-25T14:04:40.54419313Z stderr F [2022/05/25 14:04:40] [debug] [input chunk] skip ingesting data with 0 bytes
2022-05-25T14:04:40.545447254Z stderr F [2022/05/25 14:04:40] [debug] [input:tail:tail.0] inode=31464022 events: IN_MODIFY
2022-05-25T14:04:40.545687975Z stderr F [2022/05/25 14:04:40] [debug] [input chunk] skip ingesting data with 0 bytes
2022-05-25T14:04:40.545724571Z stderr F [2022/05/25 14:04:40] [debug] [input chunk] skip ingesting data with 0 bytes
2022-05-25T14:04:40.545737097Z stderr F [2022/05/25 14:04:40] [debug] [input chunk] skip ingesting data with 0 bytes
2022-05-25T14:04:40.545739314Z stderr F [2022/05/25 14:04:40] [debug] [input chunk] skip ingesting data with 0 bytes
2022-05-25T14:04:40.545741563Z stderr F [2022/05/25 14:04:40] [debug] [input chunk] skip ingesting data with 0 bytes
2022-05-25T14:04:40.545768163Z stderr F [2022/05/25 14:04:40] [debug] [input chunk] skip ingesting data with 0 bytes
2022-05-25T14:04:40.545776386Z stderr F [2022/05/25 14:04:40] [debug] [input chunk] skip ingesting data with 0 bytes
2022-05-25T14:04:40.545779263Z stderr F [2022/05/25 14:04:40] [debug] [input chunk] skip ingesting data with 0 bytes
2022-05-25T14:04:40.545788355Z stderr F [2022/05/25 14:04:40] [debug] [input chunk] skip ingesting data with 0 bytes

Here is my config:

  fluent-bit.conf: |
    [SERVICE]
        Flush                       5
        Daemon                      off
        Log_level                   Debug
        Http_server                 On
        Http_listen                 0.0.0.0
        Http_port                   2022
        Parsers_File                parsers.conf
        storage.path                /var/fluent-bit/state/flb-storage/
        storage.sync                normal
        storage.checksum            Off
        storage.backlog.mem_limit   5M

    [INPUT]
        Name                tail
        Tag                 kube.var.log.containers.spring-boot.*
        Path                /var/log/containers/app-spring-boot-demo*
        Path_Key            file_name
        Skip_Long_Lines     On
        multiline.parser    cri
        DB                  /var/fluent-bit/state/flb_container-spring-boot.db
        DB.locking          true
        Docker_Mode         Off

        Mem_Buf_Limit       50MB
        Buffer_max_size     64K
        Refresh_Interval    10
        Rotate_Wait         30
        Storage.type        filesystem
        Read_from_Head      True

    [FILTER]
        Name                    multiline
        Match                   kube.var.log.containers.spring-boot.*
        multiline.key_content   log
        Multiline.parser        java

    [OUTPUT]
        Name                kinesis_streams
        Match               kube.var.log.containers.spring-boot.*
        Region              ap-south-1
        Stream              kds-spring-boot
        Retry_Limit         False

The sample log is as follows:

2022-05-23T16:01:30.941156659Z stdout F 2022-05-23 16:01:30.940 INFO  [http-nio-0.0.0.0-8080-exec-6] com.demo.petstore.PetstoreApplication : nginx forward
2022-05-23T16:01:30.944619148Z stdout F 2022-05-23 16:01:30.943 ERROR [http-nio-0.0.0.0-8080-exec-6] com.demo.petstore.PetstoreApplication : hello processing failed
2022-05-23T16:01:30.944654786Z stdout F java.lang.RuntimeException: bad request
2022-05-23T16:01:30.94465975Z stdout F  at com.demo.petstore.PetstoreApplication.hello(PetstoreApplication.java:24)
2022-05-23T16:01:30.944663547Z stdout F         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2022-05-23T16:01:30.944666992Z stdout F         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2022-05-23T16:01:30.944673364Z stdout F         at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2022-05-23T16:01:30.944677207Z stdout F         at java.base/java.lang.reflect.Method.invoke(Method.java:566)
2022-05-23T16:01:30.94468074Z stdout F  at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
2022-05-23T16:01:30.944683631Z stdout F         at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:150)
2022-05-23T16:01:30.944687277Z stdout F         at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:117)
2022-05-23T16:01:30.944699432Z stdout F         at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:895)
2022-05-23T16:01:30.944702574Z stdout F         at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:808)

If the parsing were correct, the number of output log records would be 2, but it has not been successful.

PettitWesley commented 2 years ago

@James96315 Thanks for the report. Just to be clear: when you remove Path_Key, you see these records concatenated? Can you share an example of the concatenated/unconcatenated results, please?

James96315 commented 2 years ago

Concatenated.

  1. If I remove "Path_Key", I also have to set "Buffer False", otherwise the pod will crash.

    [FILTER]
        Name                    multiline
        Match                   kube.var.log.containers.spring-boot.*
        multiline.key_content   log
        Multiline.parser        java
        Buffer                  False

  2. Even if I set the buffer to False and remove the Path_Key, the log parsing is not right. The number of parsed records is sometimes 3, sometimes 5, which seems to be random. The logs keep getting split and are not parsed correctly.

  3. The correct parsing result should be as follows, just 2 records:

2022-05-23 16:01:30.940 INFO  [http-nio-0.0.0.0-8080-exec-6] com.demo.petstore.PetstoreApplication : nginx forward

2022-05-23 16:01:30.943 ERROR [http-nio-0.0.0.0-8080-exec-6] com.demo.petstore.PetstoreApplication : hello processing failed java.lang.RuntimeException: bad request at com.demo.petstore.PetstoreApplication.hello(PetstoreApplication.java:24) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:150) at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:117) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:895) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:808)

dschaaff commented 2 years ago

https://github.com/fluent/fluent-bit/issues/5323#issuecomment-1138893372

clouddev-code commented 2 years ago

Is it possible to pass logs first through the multiline filter and then through the rewrite_tag filter? It looks like the multiline stream is running twice, per the Fluent Bit container logs:

[2022/08/30 03:03:28] [ info] [filter:multiline:multiline.1] created new multiline stream for forward.1_app-firelens-0955bd55b48543688b97f84fd50df999
[2022/08/30 03:03:38] [ info] [filter:multiline:multiline.1] created new multiline stream for emitter.7_application_app-firelens-0955bd55b48543688b97f84fd50df999

AWS for Fluent Bit version 2.24.0

[FILTER]
   name                  multiline
   match                 *
   multiline.key_content log
   multiline.parser      multiline-regex-test

[FILTER]
   Name rewrite_tag
   Match *-firelens-*
   Rule  $log ^(.*\[access\]).*$ access_$TAG false
PettitWesley commented 2 years ago

@clouddev-code https://github.com/aws/aws-for-fluent-bit/blob/mainline/troubleshooting/debugging.md#rewrite_tag-filter-and-cycles-in-the-log-pipeline

Rewrite tag moves data to the head of the pipeline, so it moves through the filter again with the new tag.

TWCM commented 1 year ago

Hi, I'm looking for a guide to multiline for EKS Fargate logging. Here is the official documentation.

Is it possible to apply the multiline parser in the built-in log router? I checked the document, and it's not allowed to add a multiline filter in filters.conf. Here is the content mentioned in the doc:

When creating the ConfigMap, take into account the following rules that Fargate uses to validate fields:

[FILTER], [OUTPUT], and [PARSER] are supposed to be specified under each corresponding key. For example, [FILTER] must be under filters.conf. You can have one or more [FILTER]s under filters.conf. The [OUTPUT] and [PARSER] sections should also be under their corresponding keys. By specifying multiple [OUTPUT] sections, you can route your logs to different destinations at the same time.

Fargate validates the required keys for each section. Name and match are required for each [FILTER] and [OUTPUT]. Name and format are required for each [PARSER]. The keys are case-insensitive.

Environment variables such as ${ENV_VAR} aren't allowed in the ConfigMap.

The indentation has to be the same for either directive or key-value pair within each filters.conf, output.conf, and parsers.conf. Key-value pairs have to be indented more than directives.

Fargate validates against the following supported filters: grep, parser, record_modifier, rewrite_tag, throttle, nest, modify, and kubernetes.

Fargate validates against the following supported output: es, firehose, kinesis_firehose, cloudwatch, cloudwatch_logs, and kinesis.

At least one supported Output plugin has to be provided in the ConfigMap to enable logging. Filter and Parser aren't required to enable logging.

PettitWesley commented 1 year ago

@TWCM Supporting custom multiline parsers on EKS Fargate would be a feature request, can you please open a request here for that: https://github.com/aws/containers-roadmap/issues

And @ mention me in it.

PettitWesley commented 1 year ago

I'm going to close this issue as it is very old and full multiline support was launched last year. Please open a new issue for new multiline issues or requests.

svrviny1324 commented 1 year ago

@PettitWesley hi, here I am trying to use the multiline parser to merge logs which are related to the same pod. Below is my log format; before using the multiline parser I can view logs in CloudWatch in the below format:

{ "log": "2023-04-28T09:42:39.72883634Z stderr F [2023/04/28 09:42:39] [ info] [input:tail:tail.2] multiline core started", "kubernetes": { "pod_name": "fca-de-green-kafka-consumer-offset-resetter-6cf9856b8-5jffb", "namespace_name": "", "pod_id": "", "host": "", "container_name": "resetter", "docker_id": "", "container_hash": "", "container_image": "" } } { "log": "2023-04-28T09:42:39.729908443Z stderr F [2023/04/28 09:42:39] [ info] [input:systemd:systemd.3] seek_cursor=s=7f2279a6a5d640418ee14fca72a59e8a;i=f38... OK", "kubernetes": { "pod_name": "fca-de-green-kafka-consumer-offset-resetter-6cf9856b8-5jffb", "namespace_name": "", "pod_id": "", "host": "", "container_name": "resetter", "docker_id": "", "container_hash": "", "container_image": "" } }

Here I have modified my config by adding the multiline filter to merge logs. ConfigMap:

[SERVICE]
    Flush        2
    Log_Level    info
    Daemon       Off
    Parsers_File parsers.conf
    Parsers_File custom_parsers.conf
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020

[INPUT]
    Name                tail
    Tag                 application.*
    Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
    Path                /var/log/containers/*.log
    Docker_Mode         On
    Docker_Mode_Flush   5
    Docker_Mode_Parser  container_firstline
    #Parser              docker
    multiline.parser    docker
    DB                  /var/fluent-bit/state/flb_container.db
    Mem_Buf_Limit       50MB
    Skip_Long_Lines     On
    Refresh_Interval    10
    Rotate_Wait         30
    Read_from_Head      Off

[INPUT]
    Name                tail
    Tag                 application.*
    Path                /var/log/containers/fluent-bit*
    #Parser              docker
    multiline.parser    docker
    DB                  /var/fluent-bit/state/flb_log.db
    Mem_Buf_Limit       5MB
    Skip_Long_Lines     On
    Refresh_Interval    10
    Read_from_Head      Off

[INPUT]
    Name                tail
    Tag                 application.*
    Path                /var/log/containers/cloudwatch-agent*
    Docker_Mode         On
    Docker_Mode_Flush   5
    Docker_Mode_Parser  cwagent_firstline
    #Parser              docker
    multiline.parser    docker
    DB                  /var/fluent-bit/state/flb_cwagent.db
    rotate_wait         15
    Mem_Buf_Limit       15MB
    Skip_Long_Lines     On
    Refresh_Interval    25
    #Read_from_Head      true

[FILTER]
    Name                kubernetes
    Match               application.*
    Kube_URL            https://kubernetes.default.svc:443
    Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
    Kube_Tag_Prefix     application.var.log.containers.
    Merge_Log           On
    Merge_Log_Trim      Off
    Merge_Log_Key      log
    K8S-Logging.Parser  On
    K8S-Logging.Exclude Off
    Labels              Off
    Annotations         Off

[FILTER]
    name                  multiline
    match                 application.*
    multiline.key_content log
    multiline.parser      docker

[OUTPUT]
    Name                cloudwatch_logs
    Match               application.*
    region              ${REGION}
    log_group_name      /aws/containerinsights/${CLUSTER_NAME}/application
    log_stream_prefix   ${APP_POD_NAMESPACE}-
    auto_create_group   true
    extra_user_agent    container-insights

custom_parsers.conf: |
    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%LZ

    [PARSER]
        Name                cwagent_firstline
        Format              regex
        Regex               (?<log>(?<="log":")\d{4}[\/-]\d{1,2}[\/-]\d{1,2}[ T]\d{2}:\d{2}:\d{2}(?!\.).?)(?<!\\)".(?<stream>(?<="stream":").?)".(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ

Logs got merged, but I want to append \n between each log line. How can I do it? Please can you help with this, @PettitWesley?

After using the multiline filter:
    {

"log": "2023-04-27T10:13:27.942291886Z stdout F Kafka Mirror Maker consumer configuration:2023-04-27T10:13:27.950593993Z stdout F # Bootstrap servers2023-04-27T10:13:27.950616934Z stdout F bootstrap.servers=adept-ingress-kafka-bootstrap.adept:90922023-04-27T10:13:27.950621654Z stdout F # Consumer group2023-04-27T10:13:27.950625174Z stdout F group.id=adept-rtti-mirrormaker2023-04-27T10:13:27.950628244Z stdout F # Provided configuration2023-04-27T10:13:27.950631164Z stdout F auto.commit.interval.ms=30002023-04-27T10:13:27.950633984Z stdout F auto.offset.reset=latest2023-04-27T10:13:27.950636494Z stdout F enable.auto.commit=true2023-04-27T10:13:27.950638424Z stdout F isolation.level=read_committed2023-04-27T10:13:27.950640334Z stdout F 2023-04-27T10:13:27.950642154Z stdout F 2023-04-27T10:13:27.950644074Z stdout F security.protocol=PLAINTEXT2023-04-27T10:13:27.950645995Z stdout F 2023-04-27T10:13:27.950647745Z stdout F 2023-04-27T10:13:27.950649485Z stdout F 2023-04-27T10:13:27.950730916Z stdout F 2023-04-27T10:13:27.950740377Z stdout F Kafka Mirror Maker producer configuration:2023-04-27T10:13:27.957949969Z stdout F # Bootstrap servers2023-04-27T10:13:27.95796353Z stdout F bootstrap.servers=rtti-ingress-kafka-bootstrap:90922023-04-27T10:13:27.95796602Z stdout F # Provided configuration2023-04-27T10:13:27.95796868Z stdout F batch.size=327682023-04-27T10:13:27.95797073Z stdout F compression.type=snappy2023-04-27T10:13:27.95797277Z stdout F linger.ms=1002023-04-27T10:13:27.95797469Z stdout F 2023-04-27T10:13:27.95797654Z stdout F 2023-04-27T10:13:27.95797891Z stdout F security.protocol=PLAINTEXT2023-04-27T10:13:27.9579808Z stdout F 2023-04-27T10:13:27.95798263Z stdout F 2023-04-27T10:13:27.95798444Z stdout F 2023-04-27T10:13:27.958075052Z stdout F 2023-04-27T10:13:33.105771564Z stdout F 2023-04-27 10:13:33,105 INFO Starting readiness poller (io.strimzi.mirrormaker.agent.MirrorMakerAgent) [main]", "kubernetes": { "pod_name": "green-mirror-maker-5df8d48f89-6ngsw", "namespace_name": "realtimetraffic", "pod_id": "", "host": "", "container_name": "green-mirror-maker", "docker_id": "", "container_hash": "", "container_image": "" } }