Closed HaroonSaid closed 1 year ago
Fluent Bit unfortunately does not yet have generic multiline logging support that can be used with FireLens. We are planning to work on it. For now, you must use Fluentd: https://github.com/aws-samples/amazon-ecs-firelens-examples/tree/mainline/examples/fluentd/multiline-logs
@zhonghui12, @PettitWesley we are using a FireLens configuration with aws-for-fluent-bit for multi-destination log routing, which includes CloudWatch as one of the destinations. We need multiline log grouping to make the most of our logs. If there is any custom parser we can use to achieve it, that would also be fine.
@belangovan there has been no change in guidance since my last comment on this issue. Fluent Bit still only has multiline support that works when tailing a log file. It does not have generic multiline support that works with FireLens. We are planning to work on that some time in the next few months. Until then, you have to use Fluentd for multiline.
Is this a feature you are planning to work on soon?
Just want to know how to plan for our organization.
Do we switch to Fluentd or wait? If we wait, how long?
@HaroonSaid We have begun investigation for this project. We hope to get it launched within 2 months, however, there are no guarantees.
@PettitWesley Any updates re: whether this project is launching as intended? Debating whether we have to change an internal logging system to support fluentd or if we can wait for fluentbit multiline support to land.
@corleyma The upstream maintainers are working on it, apparently; I've been told that it should be ready/launched sometime in May.
@PettitWesley May... this year? Any update on this? It'd be a very useful feature for us.
@silvervest Yeah it was supposed to be May of this year. Progress has been made upstream but the launch is delayed till sometime in June.
This is launching very soon: https://github.com/fluent/fluent-bit/issues/337#issuecomment-882953961
Just to clarify, is the multi-line support now available for use in this image? Or are we still awaiting that implementation?
Hi @aaronrl95, it was included in v2.18.0.
Ah great, thank you. Could you point me to the documentation around implementing that feature in our firelens configuration? I'm struggling to find any
You can follow this Firelens example.
@hossain-rayhan thank you for that, that's just what I'm looking for
@hossain-rayhan Is this solution also applicable to JSON format logs produced by a Docker container?
@zhonghui12 or @PettitWesley can you answer this?
I assume that if the JSON format logs are split into multiple lines, then they can be concatenated, as there is no obvious limit here: https://docs.fluentbit.io/manual/pipeline/filters/multiline-stacktrace. But maybe @PettitWesley can give a more certain answer here.
Or maybe we should help to test it out.
@hossain-rayhan @zhonghui12 @PettitWesley hi guys, I've been trying to use multiline support to concat partial messages split by containerd (AWS Fargate); however, it didn't work. I've been using the approach described by @hossain-rayhan with the following config:
[SERVICE]
Flush 1
Grace 30
Log_Level debug
[FILTER]
name multiline
match *
multiline.key_content log
multiline.parser cri, docker
Could you please take a look, thanks!
More details on my setup and what I'm trying to achieve: I have a Spring Boot app that logs to stdout using logstash-logback-encoder to log in JSON format (one JSON log entry per line). There's a JSON field called "stack_trace" that may be very long. When a log line is longer than 16k chars (which usually occurs for a stack trace), containerd (AWS Fargate 1.4 runtime) splits it into several parts. Then Fluent Bit receives those JSON parts. At this point I'd like Fluent Bit to merge them and parse them as JSON. However, as I said, this is what I'm failing to get working right now.
@StasKolodyuk you need to create a custom multiline parser, I think. I don't know exactly how to solve this use case with the new multiline support. I suspect that with a custom parser with a custom regex it should be possible.
https://docs.fluentbit.io/manual/pipeline/filters/multiline-stacktrace
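As an untested sketch, a custom parser for this case might look roughly like the following. It assumes each new log entry is a JSON object starting with `{` at the beginning of the line, and that continuation fragments do not; the parser name is made up:

```
[MULTILINE_PARSER]
    name          multiline-json-sketch
    type          regex
    flush_timeout 1000
    # a record starts with "{"; any line not starting with "{" is
    # treated as a continuation of the previous record
    rule      "start_state"   "/^\{/"     "cont"
    rule      "cont"          "/^[^{]/"   "cont"
```

It would then be referenced from the filter via `multiline.parser multiline-json-sketch` together with `multiline.key_content log`. Note that a split fragment which happens to begin with `{` would defeat this heuristic, so it is only a starting point.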
@vinaykrish25aws Yes, the new filter will work with JSON logs from Docker. In that case, the log content is in the log key and you specify that key in the filter:
multiline.key_content log
If the content of that key is itself nested JSON that needs to be recombined or something, then that's a more complicated use case which might need a custom parser and/or additional parsing steps.
Hi, I have a similar problem. We also have JSON logs split by Docker running on an AWS Fargate cluster. I don't think the JSON really matters here, because it is just a string. But even with the multiline filter, Fluent Bit can't concatenate such logs. I double-checked that our logs have the log key and all configurations are the same as in the documentation.
The following configuration is not working for me; it is not merging Java stack traces into a single entry. Any thoughts?
Dockerfile
parsers_multiline.conf
extra.conf
Section from Task Definition
@shijupaul Unfortunately, since this feature is new, we are still learning and understanding it ourselves, and there are very few working examples that we have... so right now everyone is figuring it out.
So actually, if you or anyone in this thread gets a working example for a use case you think is decently common, please do share it. This will benefit the community. I'm also working on slowly improving our FireLens/Fluent Bit FAQ/examples, and this data can be used for that.
Can you share what these Java stack traces look like? And I recommend that you (and everyone) test your own logs against the regular expressions that you write in the multiline parser using the Rubular website: https://rubular.com/
If the regexes don't work there with your logs... then that's the problem. That should be your first debug step.
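If you'd rather script that check, the start-of-record rule can also be sanity-checked in a few lines of Python before it goes into a parser file. This is just a sketch; the timestamp format and sample lines below are assumptions, so substitute your own logs and regex:

```python
import re

# Hypothetical "start_state" rule: a new record begins with a date like
# "2021-08-30 12:00:00.123 ERROR ..."; stack trace frames do not match
# and are appended to the previous record.
START = re.compile(r"^\d{4}-\d{2}-\d{2} ")

lines = [
    "2021-08-30 12:00:00.123 ERROR something failed",
    "java.lang.RuntimeException: boom",
    "\tat com.example.App.main(App.java:10)",
    "2021-08-30 12:00:01.456 INFO next record",
]

records = []
for line in lines:
    if START.match(line) or not records:
        records.append(line)           # line starts a new record
    else:
        records[-1] += "\n" + line     # continuation line: concatenate

print(len(records))  # 2
```

If the concatenation doesn't come out right here, it won't come out right in the multiline parser either.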
Hello 👋 I thought I'd share my attempts here as well, as they might be useful to someone. I've been trying to get this to work for a couple of days now too, but so far without any luck. I have a pretty much identical setup as @shijupaul (I don't have the grep filter). I've been playing around with these regexes quite a bit, but it doesn't seem to have any effect at all. Even if I put in a regex like /.*/ for both rules, you don't see any difference in the end result. I am getting the feeling now that the problem is elsewhere, to be honest.
To verify my hypothesis, I have been trying a couple of things:
- Putting an invalid entry in the [SERVICE] block -> the task failed to start, so the conf file is picked up
- Using /.*/ for both rules, no change in the outcome
- Changing multiline.key_content, no change either
I also ran it locally using fluent-bit -c multiline-parser.conf. I tried to mimic the Fargate config, but used a tail input instead:
[SERVICE]
Parsers_File parsers.conf
Flush 1
Grace 30
[INPUT]
name tail
path log.txt
read_from_head true
[FILTER]
name multiline
match *
multiline.key_content log
multiline.parser multiline-regex-test
[FILTER]
Name parser
Match *
Key_Name log
Parser json
Reserve_Data True
[OUTPUT]
name stdout
match *
The interesting thing is that there I do see that it has an effect. I can see how multiple log lines are combined. I have a couple of theories now:
- The multiline.key_content field is not supposed to be log, but something else. I don't have access to the raw logs yet, so it is a bit hard to verify.
- It is related to the forward input for some reason.
Any tips or tricks are appreciated! In the meantime, I'll keep debugging.
@lbunschoten @PettitWesley This is what I experienced as well...
Correct me if I'm wrong, but I believe that the issue is the source of the logs: our images only get them as forwarded messages from the emitter (https://github.com/aws/aws-for-fluent-bit/blob/mainline/fluent-bit.conf#L1-L4).
This might make it pointless to try to concat them through the use of metadata (like CRI's logtag or Docker's partial_message), because those could be filtered out or not forwarded to us in the first place.
That would match the behavior we experienced here.
Yeah, I can see how those CRI and docker metadata options might have problems with a broken JSON structure, but you would still expect the regex solution to work, right?
I am going to try to use this forward plugin locally as well now, to see if it may be related to the input
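For reference, the local repro I have in mind looks roughly like this — a sketch only; the port and parser name are placeholders mirroring the tail config I posted above:

```
[SERVICE]
    Parsers_File  parsers.conf
    Flush         1

[INPUT]
    name    forward
    listen  0.0.0.0
    port    24224

[FILTER]
    name                   multiline
    match                  *
    multiline.key_content  log
    multiline.parser       multiline-regex-test

[OUTPUT]
    name   stdout
    match  *
```

Records can then be pushed into it with anything that speaks the forward protocol, e.g. a second Fluent Bit instance with a forward output tailing a log file.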
@lbunschoten to be honest, I have no idea how the regex is supposed to work, but I agree that it should be working with regex.
As for the missing metadata, it could be related to https://github.com/fluent/fluent-bit/issues/1072 - I mean, if that's even the case and if they even use fluentbit at the core level...
At this stage it's just a lot of speculation and assumptions, and I feel like we're all just blindly testing things hoping for a different result haha
@PettitWesley according to https://github.com/aws-samples/amazon-ecs-firelens-under-the-hood/blob/mainline/generated-configs/fluent-bit/README.md the logs come from the Docker fluentd log driver, but Fargate doesn't use Docker anymore:
One of the changes we are introducing in platform version 1.4 is replacing Docker Engine with Containerd as Fargate’s container execution engine.
https://aws.amazon.com/blogs/containers/under-the-hood-fargate-data-plane/
So what's being used there?
@lbunschoten I had very similar experience and thoughts. Here is related comment in a different thread
So Fargate -> aws-for-fluent-bit -> some output: doesn't concatenate logs, even with the /.*/ pattern. But Fargate -> aws-for-fluent-bit -> another fluent-bit (with the same version and configs) -> some output: concatenates all logs.
I also was able to see the raw logs coming from Fargate 1.4 via FireLens; I just forwarded the raw logs to the output in the aws-for-fluent-bit configs. The log key was there. So from that standpoint everything looks fine, but I'm probably missing something.
I just tried to get it to work locally using the forward plugin, but I didn't manage :( If I swapped the forward plugin with the tail plugin, it was working just fine, so I think it is a combination of the forward input with the multiline_parser, but it might also just be my limited knowledge of fluent-bit
@opteemister It's very interesting (and odd) that it did work when you ran fluent-bit twice in a row. Do you still happen to have the configs, and would you mind sharing them?
My configs were pretty much the same as in the multiline parser example and in this comment.
I was just playing with different custom regex rules, and even tried without using parsers_multiline.conf at all, only the default docker and cri multiline.parsers.
@PettitWesley I tried using a custom Fluentd (instead of Fluent Bit) image, but I keep getting:
Stopped reason InternalError: unable to generate fireLens config file: unable to generate fireLens config content: unable to generate fluent config output section: unable to apply log options of container log-split to fireLens config: missing output key @type which i...
It's very annoying that the ECS console cuts off the error right where you need it the most...
But that being said, I did basically https://github.com/aws-samples/amazon-ecs-firelens-examples/tree/mainline/examples/fluentd/multiline-logs.
What's missing? What output key?
Shouldn't the output be generated from:
LogConfiguration:
LogDriver: awsfirelens
Options:
Name: forward
Host: My-Fluentd-Host
Port: "24225"
//EDIT:
Looking at https://github.com/aws/amazon-ecs-agent/blob/master/agent/taskresource/firelens/firelensconfig_unix.go#L226 made me aware that I should call the option @type instead of Name... Confusing, and I couldn't find it in the docs (https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_firelens.html), but ok.
//EDIT2: Nope, still the same error...
LogConfiguration:
LogDriver: awsfirelens
Options:
"@type": forward
Host: fluentd-host
Port: "24225"
//EDIT3:
Removing the Options from the block and hardcoding an output into the extra.conf for Fluentd didn't change anything either, same error... I don't think the parser is smart enough to reconstruct the <server> blocks needed, so I guess @type forward is not supported?
Any way to provide it something to keep it quiet and just run my supplied config?
//EDIT4: Nailed it.
LogConfiguration:
LogDriver: awsfirelens
Options:
"@type": stdout
//EDIT5:
fluent/fluentd:latest breaks FireLens because it has a magic entrypoint that creates the user etc., so mounting the socket for Fluentd fails with the ECS console error: CannotStartContainerError: ResourceInitializationError: unable to create new container: mount callback failed on /tmp/containerd-mount654170867: no users found
Right, so I can't seem to get Docker nor CRI partial tags when I debug through Fluentd... (piping everything from the socket outwards)
So I guess that's the culprit here...
Maybe @PettitWesley can peek behind the scenes at how the forwarder to FireLens handles the logs, i.e. the process that pushes them into the socket for the FireLens container to read them
I've been able to verify locally as well what @opteemister said. Having 2 fluent-bit services running in a row does indeed "fix" the concatenation of the logs. That's however not really what I'd like to run in production (if it is even possible).
Perhaps this gives @PettitWesley a clue at what the problem might be. No pressure ;)
@f0o
@PettitWesley according to https://github.com/aws-samples/amazon-ecs-firelens-under-the-hood/blob/mainline/generated-configs/fluent-bit/README.md the logs come from the Docker fluentd log driver, but Fargate doesn't use Docker anymore:
On Fargate we actually still use the docker code in a wrapper: https://github.com/aws/amazon-ecs-shim-loggers-for-containerd
A lot of comments here... someone from my team will take a look.
A reminder that this is how FireLens works: https://aws.amazon.com/blogs/containers/under-the-hood-firelens-for-amazon-ecs-tasks/
@f0o You don't have to specify the log configuration options. You can just fully specify the output in the extra config file. And then the log configuration is just:
LogConfiguration:
LogDriver: awsfirelens
I recommend this style.
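For the Fluentd case in this thread, that would mean putting something like this in the extra config file baked into the custom image (a sketch; the host and port are placeholders echoing the earlier values):

```
<match **>
  @type forward
  <server>
    host fluentd-host
    port 24225
  </server>
</match>
```

The task definition then carries only the two LogConfiguration lines above, and FireLens doesn't need to generate an output section at all.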
@PettitWesley I think something is going wrong when the effective configuration is created by AWS FireLens and applied to the sidecar container.
The FluentBit container alone works fine and can parse the log correctly, including the stack trace. To test this behaviour I used the following configuration:
DockerFile
fluent-bit.conf
parsers_multiline.conf
When I run the container and inspect the logs, I can see that the stack trace is processed correctly.
However, using the configuration mentioned in my previous post with FireLens, it doesn't work. Each line in the stack trace gets pushed as a separate entry. See the screenshot below.
Our application is a standard Spring Boot application, and the stack traces created are standard; the setup also includes collecting the logs and pushing them to ElasticSearch.
Finally, I got it working with FluentD: the stack trace is grouped correctly and a single entry is pushed to ES.
@paul5-elsevier Did you create a container image that outputs the test.log file line by line to stdout for use in the FireLens task definition? Can you share the Dockerfile for that with me, and then I'll try to repro it myself.
@PettitWesley Both @shijupaul and @paul5-elsevier are my accounts.
The test.log file was used to test fluent-bit in isolation. In our deployed environment, our app container writes logs to stdout or stderr and is configured to use ElasticSearch as the destination.
Our Sidecar container is configured to use Fluent-bit and has the following configuration
Let me know if you need any more information.
The changes I have tried are pushed to my fork (https://github.com/paul5-elsevier/amazon-ecs-firelens-examples), under the branch feature/multiline-processing.
@PettitWesley I agree with @paul5-elsevier, meaning that whatever breaks the multiline filter in aws-for-fluent-bit only happens when running in AWS. The same parser configuration running locally using the latest aws-for-fluent-bit image from ECR works as expected when using either the tail or forward inputs. In the latter case, I used a second instance of the aws-for-fluent-bit container to tail a log and output it using the forward protocol. The only time I saw anything similar running locally was when I used the head input, which seems to struggle with very long lines, even if the buffer size is increased.
I also configured aws-for-fluent-bit running in AWS to dump the incoming log data to stdout, meaning that it ends up in CloudWatch. From there, I can see that each incoming message has its contents stored in a field called log, and that split messages are simply represented as two consecutive messages where the contents of log should be concatenated, which is exactly what my multiline filter is intended to do. Given that the log data is visible in its entirety (albeit split into two messages), it seems that it is not being truncated on the way into aws-for-fluent-bit, so I am left wondering about the effective runtime configuration being used by aws-for-fluent-bit when running in AWS.
I'm not sure whether it is related or not but after this comment above:
On Fargate we actually still use the docker code in a wrapper: https://github.com/aws/amazon-ecs-shim-loggers-for-containerd
I noticed that there are hard limits on log length in the wrapper too (it uses the same sizes as Docker for splitting logs).
@opteemister Which limits are you referring to? I noticed the default max-buffer-size of 1m, but given that splitting seems to happen at the 16 KiB container runtime limit, this buffer size should already be sufficient. My best guess right now is that regular expressions in multiline parser rules are somehow mangled when user-supplied config is injected into aws-for-fluent-bit running in AWS, causing them to never match. When running locally, I have the luxury of being able to completely replace the runtime configuration.
Here are several places that I found: https://github.com/aws/amazon-ecs-shim-loggers-for-containerd/blob/master/logger/common.go#L44 https://github.com/aws/amazon-ecs-shim-loggers-for-containerd/blob/master/logger/common.go#L50
Probably line 50 could be related somehow, but I'm still not sure. If the flow is container logs -> wrapper (FireLens) -> fluent-bit, then it shouldn't be related, because all the limits apply before fluent-bit. But if it is (somehow) container logs -> fluent-bit -> wrapper, then it could be related.
But your assumption that the regular expressions in multiline parser rules are somehow mangled makes sense. I came to the same conclusion, but didn't have any hard facts for it.
I was also wondering why that wrapper can't concatenate the logs by itself: https://github.com/aws/aws-for-fluent-bit/issues/25#issuecomment-907748568
@opteemister Thanks for the pointers. Since my last message, I modified the command of the aws-for-fluent-bit container running in AWS to output all of the generated config, and I can see that the user-supplied config is simply inserted via @INCLUDE after some inputs are defined and metadata fields have been added using a record modifier filter. The strange thing is that if I replicate this configuration locally, it still works as expected. The only thing I changed is replacing the ES output with a stdout one.
Hi, Reading through this bug, is it fair to say that multi-line log parsing on Firelens + Fluentbit... just doesn't work? I'm trying it out on our services, and these are the configurations: (we forward to Sumo Logic) https://gist.github.com/jawon-benchling/1b991f01c533aaf8d9505f26e265c850
Not having multi-line log parsing is a dealbreaker for us.
Thank you!
@marksumm
From there, I can see that each incoming message has its contents stored in a field called log and split messages are simply represented as two consecutive messages where the contents of log should be concatenated, which is exactly what my multiline filter is intended to do. Given that the log data is visible in its entirety (albeit split into two message) it seems that it is not being truncated on the way into aws-for-fluent-bit, so I am left wondering about the effective runtime configuration being used by aws-for-fluent-bit when running in AWS.
As others noted, this is probably because most container runtimes that I know of (both Docker and containerd with the shim loggers that we use in Fargate) truncate logs at 16KB. Is this what you are seeing?
If so, then I'm not certain whether the new multiline feature can help re-concatenate these split logs. We are tracking this internally as a feature gap, though. And we have this old issue: https://github.com/aws/aws-for-fluent-bit/issues/25
As part of this we need to fix the fact that Fargate PV 1.4 does not set the partial message indicator.
@PettitWesley I'm afraid that you've completely missed the point and I have already mentioned the 16KiB limit. The limit causes messages to be split into multiple parts, which is not truncation. Ideally, this would be handled transparently on the AWS side. However, given that this does not currently happen, several of us have attempted to work around the issue by using the multiline parsing feature of Fluent Bit. Please note that this doesn't need to rely on any special container runtime metadata, as simple awareness of the content of a typical log message (for example, if the first line starts with a timestamp) is enough to form the basis of a regular expression parser.
I already proved that both parts of a split message are arriving at Fluent Bit while running in AWS, with each part being stored in the "log" field of a separate event. Fortunately, the Fluent Bit multiline parser is able to operate on fields as well as raw messages.
Now, the example multiline parser configuration from Fluent Bit was already copied into AWS Firelens examples, which implies that it should work. However, I doubt very much that it does. Any multiline parser configuration I successfully test locally mysteriously stops working when it is deployed to AWS.
@marksumm
Please note that this doesn't need to rely on any special container runtime metadata, as simple awareness of the content of a typical log message (for example, if the first line starts with a timestamp) is enough to form the basis of a regular expression parser.
Good point.
I wanted to note though that the partial_message flag set by the runtime option is the most fully generic approach, which will solve all use cases. And thus I am attempting to get that prioritized.
Separately, there is still this issue- that the new multiline feature doesn't work in ECS FireLens. I've added repro'ing this on my TODO list.
Apologies for the inconvenience everyone is experiencing with this; I know many have been waiting for generic multiline support for a long time, AWS will work with the upstream community to get it fully working.
We have the following configuration
We want to have multiline logs for stack traces etc.
How should I configure fluentbit?