carusology opened this issue 3 years ago
I believe the length limit comes from Docker. The standard solution is to concatenate the split lines back into a single multiline record using something like fluentd's concat plugin: https://github.com/fluent-plugins-nursery/fluent-plugin-concat
Fluent Bit, which is used in EKS Fargate, has multiline processing support in the tail plugin, but we do not currently allow customers to customize any input options. I'm not certain, though, whether its multiline support covers this log truncation use case.
I agree that this seems to be from Docker.
As for Fluent Bit configuration, can docker_mode be set to On in these auto-generated input options? That seems like it would solve the problem - at least up to the tail input's buffer size.
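For illustration, something along these lines is what I have in mind - just a sketch, since the auto-generated input isn't visible to customers, and the tag and path here are placeholders:

    [INPUT]
        # tail the container logs and reassemble Docker's split partial lines
        Name        tail
        Tag         kube.*
        Path        /var/log/containers/*.log
        Docker_Mode On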
@carusology I checked and it looks like we are setting docker_mode to On in the built-in input. So now I am confused by this issue...
I think we're using the default buffer size, which is 32KB. So that should be when you see messages truncated, not at 16 KB...
Just to double check - how did you check that it's truncating at 16KB?
I checked and it looks like we are setting docker_mode to On in the built-in input. So now I am confused by this issue...
Nuts! I was hoping it was that simple. 😞 I, too, am confused about what is causing this truncation. Without visibility into the source code, that seemed like a probable cause based on the behavior I was experiencing.
Just to double check - how did you check that it's truncating at 16KB?
Fair point. No hacked-together test here - I ran this using EKS Fargate Logging. The output I've included is literally what I got from CloudWatch / Kibana (the latter is downstream of a Kinesis output) when an unhandled exception occurred in a Spring Boot app.
Check out the examples.zip file from the report. You'll see my EKS Fargate Logging ConfigMap and the example.json file that I got from a normal JsonLayout config. I also included what a similar message looks like when it gets split in two. The source example.json message is ~20KB, but it was nonetheless split. The two split halves total 23KB, even with all the other decoration Fluent Bit applies via my EKS Fargate Logging configuration.

You could reproduce this by emitting the contents of my example.json file directly from a container with an [Output] to CloudWatch via its EKS Fargate Logging configuration.
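If anyone wants to try that, a minimal CloudWatch output along these lines should do - this is only a sketch, and the region, log group, and stream prefix are placeholders for whatever your account uses:

    [OUTPUT]
        # send everything to CloudWatch Logs; adjust region/group/prefix for your account
        Name              cloudwatch_logs
        Match             *
        region            us-east-1
        log_group_name    fluent-bit-example
        log_stream_prefix fargate-
        auto_create_group true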
EKS Fargate uses containerd instead of Docker. Setting docker_mode to On won't be helpful, as Fluent Bit's docker_mode expects the logs to be in JSON, whereas containerd writes raw log lines.
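If that's the case, the CRI-style logs would presumably need Fluent Bit's built-in multiline handling instead of docker_mode - roughly like the sketch below, assuming a Fluent Bit version new enough to support multiline.parser (this is not the actual EKS-generated input; tag and path are placeholders):

    [INPUT]
        # built-in multiline parser for the containerd/CRI log format
        Name             tail
        Tag              kube.*
        Path             /var/log/containers/*.log
        multiline.parser cri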
Does this only happen on EKS Fargate nodes? I tested a normal node and found the log is split into many lines as well.
Tell us about your request
EKS Fargate Logging currently appears to support a maximum length of 16KB per logged line. I request that this maximum length be increased, or that messages over the limit be stitched back together rather than split.
Which service(s) is this request for?
This is for EKS Fargate Logging - specifically, the behavior described in this document.
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
I have a Spring Boot service written in Kotlin, and I wanted to log its output in a "json lines" format (one JSON object per line) by leveraging the common JsonLayout Log4j configuration. When exceptions are thrown and logged within the service, the stack trace is usually large enough that the resulting block of JSON can be over 16KB. The EKS Fargate Logging worker splits this message in two, leaving two strings that cannot be parsed as JSON and preventing the message from being filtered on downstream in a log viewing tool such as CloudWatch or Kibana. It's hard to find these logs within logging tools because you can't filter on their content via well-formed fields, since the JSON didn't get parsed. Even if you do find the message, you have to manually stitch the halves back together to find out what happened.

Are you currently working around this issue?
I swapped our Log4j configuration from JsonLayout to JsonTemplateLayout. The latter has a configurable maxStringLength attribute and can "stringify" stack traces so they get emitted as a single string. When I set maxStringLength to 10000 and configure stack traces with stringified: true, the stack traces are truncated when they are large enough to trigger the splitting behavior. Since none of the other fields seem to total more than ~6000 characters combined, the splitting of large messages has stopped.

Additional context
According to AWS documentation, EKS Fargate Logging uses Fluent Bit and generates its own [Input] blocks (Source (emphasis mine)). I believe these messages are running into the Docker daemon's internal, hardcoded 16KB limit for a logged message before it flushes. The Docker maintainers expect log parsing tools, such as Fluent Bit, to stitch these piecemeal messages back together again. Fluent Bit actually has an option to do this within the Input blocks using docker_mode (Source). So I'm guessing the Input blocks generated by EKS Fargate Logging do not have docker_mode enabled. Even assuming it is enabled, we'll eventually run into limits around the size of Buffer_Chunk_Size (32KB by default) as well. I have not observed our service generating logs over ~20KB, though, so that limit would at least be sufficient for us.
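To make that guess concrete, the kind of input I would expect to resolve this looks roughly like the following - purely a sketch on my part, since the generated configuration isn't visible to customers, and the tag, path, and buffer sizes are illustrative:

    [INPUT]
        # reassemble Docker's 16KB partial lines; raise buffers to fit our ~20KB messages
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        Docker_Mode       On
        Buffer_Chunk_Size 32KB
        Buffer_Max_Size   64KB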
Attachments
I've attached three things in examples.zip:
- aws-logging.yaml - the file that maps to the ConfigMap used to parse EKS Fargate logs.
- example.json - a JSON log file that the service emitted which is over 16KB.
- example-split-first-half.json and example-split-second-half.json - the result of the ConfigMap being applied to a JSON log over 16KB, which gets broken in two.

examples.zip