Open PettitWesley opened 3 years ago
I suspect this will require changes in the shim loggers: https://github.com/aws/amazon-ecs-shim-loggers-for-containerd
I should note that my original examples were not complete. This is what real split logs should look like; note the multiple partial message fields:
{"source"=>"stdout", "log"=>"{"payload": "0123456789......01234567890abcdef", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"1", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig"}]
{"container_name"=>"/hopeful_taussig", "source"=>"stderr", "log"=>"{"payload": "..01234567890abcdef", "partial_message"=>"true", "partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"1", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2"}]
{"partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stdout", "log"=>"ghijklmnopqrstuvwxyz..0123456789", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"2"}]
{"log"=>"0...012", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"3", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stdout"}]
{"source"=>"stderr", "log"=>"ghijklmnopqrstuvwxyz..0123456789", "partial_message"=>"true", "partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"2", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig"}]
{"partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"3", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stderr", "log"=>"0...012", "partial_message"=>"true"}]
{"container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stdout", "log"=>"34567890..abcdefghijklmnopqrstuv", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"4", "partial_last"=>"false"}]
{"partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stderr", "log"=>"34567890..abcdefghijklmnopqrstuv", "partial_message"=>"true", "partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"4"}]
{"container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stdout", "log"=>"wxyz....01234567890.", "event_id": 692, "counter": 0, "global_counter": 0, "time": "2022-01-30 23:36:32.001007"}", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"5", "partial_last"=>"true"}]
{"log"=>"wxyz....01234567890.", "event_id": 692, "counter": 0, "global_counter": 0, "time": "2022-01-30 23:36:32.001007"}", "partial_message"=>"true", "partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"5", "partial_last"=>"true", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stderr"}]
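For illustration, records shaped like the ones above could be stitched back together roughly as follows. This is a minimal Python sketch, not the code of the Fluentd concat plugin or of the shim loggers. `concat_partials` is a hypothetical helper, and it assumes every part of a message arrives before the part marked `partial_last`.

```python
from collections import defaultdict


def concat_partials(records):
    """Yield whole records, joining parts that share a partial_id.

    Assumption: all parts of a message arrive before the part whose
    partial_last is "true"; parts of different messages may interleave.
    """
    pending = defaultdict(list)  # partial_id -> parts seen so far
    for record in records:
        if record.get("partial_message") != "true":
            # Not a split record: pass it through unchanged.
            yield record
            continue
        pid = record["partial_id"]
        pending[pid].append(record)
        if record.get("partial_last") == "true":
            # Order the parts by ordinal, then join their payloads.
            parts = sorted(pending.pop(pid), key=lambda r: int(r["partial_ordinal"]))
            merged = {k: v for k, v in parts[-1].items() if not k.startswith("partial_")}
            merged["log"] = "".join(p["log"] for p in parts)
            yield merged
```

Grouping by `partial_id` rather than by arrival order is what lets the interleaved stdout and stderr parts be separated again.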
What will it take to get this prioritised? Is there any ETA? It seems that related discussions have been rumbling on since at least November 2020: https://github.com/aws/aws-for-fluent-bit/issues/100
@marksumm I cannot provide an ETA or launch date. However, this hasn't been forgotten, and I am working on it.
@marksumm implementation complete: https://github.com/aws/amazon-ecs-shim-loggers-for-containerd/pull/24
Great news! Thanks for your work on this.
@PettitWesley - What is this issue pending on now?
Can you please close this issue as it's been implemented?
Tell us about your request
In all container runtimes that I am aware of, stdout/stderr messages from containers are split after a certain limit. In Docker, CRI, and containerd, IIRC the limit is always 16 KB. This means that if you emit messages larger than 16 KB from your container, they will be split into multiple messages.
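To make the splitting behavior concrete, here is a rough Python sketch of what a runtime effectively does to an oversized message. It is an illustration only, not taken from any runtime's source: `split_message` is a hypothetical function, the 16 KB limit is the value recalled above, the field names mirror the Docker-style partial metadata, and the random hex `partial_id` is an assumed ID scheme.

```python
import uuid

CHUNK_LIMIT = 16 * 1024  # 16 KB, the limit recalled above (assumption)


def split_message(message: bytes, source: str = "stdout", limit: int = CHUNK_LIMIT) -> list[dict]:
    """Split one log message into records of at most `limit` bytes."""
    if len(message) <= limit:
        # Small messages are emitted as a single record with no partial fields.
        return [{"source": source, "log": message}]
    partial_id = uuid.uuid4().hex  # hypothetical ID scheme, shared by all parts
    chunks = [message[i:i + limit] for i in range(0, len(message), limit)]
    return [
        {
            "source": source,
            "log": chunk,
            "partial_message": "true",
            "partial_id": partial_id,
            "partial_ordinal": str(n),  # 1-based position of this part
            "partial_last": "true" if n == len(chunks) else "false",
        }
        for n, chunk in enumerate(chunks, start=1)
    ]
```

Without the `partial_*` fields, a downstream consumer would see only a series of unrelated records and could not tell that they were once a single message.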
You can then use a tool like Fluentd to concatenate those records back into one (Fluent Bit support is something I am working on: https://github.com/aws/aws-for-fluent-bit/issues/25). However, this relies on the runtime setting a flag that tells you the series of messages you are receiving was originally one message.
The Fluentd Plugin Concat can join these back together: https://github.com/fluent-plugins-nursery/fluent-plugin-concat
To understand how this works, let's look at the example from the Fluentd plugin unit test: https://github.com/fluent-plugins-nursery/fluent-plugin-concat/blob/master/test/plugin/test_filter_concat.rb#L551
Which service(s) is this request for? ECS Fargate
Are you currently working around this issue?
I am not aware of any way to work around this.