aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[ECS Fargate] [request]: Partial_message field not included in platform version 1.4.0 #1550

Open PettitWesley opened 3 years ago

PettitWesley commented 3 years ago

Tell us about your request

In all container runtimes that I am aware of, stdout/stderr messages from containers are split after a certain limit. In Docker, CRI, and containerd, IIRC the limit is always 16 KB. This means that if you emit messages larger than 16 KB from your container, they will be split into multiple messages.
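
To make the splitting concrete, here is a rough Ruby sketch of what a runtime effectively does to an oversized line. The `split_message` helper and the `CHUNK_LIMIT` constant are made up for illustration and assume the 16 KB figure mentioned above; real runtimes split a byte stream internally and do not expose a function like this.

```ruby
CHUNK_LIMIT = 16 * 1024 # assumed limit in bytes, taken from the figure above

# Hypothetical helper: cuts one message into pieces of at most `limit` bytes.
def split_message(message, limit = CHUNK_LIMIT)
  bytes = message.b # binary copy, so we slice by bytes rather than characters
  (0...bytes.bytesize).step(limit).map { |offset| bytes.byteslice(offset, limit) }
end

chunks = split_message("x" * 40_000)
puts chunks.map(&:bytesize).inspect # prints [16384, 16384, 7232]
```

A 40,000 byte message therefore arrives as three separate log events, and nothing in the payload itself says they belong together.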

You can then use a tool like Fluentd (Fluent Bit support is something I am working on: https://github.com/aws/aws-for-fluent-bit/issues/25) to concatenate those records back into one. However, this relies on the runtime setting a flag that tells you the series of messages you are receiving was originally one message.

The Fluentd Plugin Concat can join these back together: https://github.com/fluent-plugins-nursery/fluent-plugin-concat

To understand how this works, let's look at the example from the Fluentd plugin unit test: https://github.com/fluent-plugins-nursery/fluent-plugin-concat/blob/master/test/plugin/test_filter_concat.rb#L551

  sub_test_case "partial_key" do
    test "filter with docker style events" do
      config = <<-CONFIG
        key message
        partial_key partial_message
        partial_value true
      CONFIG
      messages = [
        { "container_id" => "1", "message" => "start", "partial_message" => "true" },
        { "container_id" => "1", "message" => " message 1", "partial_message" => "true" },
        { "container_id" => "1", "message" => " message 2", "partial_message" => "true" },
        { "container_id" => "1", "message" => "end", "partial_message" => "false" },
        { "container_id" => "1", "message" => "start", "partial_message" => "true" },
        { "container_id" => "1", "message" => " message 3", "partial_message" => "true" },
        { "container_id" => "1", "message" => " message 4", "partial_message" => "true" },
        { "container_id" => "1", "message" => "end", "partial_message" => "false" },
      ]
      filtered = filter(config, messages, wait: 3)
      expected = [
        { "container_id" => "1", "message" => "start\n message 1\n message 2\nend" },
        { "container_id" => "1", "message" => "start\n message 3\n message 4\nend" },
      ]
      assert_equal(expected, filtered)
    end
  end
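
The behaviour that test exercises can be approximated in a few lines of plain Ruby. This is only a sketch of the idea, not the plugin's actual implementation: `concat_partial` is a hypothetical helper, and it assumes that records for one container arrive in order and that any value other than "true" in the partial field ends the message.

```ruby
# Hypothetical helper, not part of fluent-plugin-concat: buffers chunks per
# container and emits one joined record when a non-partial chunk arrives.
def concat_partial(records, key: "message", partial_key: "partial_message", separator: "\n")
  buffers = Hash.new { |hash, id| hash[id] = [] }
  output = []
  records.each do |record|
    id = record["container_id"]
    buffers[id] << record[key]
    next if record[partial_key] == "true"

    # Final chunk: emit the joined message without the partial flag.
    merged = record.reject { |k, _| k == partial_key }
    merged[key] = buffers.delete(id).join(separator)
    output << merged
  end
  output
end

messages = [
  { "container_id" => "1", "message" => "start", "partial_message" => "true" },
  { "container_id" => "1", "message" => " message 1", "partial_message" => "true" },
  { "container_id" => "1", "message" => "end", "partial_message" => "false" },
]
p concat_partial(messages)
```

Without the `partial_message` field there is nothing for the `next if` check to key on, which is why the field has to come from the runtime.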

Which service(s) is this request for?

ECS Fargate

Are you currently working around this issue?

I am not aware of any way to work around this.

PettitWesley commented 3 years ago

I suspect this will require changes in the shim loggers: https://github.com/aws/amazon-ecs-shim-loggers-for-containerd

PettitWesley commented 2 years ago

I should note that my original examples are not complete. This is what real split logs should look like; note the multiple partial message fields:

{"source"=>"stdout", "log"=>"{"payload": "0123456789......01234567890abcdef", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"1", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig"}]
{"container_name"=>"/hopeful_taussig", "source"=>"stderr", "log"=>"{"payload": "..01234567890abcdef", "partial_message"=>"true", "partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"1", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2"}]
{"partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stdout", "log"=>"ghijklmnopqrstuvwxyz..0123456789", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"2"}]
{"log"=>"0...012", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"3", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stdout"}]
{"source"=>"stderr", "log"=>"ghijklmnopqrstuvwxyz..0123456789", "partial_message"=>"true", "partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"2", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig"}]
{"partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"3", "partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stderr", "log"=>"0...012", "partial_message"=>"true"}]
{"container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stdout", "log"=>"34567890..abcdefghijklmnopqrstuv", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"4", "partial_last"=>"false"}]
{"partial_last"=>"false", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stderr", "log"=>"34567890..abcdefghijklmnopqrstuv", "partial_message"=>"true", "partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"4"}]
{"container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stdout", "log"=>"wxyz....01234567890.", "event_id": 692, "counter": 0, "global_counter": 0, "time": "2022-01-30 23:36:32.001007"}", "partial_message"=>"true", "partial_id"=>"dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal"=>"5", "partial_last"=>"true"}]
{"log"=>"wxyz....01234567890.", "event_id": 692, "counter": 0, "global_counter": 0, "time": "2022-01-30 23:36:32.001007"}", "partial_message"=>"true", "partial_id"=>"ecccce95711776e6a06d631af8e9227686446814eba7a87cb59b36bbaaad8b58", "partial_ordinal"=>"5", "partial_last"=>"true", "container_id"=>"a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name"=>"/hopeful_taussig", "source"=>"stderr"}]
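
For illustration, records like these could be stitched back together roughly as follows. This is a sketch under assumptions, not how Fluentd or Fluent Bit implement it: `reassemble` and `PARTIAL_FIELDS` are hypothetical names, the sample records are shortened and made up, and the code assumes the chunk marked `partial_last` is the last one to arrive for its `partial_id`.

```ruby
# Hypothetical helper for illustration: groups chunks by partial_id, then
# joins them in partial_ordinal order once the partial_last chunk arrives.
PARTIAL_FIELDS = %w[partial_message partial_id partial_ordinal partial_last].freeze

def reassemble(records)
  pending = Hash.new { |hash, id| hash[id] = [] }
  output = []
  records.each do |record|
    unless record["partial_message"] == "true"
      output << record # not a split message, pass through unchanged
      next
    end

    id = record["partial_id"]
    pending[id] << record
    next unless record["partial_last"] == "true"

    chunks = pending.delete(id).sort_by { |chunk| Integer(chunk["partial_ordinal"]) }
    merged = record.reject { |k, _| PARTIAL_FIELDS.include?(k) }
    merged["log"] = chunks.map { |chunk| chunk["log"] }.join
    output << merged
  end
  output
end

# Shortened, made-up records with interleaved stdout and stderr chunks.
records = [
  { "source" => "stdout", "log" => "aaa", "partial_message" => "true", "partial_id" => "x", "partial_ordinal" => "1", "partial_last" => "false" },
  { "source" => "stderr", "log" => "111", "partial_message" => "true", "partial_id" => "y", "partial_ordinal" => "1", "partial_last" => "false" },
  { "source" => "stdout", "log" => "bbb", "partial_message" => "true", "partial_id" => "x", "partial_ordinal" => "2", "partial_last" => "true" },
  { "source" => "stderr", "log" => "222", "partial_message" => "true", "partial_id" => "y", "partial_ordinal" => "2", "partial_last" => "true" },
]
reassembled = reassemble(records)
reassembled.each { |r| puts "#{r["source"]}: #{r["log"]}" }
```

Keying on `partial_id` rather than only on the container is what keeps the interleaved stdout and stderr streams from being merged into each other.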

marksumm commented 2 years ago

What will it take to get this prioritised? Is there any ETA? It seems that related discussions have been rumbling on since at least November 2020: https://github.com/aws/aws-for-fluent-bit/issues/100

PettitWesley commented 2 years ago

@marksumm I cannot provide an ETA or launch date; however, this hasn't been forgotten and I am working on it.

PettitWesley commented 2 years ago

@marksumm implementation complete: https://github.com/aws/amazon-ecs-shim-loggers-for-containerd/pull/24

marksumm commented 2 years ago

> @marksumm implementation complete: aws/amazon-ecs-shim-loggers-for-containerd#24

Great news! Thanks for your work on this.

rcollette commented 12 months ago

@PettitWesley - What is this issue pending on now?

remiflament commented 6 months ago

Can you please close this issue as it's been implemented?