Open June3Ningxu opened 2 weeks ago
Fluent Bit version: amazon/aws-for-fluent-bit:debug-2.32.2.20240516
Deployment mode: AWS ECS Fargate sidecar
Programming language: N/A
Log format: JSON
After running for some time, Fluent Bit exits with the error "[engine] caught signal (SIGSEGV)", so we switched to the debug image to capture the dump log.

Configuration (excerpt from the task definition):

"logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
        "remove_keys": "ecs_task_arn",
        "label_keys": "$container_name,$ecs_task_definition,$source,$ecs_cluster,$container_id",
        "port": "******",
        "Host": "***********",
        "line_format": "key_value",
        "Name": "loki",
        "labels": "job=firelens"
    }
}
},
{
    "name": "log_router",
    "image": "public.ecr.aws/aws-observability/aws-for-fluent-bit:debug-2.32.2.20240516",
    "cpu": 0,
    "memoryReservation": 500,
    "portMappings": [],
    "essential": false,
    "environment": [
        {
            "name": "S3_BUCKET",
            "value": "****************"
        },
        {
            "name": "FLB_LOG_LEVEL",
            "value": "debug"
        }
    ],
    "mountPoints": [],
    "volumesFrom": [],
    "user": "0",
    "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": "/ecs/ecs-aws-firelens-sidecar-container",
            "awslogs-create-group": "true",
            "awslogs-region": "cn-north-1",
            "awslogs-stream-prefix": "firelens"
        }
    },
    "systemControls": [],
    "firelensConfiguration": {
        "type": "fluentbit",
        "options": {
            "enable-ecs-log-metadata": "true"
        }
    }
}
Fluent Bit dump log (attachments):
issue_2024-06-21T000043_host-ip-192-168-65-99.cn-north-1.compute.internal_7619239412270.all.zip
issue_2024-06-20T190044_host-ip-192-168-194-12.cn-north-1.compute.internal_101311429628425.all.zip
issue_2024-06-20T190044_host-ip-192-168-194-12.cn-north-1.compute.internal_101311429628425.core.zip
Fluent Bit log output (debug log):

2024-06-21T08:00:41.393+08:00 | [2024/06/21 00:00:41] [debug] [input chunk] update output instances with new chunk size diff=598, records=1, input=forward.1
2024-06-21T08:00:42.374+08:00 | [2024/06/21 00:00:42] [debug] [task] created task=0x7f139e244b60 id=0 OK
2024-06-21T08:00:42.374+08:00 | [2024/06/21 00:00:42] [debug] [upstream] KA connection #53 to 192.168.133.167:3100 has been assigned (recycled)
2024-06-21T08:00:42.374+08:00 | [2024/06/21 00:00:42] [debug] [http_client] not using http_proxy for header
2024-06-21T08:00:42.378+08:00 | [2024/06/21 00:00:42] [debug] [output:loki:loki.1] 192.168.133.167:3100, HTTP status=204
2024-06-21T08:00:42.378+08:00 | [2024/06/21 00:00:42] [debug] [upstream] KA connection #53 to 192.168.133.167:3100 is now available
2024-06-21T08:00:42.378+08:00 | [2024/06/21 00:00:42] [debug] [out flush] cb_destroy coro_id=9393
2024-06-21T08:00:42.378+08:00 | [2024/06/21 00:00:42] [debug] [task] destroy task=0x7f139e244b60 (task_id=0)
2024-06-21T08:00:42.386+08:00 | [2024/06/21 00:00:42] [debug] [input chunk] update output instances with new chunk size diff=558, records=1, input=forward.1
2024-06-21T08:00:42.386+08:00 | [2024/06/21 00:00:42] [debug] [input chunk] update output instances with new chunk size diff=440, records=1, input=forward.1
2024-06-21T08:00:42.386+08:00 | [2024/06/21 00:00:42] [debug] [input chunk] update output instances with new chunk size diff=426, records=1, input=forward.1
2024-06-21T08:00:42.386+08:00 | [2024/06/21 00:00:42] [debug] [input chunk] update output instances with new chunk size diff=405, records=1, input=forward.1
2024-06-21T08:00:42.386+08:00 | [2024/06/21 00:00:42] [debug] [input chunk] update output instances with new chunk size diff=333, records=1, input=forward.1
2024-06-21T08:00:42.386+08:00 | [2024/06/21 00:00:42] [debug] [input chunk] update output instances with new chunk size diff=596, records=1, input=forward.1
2024-06-21T08:00:42.386+08:00 | [2024/06/21 00:00:42] [debug] [input chunk] update output instances with new chunk size diff=414, records=1, input=forward.1
2024-06-21T08:00:42.386+08:00 | [2024/06/21 00:00:42] [debug] [input chunk] update output instances with new chunk size diff=494, records=1, input=forward.1
2024-06-21T08:00:42.386+08:00 | [2024/06/21 00:00:42] [debug] [input chunk] update output instances with new chunk size diff=1837, records=1, input=forward.1
2024-06-21T08:00:42.825+08:00 | [2024/06/21 00:00:42] [debug] [input chunk] update output instances with new chunk size diff=483, records=1, input=forward.1
2024-06-21T08:00:42.825+08:00 | [2024/06/21 00:00:42] [debug] [input chunk] update output instances with new chunk size diff=598, records=1, input=forward.1
2024-06-21T08:00:43.295+08:00 | [2024/06/21 00:00:43] [debug] [input chunk] update output instances with new chunk size diff=959, records=1, input=forward.1
2024-06-21T08:00:43.374+08:00 | [2024/06/21 00:00:43] [debug] [task] created task=0x7f139e244b60 id=0 OK
2024-06-21T08:00:43.374+08:00 | [2024/06/21 00:00:43] [engine] caught signal (SIGSEGV)
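For anyone triaging a longer export of this log, here is a minimal sketch for pulling out the entries just before the crash and listing the `[task]` events. It assumes the log has been saved with one `timestamp | message` entry per line, matching the layout of the excerpt above; the function names `lines_before_crash` and `task_events` are hypothetical helpers, not part of Fluent Bit or any AWS tooling.

```python
import re

# Assumed layout (from the excerpt above): "<CloudWatch timestamp> | <Fluent Bit message>"
ENTRY = re.compile(r"(\d{4}-\d{2}-\d{2}T[\d:.]+[+-]\d{2}:\d{2}) \| (.*)")
# Matches the "[task] created task=0x..." and "[task] destroy task=0x..." messages
TASK = re.compile(r"\[task\] (created|destroy) task=(0x[0-9a-f]+)")


def lines_before_crash(log_lines, context=3):
    """Return the SIGSEGV entry plus up to `context` entries before it, or [] if none is found."""
    entries = [m.groups() for m in map(ENTRY.match, log_lines) if m]
    for i, (_, msg) in enumerate(entries):
        if "caught signal (SIGSEGV)" in msg:
            return entries[max(0, i - context): i + 1]
    return []


def task_events(log_lines):
    """List (action, address) pairs for every [task] created/destroy message."""
    return [m.groups() for m in map(TASK.search, log_lines) if m]


# Sample lines copied from the debug log in this issue
sample = [
    "2024-06-21T08:00:42.378+08:00 | [2024/06/21 00:00:42] [debug] [task] destroy task=0x7f139e244b60 (task_id=0)",
    "2024-06-21T08:00:43.295+08:00 | [2024/06/21 00:00:43] [debug] [input chunk] update output instances with new chunk size diff=959, records=1, input=forward.1",
    "2024-06-21T08:00:43.374+08:00 | [2024/06/21 00:00:43] [debug] [task] created task=0x7f139e244b60 id=0 OK",
    "2024-06-21T08:00:43.374+08:00 | [2024/06/21 00:00:43] [engine] caught signal (SIGSEGV)",
]

for ts, msg in lines_before_crash(sample, context=2):
    print(ts, msg)
print(task_events(sample))
```

In the excerpt, the SIGSEGV entry comes directly after a `[task] created` message. The sketch only surfaces such lines; it does not diagnose the cause, which would need the attached core dump.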
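Fluent Bit version info: debug-2.32.2.20240516
Cluster details: AWS ECS Fargate, with Fluent Bit running as a sidecar
Application details: 300 log lines per minute; log line size: 1 KB
Steps to reproduce: run the ECS Fargate task for some time; Fluent Bit then crashes.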