fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0

Fluent Bit 400 Bad Request when integrating with OpenSearch on EKS cluster #8753

Closed CSi-CJ closed 2 months ago

CSi-CJ commented 5 months ago

Bug Report

**Describe the bug**
I deployed Fluent Bit on an EKS cluster to ship logs to AWS OpenSearch and view them in Kibana across regions, but Fluent Bit always reports the error "400 Bad request. Your browser sent an invalid request."

**To Reproduce**

[2024/04/23 09:22:49] [ warn] [engine] failed to flush chunk '1-1713786418.224288613.flb', retry in 16 seconds: task_id=908, input=storage_backlog.2 > output=opensearch.0 (out_id=0)
[2024/04/23 09:22:49] [ warn] [engine] failed to flush chunk '1-1713785331.83017891.flb', retry in 25 seconds: task_id=841, input=storage_backlog.2 > output=opensearch.0 (out_id=0)
[2024/04/23 09:22:49] [ warn] [engine] chunk '1-1713780668.223932764.flb' cannot be retried: task_id=25, input=storage_backlog.2 > output=opensearch.0
[2024/04/23 09:22:49] [error] [output:opensearch:opensearch.0] HTTP status=400 URI=/_bulk, response:

400 Bad request

Your browser sent an invalid request.

[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87bed30 75 in the next 5 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87bf1b8 642 in the next 4 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87bb400 35 in the next 7 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87c6c60 291 in the next 11 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87c73b8 962 in the next 25 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87c3da8 260 in the next 6 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87ca400 211 in the next 5 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87bfe10 298 in the next 5 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87cb328 967 in the next 20 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87d22b8 839 in the next 8 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87c0040 121 in the next 12 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87ca630 1058 in the next 9 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87bdea8 581 in the next 10 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87d0af8 1059 in the next 7 seconds

- Steps to reproduce the problem:
  - Prepare two AWS accounts (optional).
  - Deploy Fluent Bit using the configuration listed under "Your Environment".

**Expected behavior**
The collected logs should be flushed without errors by the Fluent Bit pod, and the resulting log entries should be visible in Kibana.

**Screenshots**
![image](https://github.com/fluent/fluent-bit/assets/57539387/85b816e8-eb6e-4a10-93b0-f64d7ddf7531)
![image](https://github.com/fluent/fluent-bit/assets/57539387/42610c53-99c6-4bf1-991b-18e793c1b5b8)

**Your Environment**
* Version used: public.ecr.aws/aws-observability/aws-for-fluent-bit:stable
* Configuration: 

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
  labels:
    k8s-app: fluent-bit
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush                     5
        Grace                     30
        Log_Level                 info
        Daemon                    off
        Parsers_File              parsers.conf
        HTTP_Server               ${HTTP_SERVER}
        HTTP_Listen               0.0.0.0
        HTTP_Port                 ${HTTP_PORT}
        storage.path              /var/fluent-bit/state/flb-storage/
        storage.sync              normal
        storage.checksum          off
        storage.backlog.mem_limit 5M

    @INCLUDE application-log.conf

  application-log.conf: |
    [INPUT]
        Name                tail
        Tag                 application.
        Exclude_Path        /var/log/containers/cloudwatch-agent, /var/log/containers/fluent-bit, /var/log/containers/aws-node, /var/log/containers/kube-proxy
        Path                /var/log/containers/.log
        multiline.parser    docker, cri
        DB                  /var/fluent-bit/state/flb_container.db
        Mem_Buf_Limit       50MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Rotate_Wait         30
        storage.type        filesystem
        Read_from_Head      ${READ_FROM_HEAD}

    [INPUT]
        Name                tail
        Tag                 application.*
        Path                /var/log/containers/fluent-bit*
        multiline.parser    docker, cri
        DB                  /var/fluent-bit/state/flb_log.db
        Mem_Buf_Limit       5MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Read_from_Head      ${READ_FROM_HEAD}

    [FILTER]
        Name                kubernetes
        Match               application.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_Tag_Prefix     application.var.log.containers.
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off
        Labels              Off
        Annotations         Off
        Use_Kubelet         On
        Kubelet_Port        10250
        Buffer_Size         0

    [OUTPUT]
        Name                opensearch
        Match               application.*
        Host                vpc-xxxxx.us-west-2.es.amazonaws.com
        Port                443
        Logstash_Format     On
        Logstash_Prefix     kube
        Logstash_DateFormat %Y.%m.%d.%H
        Retry_Limit         False
        tls                 On
        AWS_Auth            On
        AWS_Region          ${AWS_REGION}
        Suppress_Type_Name  On
        Type                _doc
        Trace_Error         On
        Replace_Dots        On

  parsers.conf: |
    [PARSER]
        Name                syslog
        Format              regex
        Regex               ^(?

    [PARSER]
        Name                container_firstline
        Format              regex
        Regex               (?<log>(?<="log":")\S(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ

    [PARSER]
        Name                cwagent_firstline
        Format              regex
        Regex               (?<log>(?<="log":")\d{4}[\/-]\d{1,2}[\/-]\d{1,2}[ T]\d{2}:\d{2}:\d{2}(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ

* Environment name and version (e.g. Kubernetes? What version?): kubernetes:1.25 
* Server type and version: eks.17
* Operating System and version: AMI AL2_x86_64
* Filters and plugins:
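
For reference, here is a rough Python sketch of how the `tail` input's `Tag application.*` and the kubernetes filter's `Kube_Tag_Prefix application.var.log.containers.` from the configuration above are meant to fit together. It only approximates the documented tag expansion (the `*` is replaced by the file's absolute path with slashes turned into dots); the function names and the log file name are hypothetical and are not part of Fluent Bit.

```python
def expand_tail_tag(tag_pattern: str, file_path: str) -> str:
    """Approximate tail tag expansion: '*' becomes the path with '/' -> '.'."""
    dotted = file_path.lstrip("/").replace("/", ".")
    return tag_pattern.replace("*", dotted, 1)


def strip_kube_tag_prefix(tag: str, prefix: str) -> str:
    """Return what is left of the tag once Kube_Tag_Prefix is removed."""
    if not tag.startswith(prefix):
        raise ValueError(f"tag {tag!r} does not start with prefix {prefix!r}")
    return tag[len(prefix):]


# Hypothetical container log file, used for illustration only.
log_file = "/var/log/containers/myapp-abc12_default_myapp-0123.log"

tag = expand_tail_tag("application.*", log_file)
remainder = strip_kube_tag_prefix(tag, "application.var.log.containers.")

print(tag)
print(remainder)
```

If the prefix did not match the expanded tag, the filter would have nothing to derive pod metadata from, so the two values need to agree as they do here.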

**Additional context**
Hope someone can help me integrate EKS and OpenSearch correctly
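
In case it helps with checking the OpenSearch side, below is a minimal sketch of the index name I expect the `[OUTPUT]` settings to produce (`Logstash_Format On`, `Logstash_Prefix kube`, `Logstash_DateFormat %Y.%m.%d.%H`). It assumes the prefix and the formatted date are joined with `-` and that the record timestamp is taken in UTC; the helper `logstash_index` is a hypothetical name for illustration, not a Fluent Bit API.

```python
from datetime import datetime, timezone


def logstash_index(prefix: str, date_format: str, ts: datetime,
                   separator: str = "-") -> str:
    """Build an index name as <prefix><separator><formatted timestamp>."""
    return f"{prefix}{separator}{ts.strftime(date_format)}"


# Timestamp taken from the error log in this report, assumed to be UTC.
ts = datetime(2024, 4, 23, 9, 22, 49, tzinfo=timezone.utc)

index = logstash_index("kube", "%Y.%m.%d.%H", ts)
print(index)  # kube-2024.04.23.09
```

With an hourly date format like this one, a new index is targeted every hour, so any index patterns or permissions on the OpenSearch domain would need to cover `kube-*`.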

github-actions[bot] commented 2 months ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] commented 2 months ago

This issue was closed because it has been stalled for 5 days with no activity.