Closed: Rmaabari closed this issue 2 weeks ago.
@Rmaabari this repo just hosts Helm charts and the Fluent Bit Aggregator chart is a convenient way to run Fluent Bit as a StatefulSet. Your actual configuration is input into the chart and isn't part of the chart logic.
If you're having trouble with Fluent Bit, have turned on debug logs, and think there is an issue, your best course of action is to look at the existing issues and, if none match, open a new issue at fluent/fluent-bit.
Hi @stevehipwell, thanks for your reply. I am running Fluent Bit agents via the original fluent-bit (DaemonSet) Helm chart, and your aggregator Helm chart as a StatefulSet.
Regarding the logs, they seem like nothing unusual; attaching some of them:
```
[2023/09/17 12:07:58] [debug] [out flush] cb_destroy coro_id=7942
[2023/09/17 12:07:58] [debug] [retry] re-using retry for task_id=1959 attempts=19
[2023/09/17 12:07:58] [ warn] [engine] failed to flush chunk '1-1694939682.183824748.flb', retry in 1069 seconds: task_id=1959, input=forward.0 > output=es.1 (out_id=1)
[2023/09/17 12:07:59] [debug] [output:es:es.1] task_id=1354 assigned to thread #1
[2023/09/17 12:07:59] [debug] [output:es:es.1] task_id=1642 assigned to thread #0
[2023/09/17 12:07:59] [debug] [output:es:es.1] task_id=685 assigned to thread #1
[2023/09/17 12:07:59] [debug] [upstream] KA connection #96 to elastic-elasticsearch:9200 has been assigned (recycled)
[2023/09/17 12:07:59] [debug] [upstream] KA connection #91 to elastic-elasticsearch:9200 has been assigned (recycled)
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0    (repeated 15 times)
[2023/09/17 12:07:59] [debug] [http_client] not using http_proxy for header
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0    (repeated 12 times)
[2023/09/17 12:07:59] [debug] [http_client] not using http_proxy for header
[2023/09/17 12:07:59] [debug] [upstream] KA connection #89 to elastic-elasticsearch:9200 has been assigned (recycled)
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0    (repeated 22 times)
[2023/09/17 12:07:59] [debug] [http_client] not using http_proxy for header
```
@Rmaabari the interesting logs would be from when the output to ES failed. But unless it's caused by a defect in the chart you're going to need to open an issue on the Fluent Bit repo to figure out if this is a bug or a configuration issue.
If you provide me with the chart values you used and the steps you take to resolve a failure, I can take a look. Also, do you lose logs as part of this?
Have you checked the logs on the ES side to see if there is an issue there? If ES is erroring and FB has no persistence a restart fixing the issue would indicate that there is an issue with the configuration and/or log content.
@stevehipwell thanks again for the response! I will gladly supply you with the Helm chart values.
Values:

```yaml
service:
  type: NodePort
  annotations: {}
  httpPort: 2020
  additionalPorts:
    - name: http-forward
      port: 24224
      containerPort: 24224
      protocol: TCP

config:
  log_level: debug
  http_listen: "0.0.0.0"
  pipeline: |-
    [INPUT]
        name forward
        listen 0.0.0.0
        port 24224

    [FILTER]
        Name rewrite_tag
        Match kube.*
        Rule $syslog ^(true)$ syslog.* false
        Emitter_Name re_emitted

    [OUTPUT]
        Name syslog
        Match syslog.*
        Host $SYSLOG_SERVER
        Port 514
        Retry_Limit false
        Mode tcp
        Syslog_Format rfc5424
        Syslog_MaxSize 65536
        Syslog_Hostname_Key hostname
        Syslog_Appname_Key appname
        Syslog_Procid_Key procid
        Syslog_Msgid_Key msgid
        Syslog_SD_Key uls@0
        Syslog_Message_Key msg

    [OUTPUT]
        Name es
        Match kube.*
        HTTP_User $USER
        HTTP_Passwd $PASS
        tls Off
        tls.verify Off
        Host elastic-elasticsearch
        Port 9200
        Retry_Limit False
        Trace_Error On
        Trace_Output Off
        Suppress_Type_Name On
        Replace_Dots On
        Buffer_Size False
        Logstash_Prefix logstash
        Logstash_Format On
        Index logstash

    [OUTPUT]
        Name es
        Match host.*
        HTTP_User $USER
        HTTP_Passwd $PASS
        tls Off
        tls.verify Off
        Host elastic-elasticsearch
        Port 9200
        Retry_Limit False
        Trace_Error On
        Trace_Output Off
        Suppress_Type_Name On
        Replace_Dots On
        Buffer_Size False
        Logstash_Prefix logstash
        Logstash_Format On
        Index logstash
```
Since the log level is set to debug, the output is very noisy and I am unable to pinpoint exactly when logs stopped being sent to Elasticsearch. I have observed, however, that after a couple of hours without logs reaching Elasticsearch, a very small number of logs (around 20 documents) are sent within a single minute, none of which have gone through the Kubernetes filter, and then nothing is sent again.
The only thing that resolves this issue is restarting the StatefulSet, after which logs are sent to all expected outputs.
I will also submit an issue to the upstream fluent-bit Helm chart repository.
Here is a screenshot of the logs in the Kibana view:
@Rmaabari how have you configured the persistence?
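For context, Fluent Bit only buffers chunks to disk when filesystem storage is enabled; a minimal sketch of the classic-config settings involved (where exactly these land in the aggregator chart values is an assumption, and `storage.path` must point at a mounted volume):

```
[SERVICE]
    storage.path              /fluent-bit/data
    storage.sync              normal
    storage.backlog.mem_limit 5M

[INPUT]
    name         forward
    listen       0.0.0.0
    port         24224
    storage.type filesystem
```

Without `storage.type filesystem` on the input, chunks live only in memory and are lost whenever the pod restarts.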
I'm not sure your ES output configuration is correct; it looks like you're not constraining retries or the buffer?
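Both ES outputs above set `Retry_Limit False` (retry forever) and `Buffer_Size False` (unbounded HTTP response buffer). A hedged sketch of a more constrained output, with the specific values being illustrative rather than recommendations:

```
[OUTPUT]
    Name                es
    Match               kube.*
    Host                elastic-elasticsearch
    Port                9200
    Retry_Limit         5       # give up after 5 retries instead of retrying forever
    Buffer_Size         512KB   # bounded response buffer instead of unlimited
    Suppress_Type_Name  On
    Replace_Dots        On
    Logstash_Format     On
    Logstash_Prefix     logstash
```

With unlimited retries, failed chunks accumulate in the engine and can starve fresh data of flush capacity, which would match the "stops sending after a few hours, recovers on restart" symptom.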
I'm currently on annual leave so can't get everything on a screen to review how you've got this set up. Please add a link to the FB issue you open in this issue.
@Rmaabari is this still an issue or did you manage to resolve it?
Issue Description:
Problem: After deploying the Fluent Bit Aggregator Helm Chart and running it for a few hours, it stops sending logs to Elasticsearch and Syslog, which are the intended destinations for log forwarding.
Expected Behavior: The Fluent Bit Aggregator should consistently and reliably forward logs to the specified Elasticsearch and Syslog destinations as configured in the Helm Chart.
Steps to Reproduce:
1. Deploy Fluent Bit Aggregator using the provided Helm Chart.
2. Monitor the log forwarding functionality for a few hours.
3. Observe that log forwarding to Elasticsearch and Syslog ceases after a certain period.
Actual Results: After an initial period of successful log forwarding, Fluent Bit Aggregator stops sending logs to Elasticsearch and Syslog without any apparent errors or warnings.
Environment Details:
- Kubernetes Cluster Version: 1.26
- Fluent Bit Agents Version: 2.1.8
- Fluent Bit Aggregator Version: 2.1.9
- Elasticsearch Version: 8.9
aggregator config:
fluent-bit agents config: