fluent / helm-charts

Helm Charts for Fluentd and Fluent Bit
Apache License 2.0

Fluentbit is not flushing old/stale node exporter metrics #494

Open vaitiwari opened 2 months ago

vaitiwari commented 2 months ago

Fluentbit does not flush or remove old node exporter device metrics for devices that are no longer part of the system; it keeps exposing them. Only restarting the node exporter resolves the issue. I am able to reproduce the issue locally.


1) Port-forward any fluentbit pod and run curl http://localhost:2021/metrics
2) Note the number of metric series exposed, for example for node_network_transmit_colls_total
3) Delete some pods and create some new pods on the node where the port-forwarded fluentbit pod is running.
4) Notice that the number of metric series exposed at http://localhost:2021/metrics for node_network_transmit_colls_total has increased.
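Counting the series by eye is error-prone, so the comparison in the steps above can be scripted. The following is a minimal sketch, assuming the endpoint returns the standard Prometheus text exposition format; the helper names `count_series` and `scrape` and the sample payload are illustrative and not part of Fluent Bit.

```python
import urllib.request
from collections import Counter


def count_series(exposition_text: str) -> Counter:
    """Count exposed series per metric name in Prometheus text format."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first "{" (labels) or at whitespace.
        name = line.split("{", 1)[0].split()[0]
        counts[name] += 1
    return counts


def scrape(url: str = "http://localhost:2021/metrics") -> str:
    """Fetch the exposition text from the port-forwarded pod (assumed URL)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


# Hypothetical sample payload, used here instead of a live scrape.
SAMPLE = """\
# HELP node_network_transmit_colls_total Network device statistic transmit_colls.
# TYPE node_network_transmit_colls_total counter
node_network_transmit_colls_total{device="eth0"} 0
node_network_transmit_colls_total{device="veth1a2b"} 0
node_network_transmit_colls_total{device="veth3c4d"} 0
node_load1 0.42
"""

counts = count_series(SAMPLE)
print(counts["node_network_transmit_colls_total"])  # prints 3
```

Against a live pod, calling `count_series(scrape())` before and after recreating the pods gives the two numbers to compare.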

Expected Result: The metrics for deleted pods should be removed.

Actual Result: The metrics for deleted pods persist and are still exposed. We need to restart the fluentbit pods to get rid of the stale metrics.
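To show which series are stale rather than only that the count grew, the exposed device names can be compared with the interfaces that currently exist on the node. This is a rough sketch under two assumptions: that the series carry a `device` label (as upstream node exporter network metrics do), and that the list of live interfaces is collected separately, for example from /sys/class/net on the host. The function names are hypothetical.

```python
import re

# Matches the value of a device="..." label on an exposition line.
DEVICE_RE = re.compile(r'device="([^"]*)"')


def exposed_devices(exposition_text: str, metric: str) -> set:
    """Return the device label values exposed for one metric."""
    devices = set()
    for line in exposition_text.splitlines():
        # Only consider labelled series of exactly this metric.
        if line.startswith(metric + "{"):
            match = DEVICE_RE.search(line)
            if match:
                devices.add(match.group(1))
    return devices


def stale_devices(exposition_text: str, metric: str, live_devices) -> set:
    """Devices still exposed although they no longer exist on the node."""
    return exposed_devices(exposition_text, metric) - set(live_devices)


# Hypothetical scrape taken after the pod that owned veth1a2b was deleted.
SCRAPE = """\
node_network_transmit_colls_total{device="eth0"} 0
node_network_transmit_colls_total{device="veth1a2b"} 0
node_network_transmit_colls_total{device="veth3c4d"} 0
"""

# Hypothetical list of interfaces that still exist on the node.
LIVE = ["eth0", "veth3c4d"]

stale = stale_devices(SCRAPE, "node_network_transmit_colls_total", LIVE)
print(sorted(stale))  # prints ['veth1a2b']
```

With the expected behaviour the result would be empty; a non-empty result that keeps growing as pods are recreated matches the behaviour reported here.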

Please note that this issue is causing memory pressure, and the pods are getting OOMKilled.

Below are the details of the environment used:

Below is the fluentbit configmap:

    [PARSER]
        Name docker_no_time
        Format json
        Time_Keep Off
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
  fluent-bit.conf: |
    [SERVICE]
        Daemon Off
        Flush 1
        Log_Level debug
        Parsers_File /fluent-bit/etc/parsers.conf
        Parsers_File /fluent-bit/etc/conf/custom_parsers.conf
        HTTP_Server On
        HTTP_Port 2020
        Health_Check On

    [INPUT]
        Name tail
        Path /var/log/containers/*.log
        multiline.parser docker, cri
        Tag kube.*
        Mem_Buf_Limit 5MB
        Skip_Long_Lines On

    [INPUT]
        Name systemd
        Tag host.*
        Systemd_Filter _SYSTEMD_UNIT=kubelet.service
        Read_From_Tail On

    [INPUT]
        name node_exporter_metrics
        tag node_metrics
        scrape_interval 5
        path.procfs /proc
        path.sysfs /sys
        Mem_Buf_Limit 50MB
        filesystem.ignore_mount_point_regex `^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)`
        filesystem.ignore_filesystem_type_regex `^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$`

    [FILTER]
        Name kubernetes
        Match kube.*
        Merge_Log On
        Keep_Log Off
        K8S-Logging.Parser On
        K8S-Logging.Exclude On

    [OUTPUT]
        name            prometheus_exporter
        match           node_metrics
        port            2021