fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0

memory usage of plugins #1722

Closed · sysword closed this issue 2 years ago

sysword commented 4 years ago

Is your feature request related to a problem? Please describe.

I'm using Fluent Bit on Kubernetes to collect logs. The pod's memory limit is 128MB and I set Mem_Buf_Limit 5MB on my tail input; the output plugin is es. But when a big log file exists, Fluent Bit's memory usage keeps growing until the 128MB limit is exceeded. Since I already set Mem_Buf_Limit, I want to find out where that memory is going. I tried to query the metrics and got this:

```
{
  "input": {
    "tail.0": {
      "records": 23220353,
      "bytes": 1314318730,
      "files_opened": 18,
      "files_closed": 0,
      "files_rotated": 0
    }
  },
  "filter": {
    "kubernetes.0": {
      "drop_records": 0,
      "add_records": 0
    }
  },
  "output": {
    "es.0": {
      "proc_records": 220350,
      "proc_bytes": 94534878,
      "errors": 0,
      "retries": 35672,
      "retries_failed": 6723
    }
  }
}
```

I want to know the current memory usage of each plugin, not just the total bytes of data processed. Since Mem_Buf_Limit works, I believe there must be a way to know each plugin's memory usage.

Describe the solution you'd like

I would be able to see each plugin's memory usage in the metrics.
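
For reference: the JSON above is what Fluent Bit's built-in HTTP server returns (this setup enables it with HTTP_Server On, so it should be reachable on port 2020, e.g. at /api/v1/metrics). Later Fluent Bit releases can additionally expose storage-layer metrics, which report per-input chunk counts and sizes and come closer to answering the per-plugin memory question. A minimal sketch, assuming a version that supports the storage.metrics option:

```
[SERVICE]
    HTTP_Server     On
    HTTP_Listen     0.0.0.0
    HTTP_Port       2020
    # Report storage-layer metrics (per-input chunk counts/bytes)
    # through the HTTP server in addition to /api/v1/metrics;
    # in recent versions this data is served under /api/v1/storage.
    storage.metrics On
```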

edsiper commented 4 years ago

Please share your full configuration.

sysword commented 4 years ago

Sorry for omitting the configuration. With this config it works fine at first, but when errors occur Fluent Bit consumes a lot of memory until it exceeds the resource limit, and it stays that way. For example, when my Elasticsearch cannot keep up with the data coming from Fluent Bit and rejects it, Fluent Bit returns an error, and in that case its memory usage grows. I did set mem_buf_limit, so why does Fluent Bit consume so much memory?


```
data:
  fluent-bit-filter.conf: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_Tag_Prefix     kube.var.log.containers.
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Labels              Off
        Annotations         Off
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On
  fluent-bit-input.conf: |
    [INPUT]
        Name             tail
        Path             /var/log/containers/*.log
        Parser           docker
        Tag              kube.*
        Refresh_Interval 5
        Mem_Buf_Limit    5MB
        Skip_Long_Lines  On
        DB               /tail-db/tail-containers-state.db
        DB.Sync          Normal
  fluent-bit-output.conf: |
    [OUTPUT]
        Name  es
        Match *
        Host  elasticsearch
        Port  9200
        Logstash_Format On
        Retry_Limit 2
        Type  flb_type
        Time_Key @timestamp
        Replace_Dots On
        Logstash_Prefix kubernetes_cluster
  fluent-bit-service.conf: |
    [SERVICE]
        Flush        1
        Daemon       Off
        Log_Level    info
        Parsers_File parsers.conf
        Refresh_Interval 5
        HTTP_Server  On
        HTTP_Listen  0.0.0.0
        HTTP_Port    2020
  fluent-bit.conf: |
    @INCLUDE fluent-bit-service.conf
    @INCLUDE fluent-bit-input.conf
    @INCLUDE fluent-bit-filter.conf
    @INCLUDE fluent-bit-output.conf
  parsers.conf: ""
kind: ConfigMap
metadata:
  creationTimestamp: "2019-11-11T06:31:09Z"
  labels:
    app: fluent-bit
    chart: fluent-bit-2.7.0
    heritage: Tiller
    release: fluent-bit
  name: fluent-bit-config
  namespace: logging
  resourceVersion: "17196825"
  selfLink: /api/v1/namespaces/logging/configmaps/fluent-bit-config
  uid: d9149090-044c-11ea-a216-524b6597ff14
```
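
For anyone hitting the same backpressure pattern (Elasticsearch rejecting data while retries pile up), one way to keep memory bounded is to buffer chunks on the filesystem instead of purely in memory. A minimal sketch, assuming a Fluent Bit version that supports the storage.* options; the storage path and limits are illustrative and not part of the configuration above:

```
[SERVICE]
    Flush                     1
    # Spill chunks to disk so a slow or rejecting output does not
    # translate directly into RSS growth.
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.backlog.mem_limit 5M

[INPUT]
    Name             tail
    Path             /var/log/containers/*.log
    Mem_Buf_Limit    5MB
    # Chunks beyond the in-memory budget are kept on the filesystem
    # while the es output retries.
    storage.type     filesystem
```

With filesystem buffering, retried chunks live mostly on disk, so memory usage should stay close to the configured limits rather than growing with the backlog.
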
edsiper commented 4 years ago

I tried to replicate the memory issue you mentioned without success.

Would you please provide more details about the memory consumption? How are you monitoring it?

sysword commented 4 years ago

I cannot replicate this issue either; in recent days it has been working well. I use Prometheus to monitor my components.

bluebike commented 4 years ago

I have been thinking that the way the plugin API works is not very memory efficient when many modifying filters are added.
One big chunk (around 5MB?) is handed to a filter, and if the filter modifies the data it must allocate (with flb_malloc) another, larger memory buffer.
The next filter may need an even bigger buffer still. So while one chunk is being processed, multiple copies of it need dynamic memory at the same time, and memory fragmentation can make things worse. Filters (and outputs) also unpack msgpack quite often just to check small things.

I haven't really tested (or measured) this, but I assume Fluent Bit's dynamic memory usage can sometimes be surprisingly large in the multi-filter case.

To fix this, basically the whole plugin API would have to be changed.

For example: a) the msgpack chunk could carry some extra headroom, so a modifying filter would not always need to allocate a new buffer (assuming the headroom is big enough...).

b) msgpack reads could mostly use functions that don't need dynamic memory allocation.

c) The whole basic processing path (inputs, filters) could be done on unpacked msgpack.

d) The chunk given to a filter could be a list of per-event msgpacks, which could be modified, removed, or added individually.

But... that would probably be material for the next major version.

edsiper commented 4 years ago

Interesting point. I am wondering if we can use some jemalloc tricks to allocate memory per filter, with each filter associated with one arena or something similar that we can later monitor...

From the msgpack perspective, there are many cases where unpacking into a new memory buffer is not required; as a workaround we also quietly bundle 'mpack', which has more helpers for zero-copy data processing/unpacking.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] commented 2 years ago

This issue was closed because it has been stalled for 5 days with no activity.

liguangcheng commented 1 year ago

@edsiper this is from the official Fluent Bit website:

A workaround for this backpressure scenario is to limit the amount of memory in records that an input plugin can register, this configuration property is called mem_buf_limit. If a plugin has enqueued more than the mem_buf_limit, it won't be able to ingest more until that data can be delivered or flushed properly

Per my understanding, when I set mem_buf_limit (20M) for an input, the input should consume at most that much memory, but in practice the memory always grows to 10G, which far exceeds 20M. Why does Fluent Bit consume so much memory?

I have many very big files; is it caused by these files? I also set Buffer_Max_Size to 1M.

liguangcheng commented 1 year ago

@edsiper I cannot understand the relationship between Buffer_Max_Size and mem_buf_limit; it seems both can affect the buffer the input uses.
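
For what it's worth, my reading of the documentation is that the two options act at different levels: Buffer_Chunk_Size/Buffer_Max_Size size the read buffer the tail input uses per monitored file, while Mem_Buf_Limit caps the total memory of records the input may keep enqueued before it is paused. A commented sketch (the values are only illustrative):

```
[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    # Per-file read buffer: starts at Buffer_Chunk_Size and can grow
    # up to Buffer_Max_Size to hold a long line; with Skip_Long_Lines On,
    # lines that still do not fit are skipped.
    Buffer_Chunk_Size 32k
    Buffer_Max_Size   1M
    Skip_Long_Lines   On
    # Total memory budget for records this input has ingested but not
    # yet delivered; when it is reached the input is paused.
    Mem_Buf_Limit     20MB
```

Neither option covers memory used elsewhere in the pipeline (for example the output plugin formatting and retrying requests), which may be part of why overall usage can grow well beyond Mem_Buf_Limit.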