fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0

Fluent-bit OOM on Tail Input while assembling big CRI log lines in Kubernetes #9097

Open juan-domenech opened 3 months ago

juan-domenech commented 3 months ago

Bug Report

Fluent-bit gets killed by Kubernetes due to Out Of Memory (OOM) while trying to assemble (very) long log lines in Kubernetes, no matter the value of Buffer_Max_Size.

Currently in Kubernetes, log lines longer than 16 KBytes get truncated and appended independently to the log files using partial flags (CRI log format). Fluent-bit's Tail input can put them back together using multiline.parser cri, but this operation does not take into account the configured memory limits, i.e. the assembly loop can go out of bounds if the final result is big enough.
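
For reference, this is roughly what such a line looks like on disk in the CRI log format (timestamps and content below are made up): each ~16 KByte chunk is written as its own line tagged P (partial) until a final F (full) line closes the record, and multiline.parser cri concatenates the chunks back into a single event.

    2024-07-01T12:00:00.000000001Z stdout P {"msg":"very long payload, first 16K chunk ...
    2024-07-01T12:00:00.000000002Z stdout P ... middle 16K chunk ...
    2024-07-01T12:00:00.000000003Z stdout F ... final chunk"}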

This is a known issue and is mentioned in the official Multiline documentation:

The multiline parser is not affected by the buffer_max_size configuration option, allowing the composed log record to grow beyond this size.

But when this growth exceeds the memory assigned to the container, the container gets killed, creating disturbances in production environments.
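
For context, a minimal Tail input along these lines matches the setup being described (the path and values are illustrative assumptions, not taken from this report):

    [INPUT]
        Name              tail
        # Path is an assumption for a typical Kubernetes node.
        Path              /var/log/containers/*.log
        # Reassembles CRI partial (P) chunks back into single records.
        multiline.parser  cri
        # Caps the buffer used to read one raw line from the file; per the
        # documentation quoted above, it does NOT cap the record assembled
        # by the multiline parser.
        Buffer_Max_Size   32k
        # Skip raw lines longer than Buffer_Max_Size rather than stop
        # monitoring the file.
        Skip_Long_Lines   On
        # Memory cap for records queued by this input.
        Mem_Buf_Limit     50MB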

To Reproduce

There are no Fluent-bit logs showing any issue. It dies before having a chance to say anything. This is the K8s node dmesg output:

[ 9495.772981] flb-pipeline invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=-997
[ 9495.776705] CPU: 1 PID: 136344 Comm: flb-pipeline Not tainted 5.10.207 #1
[ 9495.777567] Hardware name: QEMU QEMU Virtual Machine, BIOS edk2-stable202402-prebuilt.qemu.org 02/14/2024
[ 9495.778833] Call trace:
[ 9495.779823]  dump_backtrace+0x0/0x1b0
[ 9495.780610]  show_stack+0x18/0x30
[ 9495.781660]  dump_stack+0xe8/0x124
[ 9495.781992]  dump_header+0x48/0x1f8
[ 9495.782488]  oom_kill_process+0x224/0x230
[ 9495.783049]  out_of_memory+0x178/0x550
[ 9495.783560]  mem_cgroup_out_of_memory+0x130/0x150
[ 9495.783995]  try_charge+0x610/0x720
[ 9495.784287]  mem_cgroup_charge+0x7c/0x270
[ 9495.784666]  handle_mm_fault+0xca4/0x1370
[ 9495.784989]  do_page_fault+0x130/0x3a0
[ 9495.785305]  do_translation_fault+0xb0/0xc0
[ 9495.785744]  do_mem_abort+0x40/0xa0
[ 9495.786300]  el0_da+0x28/0x40
[ 9495.786657]  el0_sync_handler+0xe8/0x130
[ 9495.787001]  el0_sync+0x180/0x1c0
[ 9495.788071] memory: usage 102360kB, limit 102400kB, failcnt 13589
[ 9495.788563] swap: usage 0kB, limit 9007199254740988kB, failcnt 0
[ 9495.788908] Memory cgroup stats for /kubepods/burstable/pod64f21c2f-0bfd-4e77-8b32-a4f681ce6d74:
[ 9495.788916] anon 103645184
               file 0
               kernel_stack 49152
               percpu 0
               sock 0
               shmem 0
               file_mapped 0
               file_dirty 135168
               file_writeback 135168
               anon_thp 0
               inactive_anon 103530496
               active_anon 0
               inactive_file 0
               active_file 151552
               unevictable 0
               slab_reclaimable 388552
               slab_unreclaimable 131136
               slab 519688
               workingset_refault_anon 0
               workingset_refault_file 528
               workingset_activate_anon 0
               workingset_activate_file 462
               workingset_restore_anon 0
               workingset_restore_file 363
               workingset_nodereclaim 0
               pgfault 1631817
               pgmajfault 0
               pgrefill 1001
               pgscan 8962
               pgsteal 6680
               pgactivate 4125
               pgdeactivate 1001
               pglazyfree 0
               pglazyfreed 0
               thp_fault_alloc 1188
               thp_collapse_alloc 0
[ 9495.799049] Tasks state (memory values in pages):
[ 9495.799357] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[ 9495.799936] [ 136297] 65535 136297      198        1    36864        0          -998 pause
[ 9495.800734] [ 136327]     0 136327    47108    27887   356352        0          -997 fluent-bit
[ 9495.801311] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=1777114cde7d273e9e609a956a64dbef90e552291d5208f44dbc3604a47ec9ae,mems_allowed=0,oom_memcg=/kubepods/burstable/pod64f21c2f-0bfd-4e77-8b32-a4f681ce6d74,task_memcg=/kubepods/burstable/pod64f21c2f-0bfd-4e77-8b32-a4f681ce6d74/1777114cde7d273e9e609a956a64dbef90e552291d5208f44dbc3604a47ec9ae,task=fluent-bit,pid=136327,uid=0
[ 9495.803907] Memory cgroup out of memory: Killed process 136327 (fluent-bit) total-vm:188432kB, anon-rss:101408kB, file-rss:10140kB, shmem-rss:0kB, UID:0 pgtables:348kB oom_score_adj:-997
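
As a sketch of one way to generate such a line (illustrative, not the exact steps used here): a pod that prints a single log line bigger than the Fluent-bit container memory limit (100 MiB in the cgroup dump above) gets split by the runtime into 16 KByte CRI chunks, which the cri multiline parser then tries to reassemble in memory.

    # Hypothetical example: print one ~200 MB log line to stdout.
    kubectl run longline --image=python:3 --restart=Never -- \
        python -c "print('a' * 200_000_000)"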

Expected behavior

Fluent-bit should skip partial log lines in the multiline.parser cri Tail loop when the memory limit is reached.

Your Environment

Workaround

(No workaround necessary.) Fortunately, Fluent-bit recovers gracefully after this and does not get into a crash loop (i.e. trying to assemble the same log lines again). By the looks of it, during the assembly loop it manages to update the file pointer (stored in the DB), and the new instance picks up where the previous one left off. In some cases this could trigger another OOM if the long lines persist, but eventually the problematic data will be "processed", producing truncated log events (not ideal, but not critical).
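
The file pointer mentioned here is the Tail input's position database; a minimal sketch of enabling it (the path is an assumption):

    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        multiline.parser  cri
        # Persist per-file read offsets in SQLite so a restarted instance
        # resumes from the last committed position instead of re-reading files.
        DB                /var/log/flb_kube.db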

References

There are other issues describing similar OOM problems, like #5711 and #5685, but I think they relate to back pressure coming from Outputs. I think this is a different matter.

Thanks!

amanbisht commented 2 months ago

Is there any update on this? I can confirm that I'm also facing the same issue.

juan-domenech commented 2 months ago

@amanbisht Can you do me the favour of validating my "To Reproduce" steps? It would give me peace of mind to know that others can reproduce this issue in their environments. Thank you!

amanbisht commented 2 months ago

In my case, we migrated from 2.1.3 to 3.1.6 and started seeing a lot of OOM errors. I can confirm it's happening because of very long log lines. We identified the very long log lines and either suppressed or shortened them; once we did that, the OOM-related errors stopped showing up.