Closed kthhrv closed 2 weeks ago
I hit the same issue on a k8s cluster while using the fluentd image fluent/fluentd-kubernetes-daemonset:v1.16.3-debian-elasticsearch8-2.1 with the systemd plugin.
The error:
fluentd-sbpzp fluentd 2024-01-28 08:20:57 +0000 [trace]: #0 [es_kubelet] writing events into buffer instance=2360 metadata_size=1
fluentd-sbpzp fluentd /fluentd/vendor/bundle/ruby/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal.rb:325: [BUG] Segmentation fault at 0x0000000000000008
fluentd-sbpzp fluentd ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
fluentd-sbpzp fluentd
fluentd-sbpzp fluentd -- Control frame information -----------------------------------------------
fluentd-sbpzp fluentd c:0013 p:---- s:0064 e:000063 CFUNC :free
fluentd-sbpzp fluentd c:0012 p:0012 s:0059 e:000058 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal.rb:325
fluentd-sbpzp fluentd c:0011 p:0042 s:0053 e:000052 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal/navigable.rb:13
fluentd-sbpzp fluentd c:0010 p:0018 s:0047 e:000044 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:151
fluentd-sbpzp fluentd c:0009 p:0013 s:0040 e:000039 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:144
fluentd-sbpzp fluentd c:0008 p:0032 s:0034 e:000033 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:121 [FINISH]
fluentd-sbpzp fluentd c:0007 p:---- s:0030 e:000029 IFUNC
fluentd-sbpzp fluentd c:0006 p:0012 s:0027 e:000026 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/plugin_helper/timer.rb:80 [FINISH]
fluentd-sbpzp fluentd c:0005 p:---- s:0022 e:000021 CFUNC :run_once
fluentd-sbpzp fluentd c:0004 p:0034 s:0017 e:000016 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/cool.io-1.8.0/lib/cool.io/loop.rb:88
fluentd-sbpzp fluentd c:0003 p:0026 s:0012 e:000011 BLOCK /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/plugin_helper/event_loop.rb:93
fluentd-sbpzp fluentd c:0002 p:0050 s:0008 e:000007 BLOCK /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/plugin_helper/thread.rb:78 [FINISH]
fluentd-sbpzp fluentd c:0001 p:---- s:0003 e:000002 DUMMY [FINISH]
fluentd-sbpzp fluentd
fluentd-sbpzp fluentd -- Ruby level backtrace information ----------------------------------------
This looks related to https://github.com/ledbettj/systemd-journal/issues/93
and was also reported here: https://github.com/fluent/fluent-package-builder/issues/369
The workaround suggested in https://github.com/fluent/fluent-package-builder/issues/369 does remove the crashes. However, in my case (same as redliu312 in the previous message), systemd logs are not read with the workaround applied. If I leave the env variable untouched, fluentd crashes and, upon restart, all the systemd logs are flushed and indexed.
Currently, we need to avoid this problem by unsetting LD_PRELOAD in some environments.
It is probably not a problem with this plugin, but it seems to be a problem with libjemalloc and systemd-journal.
It's a problem to have a segmentation fault in the default state of some Fluentd distributions, so we need to consider possible solutions.
Hi, this issue is discussed over at https://github.com/ledbettj/systemd-journal/pull/96
The problem is that the native library, libsystemd, allocates memory using the system allocator and then expects the calling code to free it. Most libraries that allocate memory provide a corresponding _free() method which can be called to release it, but libsystemd does not in this case.
The caller must call the corresponding free() implementation that was used for malloc() -- libc, tcmalloc, jemalloc, etc -- otherwise the process will likely crash.
In Ruby/FFI land, I don't know any good way to determine which allocator was used by the native code. Trying to load the jemalloc library when the libc implementation is already in use will cause issues :(
One option might be to provide a separate native library shim that just exposes a free wrapper (which would hopefully pick up on the LD_PRELOAD), but I don't know if that will be foolproof either.
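A minimal sketch of the shim idea described above, assuming the shim is built as ordinary native code so that its call to free() is resolved by the dynamic linker the same way libsystemd's malloc() is (including any allocator injected via LD_PRELOAD). The names shim_free and fake_native_alloc are hypothetical and are not taken from libsystemd or the systemd-journal gem. fake_native_alloc only stands in for a native call that hands a malloc()'d buffer back to the caller.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a native call that returns a malloc()'d buffer
 * which the caller is expected to release.
 */
char *fake_native_alloc(const char *text) {
    size_t len = strlen(text) + 1;   /* include the trailing NUL */
    char *copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, text, len);
    }
    return copy;
}

/*
 * Hypothetical shim entry point. It only forwards to free(). The assumption
 * is that, compiled as native code, this free() resolves to the same
 * allocator that served the native malloc() above, rather than to a libc
 * free() looked up separately by the FFI layer.
 */
void shim_free(void *ptr) {
    free(ptr);   /* free(NULL) is a no-op, so NULL is safe to pass */
}
```

Under that assumption, the Ruby side would release buffers through the shim's wrapper instead of binding libc's free() directly. Whether this holds in every deployment is exactly the open question raised above.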
Hi friends,
I've published a version 2.0.0 of systemd-journal gem which includes a native shim as described above to attempt to work around this allocator mismatch.
Please give it a try and let me know if you run into any issues.
It seems that this issue was fixed in systemd-journal 2.0.0, so we shipped fluent-plugin-systemd 1.1.0 which adopts systemd-journal 2.0.0 or later.
Please use fluent-plugin-systemd 1.1.0.
This works on 20.04, but on 22.04 fluentd silently fails.
I'm using the latest fluentd.
running fluentd manually with
I get
config