fluent / fluentd-docker-image

Docker image for Fluentd
https://hub.docker.com/r/fluent/fluentd/
Apache License 2.0

Segmentation fault after updating from version 1.16.3 to 1.16.4 #378

Closed HolgerHees closed 2 months ago

HolgerHees commented 8 months ago

Describe the bug

After updating the official Docker container from fluent/fluentd:v1.16.3 to fluent/fluentd:v1.16.4, I get a segmentation fault during startup, which ends in an endless restart loop.

Additionally I have the following gem modules installed

To Reproduce

just update and restart

Expected behavior

should not crash

Your Environment

- Fluentd version: 1.16.4
- TD Agent version:
- Operating system: Docker with opensuse leap 15.5
- Kernel version: 5.14.21-150500.55.52-default

Your Configuration

Since the crash happens in the journald input, here is the journald part of my configuration:

<source>
  @type systemd
  tag systemd
  path /var/log/journal
  matches [{ "PRIORITY": [0,1,2,3,4,5,6] }]
  <storage>
    @type local
    persistent false
    path systemd.pos
  </storage>
  <entry>
    fields_strip_underscores true
    fields_lowercase true
    #field_map {"MESSAGE": "log", "_PID": ["process", "pid"], "_CMDLINE": "process", "_COMM": "cmd"}
  </entry>
</source>

Your Error Log

Mär 23 09:14:12 marvin fluentd[1519]: /usr/local/lib/ruby/gems/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal.rb:325: [BUG] Segmentation fault at 0x0000000000000008
Mär 23 09:14:12 marvin fluentd[1519]: ruby 3.2.3 (2024-01-18 revision 52bb2ac0a6) [x86_64-linux]
Mär 23 09:14:12 marvin fluentd[1519]: 
Mär 23 09:14:12 marvin fluentd[1519]: -- Control frame information -----------------------------------------------
Mär 23 09:14:12 marvin fluentd[1519]: c:0013 p:---- s:0064 e:000063 CFUNC  :free
Mär 23 09:14:12 marvin fluentd[1519]: c:0012 p:0012 s:0059 e:000058 METHOD /usr/local/lib/ruby/gems/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal.rb:325
Mär 23 09:14:12 marvin fluentd[1519]: c:0011 p:0042 s:0053 e:000052 METHOD /usr/local/lib/ruby/gems/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal/navigable.rb:13
Mär 23 09:14:12 marvin fluentd[1519]: c:0010 p:0018 s:0047 e:000044 METHOD /usr/local/lib/ruby/gems/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:151
Mär 23 09:14:12 marvin fluentd[1519]: c:0009 p:0013 s:0040 e:000039 METHOD /usr/local/lib/ruby/gems/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:144
Mär 23 09:14:12 marvin fluentd[1519]: c:0008 p:0032 s:0034 e:000033 METHOD /usr/local/lib/ruby/gems/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:121 [FINISH]
Mär 23 09:14:12 marvin fluentd[1519]: c:0007 p:---- s:0030 e:000029 IFUNC 
Mär 23 09:14:12 marvin fluentd[1519]: c:0006 p:0012 s:0027 e:000026 METHOD /usr/local/lib/ruby/gems/3.2.0/gems/fluentd-1.16.4/lib/fluent/plugin_helper/timer.rb:80 [FINISH]
Mär 23 09:14:12 marvin fluentd[1519]: c:0005 p:---- s:0022 e:000021 CFUNC  :run_once
Mär 23 09:14:12 marvin fluentd[1519]: c:0004 p:0034 s:0017 e:000016 METHOD /usr/local/lib/ruby/gems/3.2.0/gems/cool.io-1.8.0/lib/cool.io/loop.rb:88
Mär 23 09:14:12 marvin fluentd[1519]: c:0003 p:0026 s:0012 e:000011 BLOCK  /usr/local/lib/ruby/gems/3.2.0/gems/fluentd-1.16.4/lib/fluent/plugin_helper/event_loop.rb:93
Mär 23 09:14:12 marvin fluentd[1519]: c:0002 p:0050 s:0008 e:000007 BLOCK  /usr/local/lib/ruby/gems/3.2.0/gems/fluentd-1.16.4/lib/fluent/plugin_helper/thread.rb:78 [FINISH]
Mär 23 09:14:12 marvin fluentd[1519]: c:0001 p:---- s:0003 e:000002 DUMMY  [FINISH]
Mär 23 09:14:12 marvin fluentd[1519]: 
Mär 23 09:14:12 marvin fluentd[1519]: -- Ruby level backtrace information

Additional context

No response

kenhys commented 8 months ago

Customize Dockerfile like this:

FROM fluent/fluentd:v1.16.4-debian-amd64-1.0

# Use root account to use apt
USER root

# The RUN below installs plugins as examples; elasticsearch is not required.
# Customize the list of plugins as you wish.
RUN buildDeps="sudo make gcc g++ libc-dev" \
 && apt-get update \
 && apt-get install -y --no-install-recommends $buildDeps \
 && sudo gem install fluent-plugin-systemd \
 fluent-plugin-record-modifier \
 fluent-plugin-grafana-loki \
 fluent-plugin-rewrite-tag-filter \
 && sudo gem sources --clear-all \
 && rm -rf /var/lib/apt/lists/* \
 && rm -rf /tmp/* /var/tmp/* /usr/lib/ruby/gems/*/cache/*.gem

COPY fluent.conf /fluentd/etc/

Then build the image. Running this custom build can cause a SEGV:

 docker run --rm -it -v /var/log/journal:/var/log/journal 378
2024-03-25 02:29:46 +0000 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil
2024-03-25 02:29:46 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/fluent.conf"
2024-03-25 02:29:46 +0000 [info]: gem 'fluentd' version '1.16.4'
2024-03-25 02:29:46 +0000 [info]: gem 'fluent-plugin-grafana-loki' version '1.2.20'
2024-03-25 02:29:46 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.2.0'
2024-03-25 02:29:46 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.4.0'
2024-03-25 02:29:46 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.5'
2024-03-25 02:29:46 +0000 [info]: using configuration file: <ROOT>
  <source>
    @type systemd
    tag "systemd"
    path "/var/log/journal"
    matches [{"PRIORITY":[0,1,2,3,4,5,6]}]
    <storage>
      @type "local"
      persistent false
      path "systemd.pos"
    </storage>
    <entry>
      fields_strip_underscores true
      fields_lowercase true
    </entry>
  </source>
</ROOT>
2024-03-25 02:29:46 +0000 [info]: starting fluentd-1.16.4 pid=7 ruby="3.2.3"
2024-03-25 02:29:46 +0000 [info]: spawn command to main:  cmdline=["/usr/local/bin/ruby", "-Eascii-8bit:ascii-8bit", "/usr/local/bundle/bin/fluentd", "--config", "/fluentd/etc/fluent.conf", "--plugin", "/fluentd/plugins", "--under-supervisor"]
2024-03-25 02:29:47 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil
2024-03-25 02:29:47 +0000 [info]: adding source type="systemd"
2024-03-25 02:29:47 +0000 [info]: #0 starting fluentd worker pid=16 ppid=7 worker=0
2024-03-25 02:29:47 +0000 [info]: #0 fluentd worker is now running worker=0
2024-03-25 02:29:48 +0000 [warn]: #0 no patterns matched tag="systemd"
free(): invalid pointer
2024-03-25 02:29:48 +0000 [error]: Worker 0 exited unexpectedly with signal SIGABRT

kenhys commented 8 months ago

Workaround: disable jemalloc in the customized container image.

Set LD_PRELOAD to an empty value (LD_PRELOAD="").

docker run --rm -it -e LD_PRELOAD="" -v /var/log/journal:/var/log/journal 378
2024-03-25 02:38:00 +0000 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil
2024-03-25 02:38:00 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/fluent.conf"
2024-03-25 02:38:00 +0000 [info]: gem 'fluentd' version '1.16.4'
2024-03-25 02:38:00 +0000 [info]: gem 'fluent-plugin-grafana-loki' version '1.2.20'
2024-03-25 02:38:00 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.2.0'
2024-03-25 02:38:00 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.4.0'
2024-03-25 02:38:00 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.5'
2024-03-25 02:38:01 +0000 [info]: using configuration file: <ROOT>
  <source>
    @type systemd
    tag "systemd"
    path "/var/log/journal"
    matches [{"PRIORITY":[0,1,2,3,4,5,6]}]
    <storage>
      @type "local"
      persistent false
      path "systemd.pos"
    </storage>
    <entry>
      fields_strip_underscores true
      fields_lowercase true
    </entry>
  </source>
</ROOT>
2024-03-25 02:38:01 +0000 [info]: starting fluentd-1.16.4 pid=7 ruby="3.2.3"
2024-03-25 02:38:01 +0000 [info]: spawn command to main:  cmdline=["/usr/local/bin/ruby", "-Eascii-8bit:ascii-8bit", "/usr/local/bundle/bin/fluentd", "--config", "/fluentd/etc/fluent.conf", "--plugin", "/fluentd/plugins", "--under-supervisor"]
2024-03-25 02:38:01 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil
2024-03-25 02:38:01 +0000 [info]: adding source type="systemd"
2024-03-25 02:38:01 +0000 [info]: #0 starting fluentd worker pid=16 ppid=7 worker=0
2024-03-25 02:38:01 +0000 [info]: #0 fluentd worker is now running worker=0
2024-03-25 02:38:02 +0000 [warn]: #0 no patterns matched tag="systemd"
2024-03-25 02:38:03 +0000 [warn]: #0 no patterns matched tag="systemd"
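
Whether this workaround changes anything depends on jemalloc actually being preloaded. A minimal Ruby sketch for checking that inside the container, assuming the image enables jemalloc through the LD_PRELOAD environment variable (which the workaround implies). The helper name `jemalloc_preloaded?` and the sample library path are hypothetical:

```ruby
# Returns true when any entry of LD_PRELOAD looks like a jemalloc library.
# Assumption: jemalloc is enabled via LD_PRELOAD; the helper name is made up.
def jemalloc_preloaded?(env = ENV)
  # The dynamic linker accepts both spaces and colons as separators.
  env.fetch("LD_PRELOAD", "").split(/[\s:]+/).any? do |lib|
    File.basename(lib).include?("jemalloc")
  end
end

# Hypothetical path, for illustration only.
sample = { "LD_PRELOAD" => "/usr/lib/x86_64-linux-gnu/libjemalloc.so.2" }
puts jemalloc_preloaded?(sample)                   # a jemalloc entry is present
puts jemalloc_preloaded?({ "LD_PRELOAD" => "" })   # workaround applied
```

Calling `jemalloc_preloaded?` without arguments inspects the environment of the running process.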

daipom commented 8 months ago

Ref: https://github.com/fluent/fluent-package-builder/issues/369

kenhys commented 8 months ago

MEMO:

It seems that it crashed here:

https://github.com/ledbettj/systemd-journal/blob/f3365c1147baeed2032b9c0ae223905d57216ce1/lib/systemd/journal.rb#L323-L327

    def self.read_and_free_outstr(ptr)
      str = ptr.read_string
      LibC.free(ptr)
      str
    end

LibC.free is called via read_and_free_outstr in Journal.cursor.

https://github.com/ledbettj/systemd-journal/blob/f3365c1147baeed2032b9c0ae223905d57216ce1/lib/systemd/journal/navigable.rb#L13

      def cursor
        out_ptr = FFI::MemoryPointer.new(:pointer, 1)
        if (rc = Native.sd_journal_get_cursor(@ptr, out_ptr)) < 0
          raise JournalError, rc
        end

        Journal.read_and_free_outstr(out_ptr.read_pointer)
      end

The code assumes that the string returned through out_ptr was allocated with malloc() and must be freed by the caller. With jemalloc preloaded, freeing it through LibC.free may not work as expected.

HolgerHees commented 8 months ago

Would it make sense to open a bug report on the ledbettj/systemd-journal project?

LHCGreg commented 4 months ago

I got this issue as well when updating from 1.16.3 to 1.17.0. I'm rolling back to 1.16.3 instead of disabling jemalloc because it sounds like a memory bug that's probably still there, it's just that it crashes under jemalloc and not the stock malloc. It would be great if someone familiar with the code could open an issue in systemd-journal if that's where the problem is.

kenhys commented 4 months ago

I tried a more recent version of jemalloc to investigate this SEGV.

The problem still reproduces.

docker run --rm -v /var/log/journal:/var/log/journal fluent-systemd
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
2024-07-22 02:26:06 +0000 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil
2024-07-22 02:26:06 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/fluent.conf"
2024-07-22 02:26:06 +0000 [info]: gem 'fluentd' version '1.17.0'
2024-07-22 02:26:06 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.5'
2024-07-22 02:26:06 +0000 [info]: using configuration file: <ROOT>
  <source>
    @type systemd
    tag "systemd"
    path "/var/log/journal"
    matches [{"PRIORITY":[0,1,2,3,4,5,6]}]
    <storage>
      @type "local"
      persistent false
      path "systemd.pos"
    </storage>
    <entry>
      fields_strip_underscores true
      fields_lowercase true
    </entry>
  </source>
</ROOT>
2024-07-22 02:26:06 +0000 [info]: starting fluentd-1.17.0 pid=2 ruby="3.2.4"
2024-07-22 02:26:06 +0000 [info]: spawn command to main:  cmdline=["/usr/local/bin/ruby", "-Eascii-8bit:ascii-8bit", "/usr/local/bundle/bin/fluentd", "--config", "/fluentd/etc/fluent.conf", "--plugin", "/fluentd/plugins", "--under-supervisor"]
2024-07-22 02:26:07 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil
2024-07-22 02:26:07 +0000 [info]: adding source type="systemd"
2024-07-22 02:26:07 +0000 [info]: #0 starting fluentd worker pid=11 ppid=2 worker=0
2024-07-22 02:26:07 +0000 [info]: #0 fluentd worker is now running worker=0
2024-07-22 02:26:08 +0000 [warn]: #0 no patterns matched tag="systemd"
free(): invalid pointer
2024-07-22 02:26:08 +0000 [error]: Worker 0 exited unexpectedly with signal SIGABRT

ashie commented 2 months ago

I think I found the cause.

Mär 23 09:14:12 marvin fluentd[1519]: c:0013 p:---- s:0064 e:000063 CFUNC  :free
Mär 23 09:14:12 marvin fluentd[1519]: c:0012 p:0012 s:0059 e:000058 METHOD /usr/local/lib/ruby/gems/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal.rb:325
Mär 23 09:14:12 marvin fluentd[1519]: c:0011 p:0042 s:0053 e:000052 METHOD /usr/local/lib/ruby/gems/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal/navigable.rb:13
Mär 23 09:14:12 marvin fluentd[1519]: c:0010 p:0018 s:0047 e:000044 METHOD /usr/local/lib/ruby/gems/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:151

The systemd-journal gem calls libc's free() on an FFI::Pointer: https://github.com/ledbettj/systemd-journal/blob/f3365c1147baeed2032b9c0ae223905d57216ce1/lib/systemd/journal.rb#L320-L327

    # some sd_journal_* functions return strings that we're expected to free
    # ourselves. This function copies the string from a char* to a ruby string,
    # frees the char*, and returns the ruby string.
    def self.read_and_free_outstr(ptr)
      str = ptr.read_string
      LibC.free(ptr)
      str
    end

When jemalloc is used, the malloc() and free() families are replaced with jemalloc's implementations, so calling libc's free() directly on such memory is inappropriate.

There was a pull request that fixes this issue: https://github.com/ledbettj/systemd-journal/pull/62

It looks good, and the gem author also seemed positive about the patch. However, the patch author closed it without merging and without giving a reason. We should probably revive it.
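
The mismatch described above can be sketched with Ruby's bundled Fiddle library instead of the ffi gem that systemd-journal uses, so this is an illustration under assumptions, not the gem's code and not the patch from the pull request. The idea is to allocate and free through the same symbol lookup: `Fiddle::Handle::DEFAULT` resolves `strdup` and `free` in the process-wide search order, which is where a preloaded allocator takes precedence over libc. Here `strdup` stands in for a C function that returns a heap-allocated string, and the sample value is made up:

```ruby
require "fiddle"

# Resolve both symbols through the default (process-wide) lookup order, so
# allocation and release go through the same allocator.
# Assumption: Fiddle is used for illustration; systemd-journal uses ffi.
STRDUP = Fiddle::Function.new(
  Fiddle::Handle::DEFAULT["strdup"], [Fiddle::TYPE_VOIDP], Fiddle::TYPE_VOIDP
)
FREE = Fiddle::Function.new(
  Fiddle::Handle::DEFAULT["free"], [Fiddle::TYPE_VOIDP], Fiddle::TYPE_VOID
)

# Copies a NUL-terminated C string into a Ruby String, then releases the
# C buffer with the free() that matches the allocator in use.
def read_and_free_outstr(ptr)
  str = ptr.to_s
  FREE.call(ptr)
  str
end

c_string = STRDUP.call("made-up-cursor-value")  # heap copy owned by the caller
puts read_and_free_outstr(c_string)
```

By contrast, binding `free` from a handle opened on libc itself would bypass a preloaded allocator, which is the situation the backtrace points at.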

kenhys commented 2 months ago

It seems that it will not crash anymore.

Test case: changed to use ptr.free(ptr) in systemd-journal

```
docker run --rm -v /var/log/journal:/var/log/journal -v ./fluentd/etc:/fluentd/etc test-systemd -c /fluentd/etc/fluent.conf
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
fluentd -c /fluentd/etc/fluent.conf
2024-08-30 06:52:33 +0000 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil
2024-08-30 06:52:33 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/fluent.conf"
2024-08-30 06:52:33 +0000 [info]: gem 'fluentd' version '1.17.1'
2024-08-30 06:52:33 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.5'
2024-08-30 06:52:33 +0000 [info]: using configuration file: <ROOT>
  <source>
    @type systemd
    tag "systemd"
    path "/var/log/journal"
    matches [{"PRIORITY":[0,1,2,3,4,5,6]}]
    <storage>
      @type "local"
      persistent false
      path "/tmp/systemd.pos"
    </storage>
    <entry>
      fields_strip_underscores true
      fields_lowercase true
    </entry>
  </source>
</ROOT>
2024-08-30 06:52:33 +0000 [info]: starting fluentd-1.17.1 pid=2 ruby="3.2.5"
2024-08-30 06:52:33 +0000 [info]: spawn command to main: cmdline=["/usr/local/bin/ruby", "-Eascii-8bit:ascii-8bit", "/usr/local/bundle/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "--plugin", "/fluentd/plugins", "--under-supervisor"]
2024-08-30 06:52:34 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil
2024-08-30 06:52:34 +0000 [info]: adding source type="systemd"
2024-08-30 06:52:34 +0000 [info]: #0 starting fluentd worker pid=11 ppid=2 worker=0
2024-08-30 06:52:34 +0000 [info]: #0 fluentd worker is now running worker=0
```

kenhys commented 2 months ago

I've created a PR for upstream.

https://github.com/ledbettj/systemd-journal/pull/96

kenhys commented 2 months ago

Checking https://github.com/ledbettj/systemd-journal/pull/97, an alternative implementation. However, it can't be loaded yet.

irb(main):001:0> require "systemd/journal/shim"
<internal:/usr/local/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:86:in `require': cannot load such file -- systemd/journal/shim (LoadError)        

The .so is installed under:

/usr/local/bundle/gems/systemd-journal-1.4.2.1/lib/shim/shim.so
/usr/local/bundle/extensions/x86_64-linux/3.2.0/systemd-journal-1.4.2.1/shim/shim.so

Requiring shim/shim succeeds instead:

require "shim/shim"
=> true
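
To see which require paths an installed extension can satisfy, a simplified lookup can be sketched in Ruby. It only walks a list of directories for `<feature>.rb` or `<feature>.<DLEXT>`, which is a reduced model of what `require` does (RubyGems activation is ignored). `find_feature` is a hypothetical helper, and the temporary directory only mimics the `shim/shim` layout listed above:

```ruby
require "rbconfig"
require "tmpdir"
require "fileutils"

# Returns the first file that would satisfy `require feature` in the given
# directories, or nil. Hypothetical helper; a simplified model of require.
def find_feature(feature, load_path = $LOAD_PATH)
  extensions = [".rb", ".#{RbConfig::CONFIG['DLEXT']}"]
  load_path.each do |dir|
    extensions.each do |ext|
      candidate = File.join(dir, feature + ext)
      return candidate if File.file?(candidate)
    end
  end
  nil
end

Dir.mktmpdir do |dir|
  # Mimic an extension installed as shim/shim.<DLEXT>.
  FileUtils.mkdir_p(File.join(dir, "shim"))
  FileUtils.touch(File.join(dir, "shim", "shim.#{RbConfig::CONFIG['DLEXT']}"))

  p find_feature("shim/shim", [dir])             # path inside the temp dir
  p find_feature("systemd/journal/shim", [dir])  # nothing at this path
end
```

In this model the second lookup finds nothing because no file exists at `systemd/journal/shim.<DLEXT>`, which matches the LoadError shown above.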

kenhys commented 2 months ago

It should be:

diff --git a/ext/shim/extconf.rb b/ext/shim/extconf.rb
index 94abd76..a53b749 100644
--- a/ext/shim/extconf.rb
+++ b/ext/shim/extconf.rb
@@ -7,4 +7,4 @@ require "mkmf"
 # selectively, or entirely remove this flag.
 append_cflags("-fvisibility=hidden")

-create_makefile("shim/shim")
+create_makefile("systemd/journal/shim")

kenhys commented 2 months ago

Observing changes...

diff --git a/v1.17/debian/Dockerfile b/v1.17/debian/Dockerfile
index 4a245d1..43849c6 100644
--- a/v1.17/debian/Dockerfile
+++ b/v1.17/debian/Dockerfile
@@ -6,6 +6,8 @@ LABEL maintainer "Fluentd developers <fluentd@googlegroups.com>"
 LABEL Description="Fluentd docker image" Vendor="Fluent Organization" Version="1.17.1"
 ENV TINI_VERSION=0.18.0

+COPY systemd-journal-1.4.2.1.gem /fluentd/
+
 # Do not split this into multiple RUN!
 # Docker creates a layer for every RUN-Statement
 # therefore an 'apt-get purge' has no effect
@@ -24,6 +26,10 @@ RUN apt-get update \
  && gem install async -v 1.32.1 \
  && gem install async-http -v 0.64.2 \
  && gem install fluentd -v 1.17.1 \
+ && gem install ffi \
+ && gem install --local /fluentd/systemd-journal-1.4.2.1.gem \
+ && gem install fluent-plugin-systemd \
+ && gem install fluent-plugin-watch-objectspace \
  && dpkgArch="$(dpkg --print-architecture | awk -F- '{ print $NF }')" \
  && wget -O /usr/local/bin/tini "https://github.com/krallin/tini/releases/download/v$TINI_VERSION/tini-$dpkgArch" \
  && wget -O /usr/local/bin/tini.asc "https://github.com/krallin/tini/releases/download/v$TINI_VERSION/tini-$dpkgArch.asc" \
@@ -53,7 +59,6 @@ RUN groupadd -r fluent && useradd -r -g fluent fluent \
     && mkdir -p /fluentd/etc /fluentd/plugins \
     && chown -R fluent /fluentd && chgrp -R fluent /fluentd

-
 COPY fluent.conf /fluentd/etc/
 COPY entrypoint.sh /bin/

The threshold is a bit strict (x1.1), so error notifications were fired. But after a while, the garbage seems to be collected.
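
The x1.1 check can be reproduced with a few lines of Ruby. This is a sketch of the comparison as it appears in the error messages of the run without jemalloc, not the plugin's implementation. That the baseline is the first sample (9653473 at 06:06:38) is inferred from that log, and `over_threshold?` is a hypothetical name:

```ruby
# Assumption: an alert fires once the current memsize_of_all exceeds
# baseline * rate, mirroring "current > baseline * 1.100000" in the log.
def over_threshold?(current, baseline, rate = 1.1)
  current > baseline * rate
end

baseline = 9_653_473   # memsize_of_all of the first sample (06:06:38)

puts over_threshold?(10_563_821, baseline)  # 06:33:38 sample
puts over_threshold?(10_626_423, baseline)  # 06:34:38 sample
puts over_threshold?(6_806_085, baseline)   # 06:55:38 sample, after GC
```

With these values only the 06:34:38 sample exceeds 9653473 x 1.1 (about 10618820), which is where the error notifications start, and the 06:55:38 sample is back under the limit.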

Running docker image with: docker run --rm -v /var/log/journal:/var/log/journal -v ./fluent.conf:/fluentd/etc/fluent.conf test-systemd

Checking objectspace with modified version of systemd-journal (jemalloc)

Configure fluent.conf using objectspace:

```
2024-09-02 04:32:14 +0000 [info]: gem 'fluentd' version '1.17.1'
2024-09-02 04:32:14 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.5'
2024-09-02 04:32:14 +0000 [info]: gem 'fluent-plugin-watch-objectspace' version '0.2.2'
2024-09-02 04:32:14 +0000 [warn]: define to capture fluentd logs in top level is deprecated. Use
```

Run with docker run --rm -e LD_PRELOAD="" -v /var/log/journal:/var/log/journal -v ./fluent.conf:/fluentd/etc/fluent.conf test-systemd

Checking objectspace with modified version of systemd-journal (without jemalloc)

```
2024-09-02 06:05:38.647727965 +0000 fluent.info: {"pid":11,"ppid":2,"worker":0,"message":"starting fluentd worker pid=11 ppid=2 worker=0"}
2024-09-02 06:05:38.648081849 +0000 fluent.info: {"worker":0,"message":"fluentd worker is now running worker=0"}
2024-09-02 06:06:38.809338283 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9653473,"virt":334656,"res":49268,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.38"}
2024-09-02 06:07:38.806110296 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":8662929,"virt":334656,"res":49268,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.41"}
2024-09-02 06:08:38.805546608 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":8735651,"virt":334656,"res":49268,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.42"}
2024-09-02 06:09:38.805812401 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":8808373,"virt":334656,"res":49268,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.44"}
2024-09-02 06:10:38.805683115 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":8890975,"virt":334656,"res":49268,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.45"}
2024-09-02 06:11:38.805102159 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":8953817,"virt":334656,"res":49268,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.48"}
2024-09-02 06:12:38.807603403 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9036699,"virt":334656,"res":49268,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.50"}
2024-09-02 06:13:38.805742054 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9099261,"virt":334656,"res":49268,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.52"}
2024-09-02 06:14:38.805160045 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9171983,"virt":334656,"res":49268,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.53"}
2024-09-02 06:15:38.806271937 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9244705,"virt":334656,"res":49268,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.55"}
2024-09-02 06:16:38.806353625 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9317427,"virt":334656,"res":49396,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.57"}
2024-09-02 06:17:38.805089419 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9391765,"virt":334656,"res":49396,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.59"}
2024-09-02 06:18:38.805888938 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9462871,"virt":334656,"res":49396,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.60"}
2024-09-02 06:19:38.805785906 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9535593,"virt":334656,"res":49396,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.62"}
2024-09-02 06:20:38.805776461 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9618475,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.64"}
2024-09-02 06:21:38.804881057 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9681037,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.65"}
2024-09-02 06:22:38.805483038 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9753759,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.67"}
2024-09-02 06:23:38.805394790 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9836641,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.69"}
2024-09-02 06:24:38.805810557 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9899203,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.70"}
2024-09-02 06:25:38.805724709 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":9982085,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.73"}
2024-09-02 06:26:38.805056471 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":10044647,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.74"}
2024-09-02 06:27:38.806097756 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":10117369,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.76"}
2024-09-02 06:28:38.805594177 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":10191707,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.78"}
2024-09-02 06:29:38.805204034 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":10262813,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.79"}
2024-09-02 06:30:38.805725320 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":10335535,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.81"}
2024-09-02 06:31:38.805358220 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":10418417,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.82"}
2024-09-02 06:32:38.805569991 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":10482595,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.84"}
2024-09-02 06:33:38.806291716 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":10563821,"virt":334656,"res":49524,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:00.87"}
2024-09-02 06:34:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 10626423.000000 > 9653473.000000 * 1.100000
2024-09-02 06:34:38.805773365 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 10626423.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:35:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 10712385.000000 > 9653473.000000 * 1.100000
2024-09-02 06:35:38.805662983 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 10712385.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:36:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 10777331.000000 > 9653473.000000 * 1.100000
2024-09-02 06:36:38.805643655 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 10777331.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:37:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 10852437.000000 > 9653473.000000 * 1.100000
2024-09-02 06:37:38.805948632 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 10852437.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:38:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 10927543.000000 > 9653473.000000 * 1.100000
2024-09-02 06:38:38.805547444 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 10927543.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:39:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11002649.000000 > 9653473.000000 * 1.100000
2024-09-02 06:39:38.805612780 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11002649.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:40:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11079371.000000 > 9653473.000000 * 1.100000
2024-09-02 06:40:38.805845787 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11079371.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:41:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11163021.000000 > 9653473.000000 * 1.100000
2024-09-02 06:41:38.805946544 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11163021.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:42:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11227967.000000 > 9653473.000000 * 1.100000
2024-09-02 06:42:38.805843220 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11227967.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:43:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11303073.000000 > 9653473.000000 * 1.100000
2024-09-02 06:43:38.805488662 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11303073.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:44:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11378179.000000 > 9653473.000000 * 1.100000
2024-09-02 06:44:38.806177493 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11378179.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:45:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11453285.000000 > 9653473.000000 * 1.100000
2024-09-02 06:45:38.805908487 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11453285.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:46:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11528391.000000 > 9653473.000000 * 1.100000
2024-09-02 06:46:38.806418155 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11528391.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:47:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11603497.000000 > 9653473.000000 * 1.100000
2024-09-02 06:47:38.805709270 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11603497.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:48:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11678603.000000 > 9653473.000000 * 1.100000
2024-09-02 06:48:38.807692770 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11678603.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:49:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11755325.000000 > 9653473.000000 * 1.100000
2024-09-02 06:49:38.806065946 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11755325.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:50:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11830431.000000 > 9653473.000000 * 1.100000
2024-09-02 06:50:38.806058050 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11830431.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:51:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11914041.000000 > 9653473.000000 * 1.100000
2024-09-02 06:51:38.807143765 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11914041.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:52:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11979027.000000 > 9653473.000000 * 1.100000
2024-09-02 06:52:38.806892087 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 11979027.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:53:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 12054133.000000 > 9653473.000000 * 1.100000
2024-09-02 06:53:38.806080154 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 12054133.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:54:38 +0000 [error]: #0 Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 12129239.000000 > 9653473.000000 * 1.100000
2024-09-02 06:54:38.810522738 +0000 fluent.error: {"message":"Memory usage is over than expected, threshold of memsize_of_all rate <1.100000>: 12129239.000000 > 9653473.000000 * 1.100000"}
2024-09-02 06:55:38.805600538 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":6806085,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.28"}
2024-09-02 06:56:38.807318261 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":6878807,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.29"}
2024-09-02 06:57:38.805900391 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":6951529,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.31"}
2024-09-02 06:58:38.805752906 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7024251,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.33"}
2024-09-02 06:59:38.805685477 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7096973,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.34"}
2024-09-02 07:00:38.805469318 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7171311,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.36"}
2024-09-02 07:01:38.805560007 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7242417,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.38"}
2024-09-02 07:02:38.806171352 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7315139,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.40"}
2024-09-02 07:03:38.805570770 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7389477,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.42"}
2024-09-02 07:04:38.804820767 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7460583,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.43"}
2024-09-02 07:05:38.805782277 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7533305,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.45"}
2024-09-02 07:06:38.805476434 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7616187,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.47"}
2024-09-02 07:07:38.805193123 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7678749,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.49"}
2024-09-02 07:08:38.804718945 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7753087,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.50"}
2024-09-02 07:09:38.806188525 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7825809,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.52"}
2024-09-02 07:10:38.807030799 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7896915,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.53"}
2024-09-02 07:11:38.805665628 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":7971253,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.55"}
2024-09-02 07:12:38.805206317 +0000 watch_objectspace: 
{"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":8042359,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.57"} 2024-09-02 07:13:38.805196288 +0000 watch_objectspace: {"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,"memsize_of_all":8115081,"virt":334784,"res":50036,"shr":13840,"%cpu":0.0,"%mem":0.1,"time+":"0:01.59"}
```

fix-with-jemalloc.log fix-without-jemalloc.log

kenhys commented 2 months ago

Observed memory consumption with/without jemalloc (with fix)

image

NOTE: The number of systemd events processed varies between runs, so strictly speaking the two results are not directly comparable. I only wanted to check for leaks.
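For anyone who wants to look for the same tendency in logs like the ones attached above, here is a minimal Python sketch. It is not part of the original investigation. It assumes the `watch_objectspace` records have the JSON shape shown in the log excerpt, and the helper names (`memsize_series`, `average_growth`, `over_threshold`) are made up for illustration. The threshold check only mirrors the comparison printed in the "Memory usage is over than expected" message; what the baseline value represents is not modelled here.

```python
import json
import re

# Start of each watch_objectspace record. \s* also absorbs a line break,
# so this works on logs that have been flattened or re-wrapped.
RECORD_START = re.compile(r"watch_objectspace:\s*")
DECODER = json.JSONDecoder()


def memsize_series(text):
    """Return memsize_of_all from every watch_objectspace record, in log order."""
    values = []
    for match in RECORD_START.finditer(text):
        # raw_decode parses one JSON object starting at the given index
        # and ignores whatever follows it.
        record, _end = DECODER.raw_decode(text, match.end())
        values.append(record["memsize_of_all"])
    return values


def average_growth(values):
    """Average change in memsize_of_all between consecutive samples."""
    if len(values) < 2:
        return 0.0
    return (values[-1] - values[0]) / (len(values) - 1)


def over_threshold(current, baseline, rate=1.1):
    """Same comparison as in the error message: current > baseline * rate."""
    return current > baseline * rate


# Two consecutive records copied from the log excerpt above.
sample = (
    '2024-09-02 07:12:38.805206317 +0000 watch_objectspace: '
    '{"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,'
    '"memsize_of_all":8042359,"virt":334784,"res":50036,"shr":13840,'
    '"%cpu":0.0,"%mem":0.1,"time+":"0:01.57"} '
    '2024-09-02 07:13:38.805196288 +0000 watch_objectspace: '
    '{"pid":11,"count":{"systemd::journal":1},"memory_leaks":false,'
    '"memsize_of_all":8115081,"virt":334784,"res":50036,"shr":13840,'
    '"%cpu":0.0,"%mem":0.1,"time+":"0:01.59"}'
)

series = memsize_series(sample)
print(series)                              # [8042359, 8115081]
print(average_growth(series))              # 72722.0
print(over_threshold(11303073, 9653473))   # True
```

Running `memsize_series` over each attached log and comparing the growth per sample gives a rough number to put next to the graphs. It says nothing about resident memory, which would have to be read from the `res` field instead.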

kenhys commented 2 months ago

I've checked again with the fixed version (systemd-journal 2.0.0) integrated into the test container, with and without jemalloc.

The attached logs show the same tendency.

fix-with-jemalloc-2.0.0.log.gz fix-without-jemalloc-2.0.0.log.gz

So, it was resolved in systemd-journal 2.0.0.

kenhys commented 2 months ago

I've sent a pull request to adopt systemd-journal 2.0.0: https://github.com/fluent-plugins-nursery/fluent-plugin-systemd/pull/111

kenhys commented 2 months ago

This issue was fixed in fluent-plugin-systemd 1.1.0 (which uses systemd-journal 2.0.0).

Please use fluent-plugin-systemd 1.1.0.

https://github.com/fluent-plugins-nursery/fluent-plugin-systemd/releases/tag/v1.1.0