Memory leak when logging huge messages from docker

We run fluentd in a Docker container to collect logs from other containers via the gelf log driver. We use Docker 17.03.1-ce on RancherOS v1.0.2-rc3. If a container generates many long lines, fluentd starts to use a huge amount of memory. We can easily reproduce this with a custom Docker image.

fluent.conf

<source>
  @type gelf
  tag docker.local
  bind 0.0.0.0
  port 12000
</source>

<match **>
  @type null
</match>

Dockerfile

FROM fluent/fluentd:v0.12.40-debian

RUN gem install fluent-plugin-input-gelf
COPY fluent.conf /fluentd/etc/fluent.conf

Start the fluentd container:

docker build -t fluent-test . && docker run -it --net host fluent-test

Start another container to generate logs:

docker run -it --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12000 debian \
   bash -c 'cat /dev/urandom | base64 -w 0 |head -c 100000 > /tmp/log; for i in {1..10000}; do cat /tmp/log; done' > /dev/null

After that, fluentd uses over 1 GB of RAM and never releases it.

With the following command, however, memory usage stays at only about 90 MB (the container generates 10 times more messages, each 10 times smaller):

docker run  --rm -it --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12000 debian \
   bash -c 'cat /dev/urandom | base64 -w 0 |head -c 10000 > /tmp/log; for i in {1..100000}; do cat /tmp/log; done' > /dev/null
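As a quick sanity check on the comparison above, the two runs send the same total number of characters, so the only variable that changes is the line length. A minimal sketch of that arithmetic (the variable names are mine, the numbers come from the two commands):

```shell
# Total characters emitted by each run: line length * number of repetitions.
large_lines_total=$(( 100000 * 10000 ))   # first command: 100000-char lines, 10000 times
small_lines_total=$(( 10000 * 100000 ))   # second command: 10000-char lines, 100000 times

echo "large lines: ${large_lines_total} chars"
echo "small lines: ${small_lines_total} chars"
```

Both totals come to 1000000000 characters, roughly 1 GB of log data per run.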

I experimented with line length and could not reproduce the problem with lines shorter than 16000 characters (maybe it is related to Docker's 16 KB buffer length?). I also ran the same tests with the fluentd log driver and the forward input, but memory usage always returned to 50-80 MB, so I believe the problem is in the gelf input. I also tried version 1.14, the RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=0.9 environment variable, and sending the SIGUSR1 signal, all without any effect.
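For anyone who wants to narrow down the threshold, here is a rough sketch of how the line-length experiment could be scripted. The helper name `repro_cmd` and the fixed total of 1000000000 characters are my own assumptions for illustration; the docker flags are the ones used in the commands above. The function only prints the command for a given line length, so it can be reviewed before running anything.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print (do not run) the reproduction command for a
# given line length, keeping the total output at about 1 GB so that only
# the line length varies between runs.
repro_cmd() {
  local length="$1"
  local total=1000000000              # assumed total, matching the runs above
  local count=$(( total / length ))   # repetitions needed for that total
  printf "docker run --rm -it --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12000 debian bash -c 'cat /dev/urandom | base64 -w 0 | head -c %d > /tmp/log; for i in {1..%d}; do cat /tmp/log; done' > /dev/null\n" "$length" "$count"
}
```

Something like `for n in 8000 12000 16000 20000 32000; do repro_cmd "$n"; done` prints one command per length. After running each one, fluentd's memory can be checked with, for example, `docker stats --no-stream`.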

MerlinDMC / fluent-plugin-input-gelf, issue #5