MerlinDMC / fluent-plugin-input-gelf

A GELF input for Fluentd
MIT License

Memory leak when logging huge messages from docker #5

Closed: stszap closed this issue 6 years ago

stszap commented 6 years ago

We run fluentd in a Docker container to collect logs from other containers via the gelf log driver. We use Docker version 17.03.1-ce on Rancher OS v1.0.2-rc3. If a container generates a lot of long strings, fluentd starts to use a huge amount of memory. We can easily reproduce this with a custom Docker image.

fluent.conf

<source>
  @type gelf
  tag docker.local
  bind 0.0.0.0
  port 12000
</source>

<match **>
  @type null
</match>

Dockerfile

FROM fluent/fluentd:v0.12.40-debian

RUN gem install fluent-plugin-input-gelf
COPY fluent.conf /fluentd/etc/fluent.conf

Start the fluentd container:

docker build -t fluent-test . && docker run -it --net host fluent-test

Start another container to generate logs:

docker run -it --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12000 debian \
   bash -c 'cat /dev/urandom | base64 -w 0 |head -c 100000 > /tmp/log; for i in {1..10000}; do cat /tmp/log; done' > /dev/null

After that, fluentd uses over 1 GB of RAM and never releases it.

With the following command, however, memory usage stays at only about 90 MB (the container generates 10 times as many messages, each 10 times smaller):

docker run  --rm -it --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12000 debian \
   bash -c 'cat /dev/urandom | base64 -w 0 |head -c 10000 > /tmp/log; for i in {1..100000}; do cat /tmp/log; done' > /dev/null

I experimented with the line length and could not reproduce the problem when lines are shorter than 16000 characters (maybe it is related to Docker's buffer length of 16 KB?). I also ran the same tests with the fluentd log driver and the forward input, but memory usage always returned to 50-80 MB, so I believe the problem is in the gelf input. I also tried the 1.14 version, the RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=0.9 environment variable, and sending the SIGUSR1 signal, all without any effect.
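To compare runs like these more precisely, the resident memory of the fluentd process can be sampled while the log generator is running. The following is a minimal Python sketch, assuming a Linux host with /proc and a known fluentd process ID; the function names `parse_vm_rss_kb` and `read_rss_kb` are hypothetical and not part of fluentd or the plugin.

```python
def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     1234 kB".
            return int(line.split()[1])
    return None


def read_rss_kb(pid):
    """Read the current resident set size of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kb(status_file.read())
```

Calling `read_rss_kb(pid)` every few seconds during and after each test shows whether memory returns to its baseline once the generator stops.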

stszap commented 6 years ago

So I discovered a few things:

- Docker sometimes loses gelf messages (I observed about 0.4% packet loss on a local machine, but it may be just my setup)
- It seems that the memory leak is in gelfd2. I made a patch and it helped me. I also created a pull request (the link is above)
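The two observations above may be connected, although this thread does not describe what the patch changes. GELF over UDP splits large messages into chunks (a 2-byte magic `0x1e 0x0f`, an 8-byte message ID, a sequence number, and a sequence count), and the receiver buffers them until all chunks have arrived. If a chunk is lost and incomplete messages are never discarded, that buffer can only grow. The Python sketch below illustrates this mechanism together with a time-based expiry. It is not gelfd2's code: `ChunkReassembler`, `make_chunks`, and the 5-second default are illustrative assumptions.

```python
import time

GELF_CHUNK_MAGIC = b"\x1e\x0f"
HEADER_LEN = 12  # 2 magic + 8 message id + 1 sequence number + 1 sequence count


class ChunkReassembler:
    """Toy reassembler for chunked GELF datagrams (illustrative, not gelfd2)."""

    def __init__(self, max_age_seconds=5.0, clock=time.monotonic):
        self.max_age = max_age_seconds
        self.clock = clock
        # message id -> (first_seen, expected_chunk_count, {seq: payload})
        self.pending = {}

    def feed(self, datagram):
        """Return the full payload once every chunk has arrived, else None."""
        if not datagram.startswith(GELF_CHUNK_MAGIC):
            return datagram  # not chunked: pass through unchanged
        if len(datagram) < HEADER_LEN:
            return None  # malformed chunk header
        message_id = datagram[2:10]
        seq, total = datagram[10], datagram[11]
        if total == 0 or seq >= total:
            return None  # inconsistent header
        entry = self.pending.get(message_id)
        if entry is None:
            entry = (self.clock(), total, {})
            self.pending[message_id] = entry
        _, expected, parts = entry
        if total != expected:
            return None  # disagrees with earlier chunks of the same message
        parts[seq] = datagram[HEADER_LEN:]
        if len(parts) < expected:
            return None  # still waiting; this entry stays in memory
        del self.pending[message_id]
        return b"".join(parts[i] for i in range(expected))

    def expire(self):
        """Drop incomplete messages older than max_age; return how many."""
        now = self.clock()
        stale = [mid for mid, (seen, _, _) in self.pending.items()
                 if now - seen > self.max_age]
        for mid in stale:
            del self.pending[mid]
        return len(stale)


def make_chunks(message_id, payload, chunk_size):
    """Split a payload into GELF-style chunks (helper for experiments)."""
    pieces = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)]
    return [GELF_CHUNK_MAGIC + message_id + bytes([seq, len(pieces)]) + piece
            for seq, piece in enumerate(pieces)]
```

Without a periodic call to `expire()`, every message that loses even one chunk leaves its remaining chunks in `pending` indefinitely. That would be consistent with memory growing only when messages are large enough to be chunked.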

I will close this issue for now.