eht16 / python-logstash-async

Python logging handler for sending log events asynchronously to Logstash.
MIT License

Memory leak when logstash is not available #63

Closed: xunto closed this issue 10 months ago

xunto commented 3 years ago

Hello. My team recently faced a problem with python-logstash-async. When the Logstash server went into maintenance, the memory usage of one of our services started to grow rapidly.

It turns out this is caused by python-logstash-async's feature of storing failed events in an SQLite database. Even though everything is written to the filesystem, the memory usage is not even comparable to the database size (memory usage is much bigger).
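(For context, a handler set up roughly like the sketch below uses database_path as the on-disk cache for unsent events; the host, port and path are placeholders for illustration, not values from this report.)

```python
import logging

from logstash_async.handler import AsynchronousLogstashHandler

# Events that cannot be shipped to Logstash are cached in the SQLite
# database at database_path and retried later.
# Host, port and path are placeholder values.
handler = AsynchronousLogstashHandler(
    'logstash.example.com',        # placeholder host
    5959,                          # placeholder port
    database_path='logstash_events.db',
)

logger = logging.getLogger('demo')
logger.addHandler(handler)
logger.error('cached on disk if Logstash is unreachable')
```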

To demonstrate that I created this little Django project: https://github.com/xunto-examples/async_logstash_memory_leak

I run start_logstash.sh and start_no_logstash.sh to start the service, and tests.sh to start sending requests.

xunto commented 3 years ago

Test results when Logstash is not available:

[memory usage graph: logstash]

The database size was 71.7 MB at the end of testing.

xunto commented 3 years ago

Test result for the same project but without python-logstash-async enabled:

[memory usage graph: no_logstash]

xunto commented 3 years ago

I can also confirm that the leak doesn't happen anymore if the event_ttl parameter is set.
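(For reference, a minimal sketch of that workaround via the event_ttl constructor parameter; the TTL value below is illustrative.)

```python
from logstash_async.handler import AsynchronousLogstashHandler

# event_ttl expires cached events older than the given number of seconds,
# bounding the size of the failed-event cache. 600 is an arbitrary example.
handler = AsynchronousLogstashHandler(
    'logstash.example.com',        # placeholder host
    5959,                          # placeholder port
    database_path='logstash_events.db',
    event_ttl=600,                 # drop cached events older than 10 minutes
)
```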

eht16 commented 3 years ago

Thanks for the detailed report! I'm currently quite limited on time, so it might take some time to look into it.

xunto commented 3 years ago

It's OK, we are going to stick with setting event_ttl for now anyway.

eht16 commented 3 years ago

I had a look at it, but unfortunately I'm not able to reproduce it, not even with your great test setup.

[memory usage graph: pla_py37_memory_issue63]

As you can see, I had the test setup running for hours, continuously firing requests at it, but the memory consumption stayed at the same level.

What versions of python-logstash-async, Django and Python are you using? I used the following versions:

(venv) enrico@localhost: ~% python --version
Python 3.7.3
(venv) enrico@localhost: ~% pip list
Package               Version
--------------------- ---------
Django                3.2
memory-profiler       0.58.0
python-logstash-async 2.2.0
...
xunto commented 3 years ago

Hi, sorry for the slow response. We used

Django==3.1.7
python-logstash-async==2.2.0

and

Python 3.9.3

I also found out that the REAL reason for the RAM leak was that the internal queue at https://github.com/eht16/python-logstash-async/blob/018d238e3f76b97def3b279b0226d8e326bd3dee/logstash_async/worker.py#L49 was getting big too fast. Not sure if we could do anything about it.

It also turned out that event_ttl didn't really help... so we moved to using stdout and Fluentd instead of your library. :(

eht16 commented 3 years ago

> I also found out that the REAL reason for the RAM leak was that the internal queue at https://github.com/eht16/python-logstash-async/blob/018d238e3f76b97def3b279b0226d8e326bd3dee/logstash_async/worker.py#L49 was getting big too fast. Not sure if we could do anything about it.

Good catch! I was able to reproduce the behaviour by slowing down the writing of events to the SQLite database. The events then pile up in the queue and consume memory.
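(A standalone sketch of this failure mode, not the library's code: an unbounded queue.Queue grows without limit whenever the producer outpaces a slow consumer, e.g. a slow SQLite writer.)

```python
import queue
import threading
import time

q = queue.Queue()  # no maxsize, so the queue is unbounded

def slow_consumer():
    # Simulates persisting one event at a time, too slowly.
    while True:
        q.get()
        time.sleep(0.01)  # stand-in for a slow SQLite write

threading.Thread(target=slow_consumer, daemon=True).start()

# The producer enqueues far faster than the consumer drains,
# so qsize() (and memory usage) keeps climbing.
for i in range(100_000):
    q.put(f'event {i}')
    if i % 20_000 == 0:
        print(f'events waiting in queue: {q.qsize()}')
```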

Is it possible that the storage you used was too slow to keep up with the high event rate? If so, that would explain the high memory usage.

> It also turned out that event_ttl didn't really help... so we moved to using stdout and Fluentd instead of your library. :(

Sad to hear. Sorry for not being able to help you in time.

If you still want to try it, you could set the database_path setting to None to use an in-memory database instead of filesystem-based storage. If the above assumption is true, this could help.
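(A minimal sketch of that suggestion; host and port are placeholders.)

```python
from logstash_async.handler import AsynchronousLogstashHandler

# database_path=None keeps the failed-event cache in an in-memory SQLite
# database, avoiding slow disk writes, at the cost of losing cached
# events when the process exits.
handler = AsynchronousLogstashHandler(
    'logstash.example.com',  # placeholder host
    5959,                    # placeholder port
    database_path=None,      # in-memory cache instead of a file
)
```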

eht16 commented 10 months ago

Closing due to missing response. Feel free to re-open if the problem persists.