bloomberg / memray

Memray is a memory profiler for Python
https://bloomberg.github.io/memray/
Apache License 2.0

Running gunicorn under memray leaks memory #683

Closed: M1ha-Shvn closed this issue 1 week ago

M1ha-Shvn commented 1 week ago

Is there an existing issue for this?

Current Behavior

I run a Django (3.2.11) project using gunicorn (v23.0.0) over WSGI on Python 3.8.12. The versions cannot easily be updated because of incompatible dependencies.

Sometimes (rarely, about once a week) something abruptly consumes all of the server's memory and the server runs out of memory. It happens very fast: about a minute passes between the start of the memory drop and the server failure. My task is to debug what it is. My guess is that some Django view makes a database request and unexpectedly loads a lot of data for some reason. I can't reproduce it anywhere except on production servers with heavy traffic.

What I've done is wrap the gunicorn invocation with memray in my shell launch script, like so:

# I added the date so the process doesn't fail after a worker reboot with "file already exists"
DATE=$(date +"%Y%m%d%H%M")
# --follow-fork makes memray also trace the forked gunicorn worker processes
exec memray3.8 run --follow-fork --output /tmp/memray_capture_$DATE.bin \
    /home/ubuntu2/venv/bin/gunicorn cqapi.wsgi:application \
    --pid /home/ubuntu2/gunicorn_main_memray.pid \
    --worker-class=sync \
    --workers=13 \
    --threads=1 \
    --log-level=info \
    --keep-alive=20 \
    --max-requests=200000 \
    --max-requests-jitter=10000 \
    --preload \
    --reuse-port \
    --timeout=180 \
    --graceful-timeout=8 \
    --statsd-host=172.16.0.10:8125 \
    --statsd-prefix=ru \
    --bind=unix:/tmp/gunicorn_main_memray.sock

It ran successfully and started generating capture files. But it also consumed a lot of memory within just an hour of running:

[image: server memory usage graph]

P.S. The subsequent memory decrease after memray was disabled is not connected with memray; it was just the low-traffic start of the day when I started the experiments. You can see that the memory decreases more slowly than it grew while memray was active.

So I guessed that memray leaks memory somewhere in this mode and decided to open an issue here.

Expected Behavior

I expect the process to run under memray for a long time and, after a memory failure, to be able to analyze the results in the /tmp/memray_capture_<date>.bin files without memory failures caused by memray itself.
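
For reference, analyzing one of those capture files afterwards would look roughly like this (the <date> filename is a placeholder matching the script above):

# Generate an HTML flamegraph report from a finished capture file
memray3.8 flamegraph /tmp/memray_capture_<date>.bin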

Steps To Reproduce

  1. Run the Django app under memray as described above
  2. Monitor memory usage

Memray Version

1.14.0

Python Version

3.8

Operating System

Linux

Anything else?

No response

pablogsal commented 1 week ago

We have done some investigation into this and we are pretty sure we are not leaking memory (we have checked with valgrind, with several memory profilers, and by looking directly into /proc/PID/maps). By monitoring /proc/meminfo we have confirmed that the memory that is growing is kernel caches: all of the extra memory growth reported by free is sitting in "Cached" and "Inactive" according to meminfo:

Cached:          5095692 kB  |  Cached:         10415548 kB
Inactive:        5242180 kB  |  Inactive:       10582984 kB
Inactive(file):  4068928 kB  |  Inactive(file):  9388420 kB
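
A minimal sketch of that kind of monitoring, assuming a standard Linux /proc layout:

# Sample the page-cache counters next to overall memory usage once a minute;
# growing Cached/Inactive(file) alongside stable process RSS points at the
# kernel page cache rather than a leak in the profiled process.
watch -n 60 'grep -E "^(Cached|Inactive|Inactive\(file\)):" /proc/meminfo; free -m'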
pablogsal commented 1 week ago

Also, note that running echo 3 | sudo tee /proc/sys/vm/drop_caches clears all of that right up.
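
A quick before/after check of that, as a sketch:

# Before: free reports high usage, mostly sitting in buff/cache
free -m
# Flush dirty filesystem buffers, then drop the page cache, dentries and inodes
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches
# After: the cached figures should drop back down
free -m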