Who memory profiles the memory profilers?
Well, to my immense surprise, I can reproduce this. I'll need to spend some time figuring out what is going on. Our working set size shouldn't be nearly this large, so clearly more is being kept alive from one test to the next than we intended.
Some initial notes:

/proc/$pid/maps shows... a lot of mappings. We've got many copies of every shared library loaded in memory, presumably from either the unwinding or the symbolifying that we need to do to support native traces.

So, yep, real issue. We're leaking the state associated with libbacktrace's unwinding for each test case. We have to leak it; the API of libbacktrace is designed so that's the only option, surprisingly: https://github.com/bloomberg/memray/blob/98032ae8eb23c16e8fbc3067efbd09a8b07b36ba/src/vendor/libbacktrace/backtrace.h#L81-L85
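For reference, this is the shape of the API in question (a paraphrase of the vendored header linked above, not new code): the state returned by `backtrace_create_state()` has no corresponding destroy function, so every state we create stays alive until the process exits.

```cpp
// Paraphrased from the vendored backtrace.h linked above: a creation
// function exists, but libbacktrace exposes no way to destroy the state.
extern "C" {

typedef void (*backtrace_error_callback)(void* data, const char* msg, int errnum);

// Allocates state that is valid for the rest of the process's lifetime.
// There is no backtrace_destroy_state(); once created, it can only leak.
struct backtrace_state* backtrace_create_state(const char* filename,
                                               int threaded,
                                               backtrace_error_callback error_callback,
                                               void* data);

}  // extern "C"
```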
But we ought to be reusing it from one test case to the next, instead of leaking one per test and creating a new one for the next test. This has been wrong for a long time, but we never noticed, because only `pytest-memray` reads many different capture files during the lifetime of the program. It's not possible to trigger this misbehavior through the public Memray APIs or the `memray` CLI entry point, because they only ever operate on a single capture file.
It's easy enough to fix - we just need to move some state to be explicitly global rather than recreating it each time a new capture file is read, and wrap a lock around it. While looking at the surrounding code I spotted some other things that I don't like, though, so I'm gonna try to clean up a few different things here...
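A minimal sketch of that fix, assuming a helper along these lines (names are illustrative, not memray's actual code): the state becomes a process-wide singleton created on first use, guarded by a lock, and every capture file reuses it instead of leaking a fresh one.

```cpp
#include <backtrace.h>  // vendored libbacktrace header

#include <mutex>

namespace {

void
errorCallback(void* /*data*/, const char* msg, int errnum)
{
    // Hypothetical handler; real code would log or surface the failure.
    (void)msg;
    (void)errnum;
}

}  // namespace

// Return the single process-wide backtrace_state, creating it on first use.
// Since the state can never be freed, sharing one instance caps the "leak"
// at a single allocation instead of one per capture file.
backtrace_state*
sharedBacktraceState()
{
    static std::mutex stateMutex;
    static backtrace_state* state = nullptr;

    std::lock_guard<std::mutex> guard(stateMutex);
    if (state == nullptr) {
        // threaded=1 asks libbacktrace to do its own internal locking,
        // since multiple readers may unwind/symbolify concurrently.
        state = backtrace_create_state(
                /*filename=*/nullptr, /*threaded=*/1, errorCallback, nullptr);
    }
    return state;
}
```

Readers would then call `sharedBacktraceState()` wherever they previously created a fresh state per capture file.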
> So, yep, real issue

Is it wrong that I breathed a sigh of relief?
Thanks for figuring it out. I'll update and retest when there's a patch.
@tonybaloney the latest release includes a bug fix for this. Hopefully it works this time. 😊
Bug Report
I've recently tried combining flaky and pytest-memray with the leak analysers. On GitHub Actions, the pytest process was being issued SIGKILL by the kernel, which is weird.
The behaviour is that while pytest is running, memory usage slowly escalates (toward 10 GiB, or until the machine runs out of memory).
So I looked and looked at which tests were leaking memory. Yes, there were some. But then it crossed my mind to take my library out of the equation and just test stdlib modules instead.
Yes, I appreciate this is a ridiculous issue to raise. But I think that either memray or flaky is leaking significant memory. I might be totally wrong or doing something silly in these tests and we can all ignore this issue :-)
Current Behavior
Input Code
Running this code with memray (with natives and trace-python-allocators enabled) and forcing flaky to repeat each test for 9-10 runs:
This also uses significant memory, but not the same total amount.
I'm not able to run my entire test suite because it gets killed by the OOM killer.