Sorry, I meant to respond to this sooner. I started to draft a reply, realized I had a lot to say, and then got distracted by the holidays and vacations.
Our initial approach to finding leaks was fairly similar to what you propose, but we couldn't find a way to make it work that the team was happy with and could reach a consensus on. Some of the issues are documented in the warnings for the `limit_leaks` decorator:

- every log record will show up as a leak,
- every allocation made by a fixture and not freed until the fixture is torn down will show up as a leak,
- every line of output to stdout or stderr will show up as a leak,
- things cached by the interpreter itself for later reuse (like freed lists or dicts) can show up as leaks.

Because of those caveats, running existing tests and demanding that they not leak anything is incredibly unlikely to succeed, even with warmup passes.
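To make those caveats concrete, here is a contrived test that leaks nothing in the ordinary sense but could still trip a strict `limit_leaks` threshold; the threshold value, fixture, and cache are made up purely for illustration:

```python
import logging

import pytest

_CACHE = {}  # module-level cache: anything stored here outlives the test


@pytest.fixture
def big_buffer():
    # Allocated by the fixture and only released at fixture teardown, which
    # happens after the leak check runs, so it can be reported as a leak.
    return bytearray(1024 * 1024)


@pytest.mark.limit_leaks("1 KB")  # made-up threshold, for illustration only
def test_full_of_false_positives(big_buffer):
    logging.getLogger(__name__).warning("log records are retained by the capture machinery")
    print("captured stdout is held onto until the test finishes")
    _CACHE["digest"] = bytes(big_buffer[:16])  # cached value also outlives the test
```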
It seems like an approach like the one you're proposing should work, but when we tried it, we found so many unexpected problems that we wound up needing to abandon it. Commit cf05664fb64f0f92dbcf47a8c211531460ded4e4 (tree) was the final version we tried with that approach. You can check it out and play with it yourself, either to try it out or to improve upon it; that commit's documentation describes how to use the marker as it existed at that point.
That approach allowed a configurable number of warmup runs of a test function, followed by a real run actually measured by Memray. When we tried using that marker with a real, existing test suite, it reported far too many false positive leaks to be of any practical use, even with code like this file that tried to coerce CPython into dropping all of its caches. There were still caches we couldn't clear, and too many of the things it reported were only arguably "leaks" at all.
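For context, a cache-dropping helper along those lines might look roughly like the sketch below. This is not the actual file from that commit; it only shows the general idea using standard-library calls:

```python
import gc
import sys


def drop_cpython_caches() -> None:
    """Best-effort attempt to make CPython release internal caches before a
    measured run, so cached objects are not reported as leaks. A sketch of
    the idea only; the real helper did more than this and still missed caches."""
    # Drop CPython's cache of attribute lookups on types.
    sys._clear_type_cache()
    # Run the cyclic garbage collector repeatedly so objects kept alive only
    # by reference cycles are reclaimed before measurement.
    for _ in range(3):
        gc.collect()
```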
It may be possible to get an approach like the one you're suggesting working satisfactorily. Before attempting it, though, I strongly suggest you pick up the version we abandoned and play around with it. Try installing it and using it with some of your own test suites, especially ones that use extension modules or non-trivial libraries, and see what leaks it reports, whether they're true or false positives, and whether there's any reasonable way to suppress the false ones. I'd be curious to learn whether you find as many problems with that approach as I did when I tried it.
Thanks a lot @godlygeek for the detailed reply. I was actually fearing the issues you described would pop up. Sad to hear it didn't work, but happy to hear that it was already given a good try. This is not yet high on my priority list, but if I ever get to it, I will definitely start from the version you pointed at. Cheers!
I just recently discovered this plugin and it's awesome :fire:. Kudos to the devs! My request is inspired by having looked briefly at pytest-leaks recently. Not sure if it is applicable, but here it goes.

Feature Request
Problem statement
In the documentation for `pytest.mark.limit_leaks` the following is noted: if one wants to check leaks in a particular piece of the API or code, it makes total sense to isolate that code in the test body and loop over it. However, if one wants to check the test body itself for existing test functions, this would necessitate modifying the test sources.
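For illustration, the "isolate and loop" pattern that works today looks roughly like this; the imported function, payload, and threshold are made up:

```python
import pytest

from my_library import parse_record  # hypothetical function under test


@pytest.mark.limit_leaks("1 KB")  # hypothetical threshold
def test_parse_record_does_not_leak():
    # Loop over the code of interest so a genuine per-call leak accumulates
    # and stands out from one-off allocations made by the test itself.
    for _ in range(1_000):
        parse_record(b"some payload")
```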
Possible approach
It would be nice if pytest-memray came with a command-line switch to facilitate this. The interface for it could be borrowed from pytest-leaks (their docs describe the switch they use). A different interface could also be considered, e.g. combining `--memray-warmup` and `--memray-repeat`, as sketched below.
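A very rough sketch of how such options could be registered, purely to illustrate the proposed interface; `--memray-warmup` and `--memray-repeat` do not exist in pytest-memray today and are hypothetical:

```python
# conftest.py -- illustration of the proposed (hypothetical) options only
def pytest_addoption(parser):
    group = parser.getgroup("memray")
    group.addoption(
        "--memray-warmup",
        type=int,
        default=0,
        help="number of unmeasured warmup runs of each test body",
    )
    group.addoption(
        "--memray-repeat",
        type=int,
        default=1,
        help="number of measured repetitions of each test body",
    )
```

This would then be invoked as something like `pytest --memray --memray-warmup 1 --memray-repeat 3`.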
It would still be nice if the docs made it clear that this measures the whole test body, and that one needs to be careful when the intent is to test a particular piece of their code.
Let me know if this makes sense; I could also take a jab at it myself.