bloomberg / pytest-memray

pytest plugin for easy integration of memray memory profiler
https://pytest-memray.readthedocs.io/en/latest/
Apache License 2.0

A CLI interface for looping over test body #104

Closed: ckutlu closed this issue 4 months ago

ckutlu commented 7 months ago

I just recently discovered this plugin, and it's awesome :fire:. Kudos to the devs! My request is inspired by having recently taken a brief look at pytest-leaks. I'm not sure whether it's applicable, but here it goes:

Feature Request

Problem statement

In the documentation for pytest.mark.limit_leaks the following is noted:

It’s recommended to run your API or code in a loop when utilizing this plugin. This practice helps in distinguishing genuine leaks from the “noise” generated by internal caches and other incidental allocations.

If one wants to check a particular piece of the API or code for leaks, it makes total sense to isolate that code in the test body and loop over it, as in the sketch below. However, checking the test bodies of existing test functions this way would require modifying the test sources themselves.
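For concreteness, here is a minimal sketch of that recommendation. The `limit_leaks` marker is real pytest-memray API; `myproject` and `my_api_call` are hypothetical stand-ins for the code under test:

```python
import pytest

from myproject import my_api_call  # hypothetical code under test


@pytest.mark.limit_leaks("0 B")
def test_my_api_does_not_leak():
    # Loop over the code under test: one-off caches fill up during
    # the first iterations, while a genuine leak keeps growing.
    for _ in range(100):
        my_api_call()
```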

Possible approach

It would be nice if pytest-memray came with a command-line switch to facilitate this. The interface for it can be borrowed from pytest-leaks. Their docs read:

leaks:
  -R LEAKS, --leaks=LEAKS
                        runs each test several times and examines
                        sys.gettotalrefcount() to see if the test appears to
                        be leaking references. The argument should be of the
                        form stab:run where 'stab' is the number of times the
                        test is run to let gettotalrefcount settle down, 'run'
                        is the number of times further it is run. These
                        parameters all have defaults (5 and 4, respectively),
                        and the minimal invocation is '-R :'.

A different interface could also be considered, e.g. combining --memray-warmup and --memray-repeat options.
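To make that concrete, here's a rough conftest.py sketch of what the warmup half of such a switch could look like. Both option names are hypothetical (pytest-memray has neither today), the repeat half is left out, and it glosses over how the warmup runs would interact with Memray's tracking:

```python
# conftest.py: a rough sketch only; --memray-warmup is a hypothetical
# option, not something pytest-memray provides today.
import pytest


def pytest_addoption(parser):
    group = parser.getgroup("memray")
    group.addoption(
        "--memray-warmup",
        type=int,
        default=0,
        help="number of unmeasured warmup runs of each test body",
    )


@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_call(item):
    # Naively run the test body a few extra times first, so one-time
    # caches settle before the real run happens. Fixture state is
    # shared across all of these runs.
    for _ in range(item.config.getoption("--memray-warmup")):
        item.runtest()
    yield  # the normal run proceeds as usual
```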

It would still be nice if the docs made it clear that this measures the whole test body, so one needs to be careful when the goal is to test a particular piece of code.

Let me know if this makes sense; I could also take a stab at it myself.

godlygeek commented 6 months ago

Sorry, I meant to respond to this sooner. I started to draft a reply, realized I had a lot to say, and then got distracted by the holidays and vacations.

Our initial approach to finding leaks was fairly similar to what you propose, but we couldn't find a way to make it work that the team was happy with and could reach a consensus on. Some of the issues are documented in the warnings for the limit_leaks decorator:

- every log record will show up as a leak
- every allocation made by a fixture and not freed until the fixture is torn down will show up as a leak
- every line of output to stdout or stderr will show up as a leak
- things cached by the interpreter itself for later reuse (like freed lists or dicts) could show up as leaks

Because of those caveats, running existing tests and demanding that they not leak anything is incredibly unlikely to succeed, even with warmup passes.
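To make the first of those caveats concrete, here's a contrived sketch (the limit_leaks marker is real; the test itself is made up):

```python
import logging

import pytest


@pytest.mark.limit_leaks("0 B")
def test_logging_looks_like_a_leak():
    # Nothing here is a bug, but pytest's log capture keeps every
    # emitted record alive until teardown, so the allocations behind
    # these records get reported as leaks.
    for _ in range(10):
        logging.getLogger(__name__).warning("still running")
```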

It seems like an approach like the one you're proposing should work, but when we tried it, we hit so many unexpected problems that we wound up abandoning it. Commit cf05664fb64f0f92dbcf47a8c211531460ded4e4 (tree) was the final version we tried with that approach. You can pick it up and play with it yourself, either to try it out or to improve upon it; that commit's documentation describes how to use the marker as it existed at the time.

That approach allowed a configurable number of warmup runs of a test function, followed by a real run actually measured by Memray. When we tried the marker against a real existing test suite, it reported far too many false positives to be of any practical use, even with code like this file that tried to coerce CPython into dropping all of its caches. There were still caches we couldn't clear, and it flagged too many things whose status as "leaks" was debatable.
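That file isn't reproduced here, but to give a flavor of what "dropping the caches" means, these are a few of the knobs CPython exposes (the abandoned commit's helper went further than this):

```python
import gc
import linecache
import re
import sys


def drop_some_interpreter_caches():
    # A small sample of the caches CPython lets you clear explicitly;
    # others (like the freed lists/dicts kept for reuse) have no
    # public knob at all.
    re.purge()               # compiled regular-expression cache
    linecache.clearcache()   # cached source lines for tracebacks
    sys._clear_type_cache()  # the internal type attribute-lookup cache
    gc.collect()             # collect cyclic garbage before measuring
```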

It may be possible to get an approach like the one you're suggesting working satisfactorily. Before attempting it, though, I strongly suggest you pick up the version we abandoned and play around with it. Try installing it and using it with some of your own test suites, especially ones that use extension modules or non-trivial libraries, and see what leaks it finds, whether they're true or false positives, and whether there's any reasonable way to suppress the false ones. I'd be curious to learn whether you find as many problems with that approach as I did when I tried it.

ckutlu commented 6 months ago

Thanks a lot @godlygeek for the detailed reply. I was actually afraid the issues you described would pop up. Sad to hear it didn't work, but happy to hear it was already given a serious try. This isn't high on my priority list yet, but if I ever get to it, I'll definitely start from the version you pointed to. Cheers!