CFMTech / pytest-monitor

Pytest plugin for analyzing resource usage during test sessions
MIT License

discuss best memory measurement approach and possible leak detection #41

Open stas00 opened 3 years ago

stas00 commented 3 years ago

Splitting off from https://github.com/CFMTech/pytest-monitor/pull/38#issuecomment-831420330 to focus the discussion on the nuances of the memory-measurement side of this extension. I will also bring up some memory-leak detection issues, which I obviously don't expect pytest-monitor to support, but perhaps something good will come out of discussing them with knowledgeable devs. I have a real need to identify tests with smallish leaks, because these leaks add up and require more RAM than they should. And when running thousands of tests under xdist with multiple pytest workers, the memory requirements go up dramatically.

  1. I noticed you aren't calling gc.collect() before taking a new measurement; without it you may get incorrect reports, since most of the time the GC doesn't run immediately when a Python object is deleted. At least in all the profilers I have been developing I've been using this approach.

    this was split off into a separate issue: https://github.com/CFMTech/pytest-monitor/issues/40
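To illustrate the point, here is a minimal stdlib-only sketch (Linux-only, since it reads `/proc/self/status`; the helper name is mine, not pytest-monitor's API) of taking an RSS reading only after forcing a GC pass:

```python
import gc


def rss_after_gc_kb():
    """Current RSS of this process in kB, taken only after a forced
    GC pass, so that freed-but-not-yet-collected objects do not
    inflate the reading.  Linux-only: parses /proc/self/status."""
    gc.collect()  # flush uncollected cycles before measuring
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])  # value is in kB
    raise RuntimeError("VmRSS not found in /proc/self/status")
```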

  2. memory_profiler, which you use, relies on RSS. I mostly use RSS too, but it's not very reliable if memory gets swapped out. Recently I discovered PSS (proportional set size) via smem, through this https://unix.stackexchange.com/a/169129/291728, and it really helped solve a "process being killed because it used up its cgroups quota" problem. I used it in the analysis of pytest being killed on CI here: https://github.com/huggingface/transformers/issues/11408 (please scroll down to the smem discussion in item 4).

    The question is how to measure it in Python, if it turns out to be the better metric, the way smem does.
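One stdlib-only answer (a sketch of my own, not what smem does internally): on Linux kernels >= 4.14 the kernel exposes a pre-summed per-process PSS total in `/proc/self/smaps_rollup`, which can be parsed directly:

```python
def pss_kb():
    """Proportional set size of this process in kB.

    PSS divides the cost of each shared page evenly among the
    processes mapping it, which is the figure smem reports and is
    fairer than RSS when many workers share the same libraries.
    Requires Linux >= 4.14 for /proc/self/smaps_rollup.
    """
    with open("/proc/self/smaps_rollup") as f:
        for line in f:
            if line.startswith("Pss:"):
                return int(line.split()[1])  # value is in kB
    raise RuntimeError("Pss not found in /proc/self/smaps_rollup")
```

On older kernels the same number can be obtained by summing the `Pss:` lines of `/proc/self/smaps`, at a higher parsing cost.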

  3. How can we tell how much memory test A used vs. test B if both tests load the same large module? Test A will appear to use a lot of memory if it runs first, when in reality that may be far from the truth; if the execution order were reversed, the results would be completely different. This matters especially in environments where tests run in a varying order, whether by choice or by construction (e.g. xdist). So, in the context of pytest-monitor, comparing different sessions would yield reports of varying memory usage, but they would be false reports.

    One possible solution is to somehow pre-run all the tests so that all modules and any globals get pre-loaded, and measure only the second time the tests are run. This would be a problem with pytest-xdist, but for the sake of regression testing one could run the full test suite with pytest -n 1 just to keep the approach consistent. There are other ways to make the tests run in the same order, but they only work if no new tests are added.
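A tiny experiment showing why the first import skews attribution (Linux-only RSS reading via `/proc`; the helper names are mine):

```python
import importlib
import sys


def rss_kb():
    """Current RSS in kB from /proc/self/status (Linux-only)."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    raise RuntimeError("VmRSS not found")


def import_cost_kb(name):
    """RSS growth caused by importing `name` (about 0 if cached)."""
    before = rss_kb()
    importlib.import_module(name)
    return rss_kb() - before


# The first import pays for the module code and its globals; a repeat
# import is served from the sys.modules cache, so the cost lands on
# whichever test happened to run first.
cost_first = import_cost_kb("decimal")
cost_again = import_cost_kb("decimal")
```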

  4. It'd be great to be able to use such a plugin for memory-leak detection. In that case the measurement from the first run of any test should be discarded, and if there is no leak we would expect zero extra memory usage on the second round (well, sometimes the third). So far I've been struggling to get that zero using RSS; I keep getting fluctuations in the reported memory.

    I guess this is tightly related to (3) and is probably really the same issue.
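That discard-the-first-rounds idea can be sketched as follows (all names, round counts, and the tolerance are my own arbitrary choices; Linux-only RSS reading):

```python
import gc


def rss_kb():
    """Current RSS in kB via /proc/self/status (Linux-only)."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    raise RuntimeError("VmRSS not found")


def leaks(fn, warmup=2, rounds=3, tolerance_kb=256):
    """Heuristic leak check: run `fn` a few warm-up rounds so that
    imports and caches are paid for, then flag it if RSS still grows
    beyond a noise tolerance over the measured rounds."""
    for _ in range(warmup):
        fn()
    gc.collect()
    baseline = rss_kb()
    for _ in range(rounds):
        fn()
    gc.collect()
    return rss_kb() - baseline > tolerance_kb


# A callable that retains ~1 MiB per call should trip the check;
# a no-op should not.
sink = []
```

The tolerance is the weak spot: as noted above, RSS fluctuates, so some slack is needed, and that slack hides small leaks.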

  5. Continuing with memory leakage: how could we detect badly written tests, where a test doesn't clean up completely on teardown and leaves dangling objects that keep consuming significant amounts of memory? This is very difficult to measure, because any test may appear to be such a test if it loads some module for the first time.
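One way to see *what* a test left behind, rather than just how much, is the stdlib tracemalloc module, which attributes live allocations to source lines and so sidesteps some of the first-import ambiguity (a sketch; the helper name and the simulated leak are mine):

```python
import tracemalloc


def top_allocations(fn, limit=3):
    """Run `fn` and return the source lines whose live allocations
    grew the most across the call, i.e. what it left behind."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    fn()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # compare_to sorts by allocation delta, biggest growth first
    return after.compare_to(before, "lineno")[:limit]


# Simulate a test that "forgets" ~100 kB of objects on teardown.
leftovers = []
stats = top_allocations(lambda: leftovers.extend(bytes(100) for _ in range(1000)))
```

The caveat is that tracemalloc only sees allocations made through Python's allocator, so memory held by C extensions would still need the RSS/PSS approaches above.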

@js-dieu