bloomberg / pytest-memray

pytest plugin for easy integration of memray memory profiler
https://pytest-memray.readthedocs.io/en/latest/
Apache License 2.0

Support flag `--trace-python-allocators` when running the memray pytest plugin #78

Closed: godiedelrio closed this issue 1 year ago

godiedelrio commented 1 year ago

Feature Request

Is your feature request related to a problem? Please describe.

I've read through the docs about how pymalloc affects profiling results. It seems that when running a test suite with the memray pytest plugin, not all the memory allocations from a test would be reported if the pymalloc allocator finds free memory in arenas and memory pools that were already allocated by tests that ran earlier in the suite. Is this something that might happen?

Describe the solution you'd like

In such a case, would it help if the pytest memray plugin could be run with the `--trace-python-allocators` option, so that calls to the pymalloc allocator are always tracked?

pablogsal commented 1 year ago

> not all the memory allocations from a test would be reported

Well, this depends on what you define as "memory allocations". Keep in mind that a real program will not run without pymalloc, so it will not allocate memory from the system allocator for each Python object. At the same time, memory is still being allocated when arenas are allocated; it's just that this happens in a different fashion.
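To make the distinction concrete, here is a rough sketch of which requests pymalloc would normally absorb. It assumes a standard CPython build where pymalloc serves requests of up to 512 bytes from pools inside arenas and hands anything larger to the system allocator. The constant and the helper name `likely_served_by_pymalloc` are illustrative, not part of memray or pytest-memray, and the check is only meaningful for objects made of a single allocation, such as `bytes`.

```python
import sys

# Assumption: 512 bytes is pymalloc's small-request threshold on a
# standard CPython build. It does not apply under PYTHONMALLOC=malloc.
PYMALLOC_THRESHOLD = 512


def likely_served_by_pymalloc(obj):
    """Guess whether obj's own allocation is small enough for pymalloc.

    Hypothetical helper: sys.getsizeof() reports the object's own
    footprint, which for bytes objects is a single allocation.
    """
    return sys.getsizeof(obj) <= PYMALLOC_THRESHOLD


if __name__ == "__main__":
    for size in (64, 4096):
        payload = bytes(size)
        print(size, sys.getsizeof(payload), likely_served_by_pymalloc(payload))
```

Small objects like the first one can be carved out of an arena that was already requested from the system, which is why a profiler that only watches the system allocator may not see a new allocation for them.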

In any case, I do agree that because neither report is intrinsically better than the other, it makes sense to expose this flag.

Are you interested in making a PR? Otherwise @gaborbernat, @godlygeek, or I can get to this eventually :)

godiedelrio commented 1 year ago

I'm using memray to measure the memory allocated by different runs of a test in pytest, i.e. the test is run once for every parameter value specified with @pytest.mark.parametrize, and I'd like to see how much memory is allocated each time the test runs with a different value. As an alternative, I'm wrapping the test code with the start() and stop() functions from the tracemalloc module in the Python standard library.

import tracemalloc


class TraceMalloc:
    """Context manager that records tracemalloc totals for the wrapped block."""

    def __enter__(self):
        self.allocated_memory = None
        self.peak_allocated_memory = None
        tracemalloc.start()
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # Read the counters before stop(), which discards all traces.
        current, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        self.allocated_memory = current
        self.peak_allocated_memory = peak

And then

    with TraceMalloc() as trace_malloc:
        ...  # code that generates memory allocation
    print(f"Allocated memory: {trace_malloc.allocated_memory / 1024**2} MiB")

I don't know how this differs in detail from what memray does to measure memory allocations. I believe that the tracemalloc module records every memory allocation request, whether it can be satisfied with memory already available in the Python memory pools or with memory that has to be requested from the OS. Given that the test runs are independent of each other, I'm looking for the memory that each test run would allocate, as if it were the only test run since the Python process started.
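As a rough illustration of that per-run measurement, the sketch below wraps a callable in tracemalloc and repeats it with different parameter values, including a repeated value once the pools are warm. The names `measure_allocations` and `build_payload` are hypothetical and the workload is a stand-in for real test code. Because tracemalloc hooks CPython's allocator domains, requests served from existing pymalloc pools are counted too, but the figures are requested bytes, not memory obtained from the OS.

```python
import tracemalloc


def measure_allocations(func, *args, **kwargs):
    """Run func under tracemalloc and return (net_bytes, peak_bytes).

    Hypothetical helper: the numbers are bytes requested through
    CPython's allocators while func ran, relative to the baseline
    taken right after tracing started.
    """
    tracemalloc.start()
    try:
        baseline, _ = tracemalloc.get_traced_memory()
        # Keep a reference so the result is still alive when we measure.
        result = func(*args, **kwargs)
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return current - baseline, peak - baseline


def build_payload(n_items):
    # Stand-in workload: many small objects, the kind pymalloc pools serve.
    return [bytes(64) for _ in range(n_items)]


if __name__ == "__main__":
    # The repeated value 100 runs after pools were warmed by earlier runs.
    for n_items in (100, 1000, 100):
        net, peak = measure_allocations(build_payload, n_items)
        print(f"n_items={n_items}: net={net} B, peak={peak} B")
```

In a pytest suite the same helper could be called inside a parametrized test, one call per parameter value.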