pythonprofilers / memory_profiler

Monitor Memory usage of Python code
http://pypi.python.org/pypi/memory_profiler

Memory leakage with `memory_usage` decorator #332

Open lspinheiro opened 3 years ago

lspinheiro commented 3 years ago

I have been using the memory_usage function to get the peak usage of an ETL job, but noticed that it actually increases memory usage.

It looks like it creates a copy of the inputs to the function being profiled, causing some duplication in memory. My function simply merges and processes a few columns, but here is a script very close to my testing:

Sample script to reproduce

import pandas as pd

from my_project.my_module import get_revenue_df
from memory_profiler import memory_usage

if __name__ == "__main__":

    price_df = pd.read_pickle("data/price_df.pkl")
    sales_df = pd.read_parquet("data/sales.parquet")

    inputs = (
        price_df,
        sales_df,
    )
    mem_usage, result = memory_usage(
        (get_revenue_df, inputs, ),
        interval=0.1,
        include_children=True,
        retval=True,
        max_usage=True,
    )

    # with max_usage=True, mem_usage is already a single float (the peak, in MiB)
    print(mem_usage)
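For comparison, one way to get a peak figure without handing the inputs over to memory_profiler is the stdlib tracemalloc module: the function runs in the current process, so nothing is copied, though it reports Python-level allocations rather than RSS. A minimal sketch (the `run_with_peak` helper is hypothetical, not part of either library):

```python
import tracemalloc


def run_with_peak(func, *args, **kwargs):
    """Run func in-process and return (peak_mib, result).

    Peak is the high-water mark of allocations tracked by
    tracemalloc, which is not the same thing as process RSS.
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / 1024 ** 2, result
```

Since the function runs in the current process, large DataFrames are passed by reference instead of being duplicated.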

Without memory_usage

[screenshot: memory usage of the job when run directly]

With memory_usage

[screenshot: higher memory usage of the same job when run under memory_usage]

Is this a bug or expected behaviour? Is there any way to avoid this?

fabianp commented 3 years ago

Sounds like a bug to me ... unfortunately I don't have a solution, but I would gladly take pull requests if you come across a fix or workaround

Lucas-C commented 1 year ago

Hi!

I spotted a significant memory increase caused by memory_profiler in this issue: https://github.com/PyFPDF/fpdf2/issues/641#issuecomment-1465730884. It contains some minimal code to reproduce the problem.

I'm not sure whether this is really a leak, or even a bug, but merely importing the library increases RSS memory usage by 15 MiB to 25 MiB.
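One quick way to sanity-check the cost of an import, sketched here with the stdlib tracemalloc (note this tracks Python-level allocations, not the RSS figure quoted above; `import_alloc_cost` is a hypothetical helper, not part of any library):

```python
import importlib
import sys
import tracemalloc


def import_alloc_cost(module_name):
    """Return the MiB of Python allocations made while importing module_name."""
    sys.modules.pop(module_name, None)  # force the module body to re-execute
    tracemalloc.start()
    importlib.import_module(module_name)
    current, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current / 1024 ** 2

# e.g. import_alloc_cost("memory_profiler") on a machine with it installed
```

For an RSS number closer to what is reported above, comparing psutil.Process().memory_info().rss before and after the import would be the more direct measurement.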