Memory leak #501

Open jmontoyam opened 4 years ago

jmontoyam commented 4 years ago


First of all, thank you very much for this amazing project! ;) I think I have detected a possible memory leak. In my use case, one function generates a big xarray DataArray, and another function takes that DataArray as input and generates an image from it using hvplot.image(rasterize=True). I use JupyterLab to interact with these functions and visualize the generated image. I have noticed that every time I re-execute the code cell containing both functions, memory usage keeps growing. Please see the memory usage shown in the attached images (top right corner):

First execution of the code cell: execution1

Second execution of the code cell: execution2

Third execution of the code cell: execution3

I tried the suggestion given by @philippjfr in holoviz/holoviews#1821 (%reset out) but it did not work.

Expected behavior: I expected the memory usage to stay constant between re-executions of the same code cell (the memory occupied by the data generated in the previous execution is supposed to be garbage collected). Am I right, or am I missing something?
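The expectation above can be illustrated with a minimal, standard-library-only sketch. It does not use hvplot or xarray: `BigArray` and the local `cache` list are hypothetical stand-ins for a large DataArray and for whatever might still hold a reference to it after the cell finishes. It only shows the general rule that an object is freed once the last reference to it goes away, and not before.

```python
import gc
import weakref


class BigArray:
    """Hypothetical stand-in for a large object such as an xarray DataArray."""

    def __init__(self, n):
        self.data = bytearray(n)


def is_freed_after_del(keep_in_cache):
    """Create an object, drop the local name, and report whether it was freed."""
    cache = []
    obj = BigArray(1_000_000)
    ref = weakref.ref(obj)  # a weak reference does not keep obj alive
    if keep_in_cache:
        cache.append(obj)  # a second reference, like a plot or output cache
    del obj
    gc.collect()
    return ref() is None


print(is_freed_after_del(keep_in_cache=False))  # True
print(is_freed_after_del(keep_in_cache=True))   # False
```

If memory keeps growing across re-executions, one possible explanation is that something plays the role of `cache` here; the sketch does not say what that would be in this case.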

Software info:

hvplot: 0.6.0
holoviews: 1.13.3
datashader: 0.11.0
bokeh: 2.1.1

jupyter core : 4.6.3
jupyter-notebook : 6.0.3
qtconsole : 4.7.5
ipython : 7.16.1
ipykernel : 5.3.0
jupyter client : 6.1.3
jupyter lab : 2.1.5
nbconvert : 5.6.1
ipywidgets : 7.5.1
nbformat : 5.0.7
traitlets : 4.3.3

JupyterLab v2.1.5
Known labextensions:
@bokeh/jupyter_bokeh v2.0.2 enabled OK
@jupyter-widgets/jupyterlab-manager v2.0.0 enabled OK
@pyviz/jupyterlab_pyviz v1.0.4 enabled OK

Google Chrome Version 84.0.4147.125 (Official Build) (64-bit)

OS Ubuntu 18.04.4 LTS

Thank you very much for all your help! ;)

jmakov commented 2 years ago

Same with df[df.columns[0]].hvplot(datashade=True): memory increases on every execution (hvplot=0.7.3). I cannot use %reset out (it causes an exception) because I am following https://docs.ray.io/en/latest/using-ray-with-jupyter.html?highlight=jupyter.

conda's env.yaml:

jmakov commented 1 year ago

Bump. I have to restart the kernel all the time in the notebook. Any suggestions are very welcome (gc.collect() doesn't change anything).
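For context, gc.collect() only frees objects that are no longer reachable, so it has no effect while something still holds a reference. A hedged way to look for such a holder is gc.get_referrers. In the sketch below, `Payload` and `leaky_cache` are hypothetical names used for illustration; they do not correspond to anything in hvplot or HoloViews.

```python
import gc


class Payload:
    """Hypothetical stand-in for a large plot or dataset."""


leaky_cache = {}  # hypothetical long-lived cache that outlives the cell

payload = Payload()
leaky_cache["last_plot"] = payload

gc.collect()  # frees only unreachable objects; payload is still reachable

# List the objects that refer directly to payload and check for the cache.
holders = gc.get_referrers(payload)
held_by_cache = any(h is leaky_cache for h in holders)
print(held_by_cache)  # True
```

In a real session the list of referrers can be long and noisy (module namespaces, frames, IPython history), so this is a starting point for an investigation rather than a diagnosis.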

maximlt commented 1 year ago

Hi @jmakov,

Could you provide:

Reproducing and tracking down memory leaks is notoriously difficult :)
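One way to make such a report more concrete is to record how much memory is still in use after each repeated run. The sketch below uses only the standard library's tracemalloc; `leaky_cell`, `clean_cell`, and `_retained` are hypothetical stand-ins for a notebook cell and a hidden cache. Note the assumption: tracemalloc only sees allocations made through Python's memory allocators, so memory held by native code may not show up here.

```python
import gc
import tracemalloc


def traced_memory_after_each_run(func, runs=3):
    """Call func repeatedly; return Python-traced bytes in use after each call."""
    tracemalloc.start()
    try:
        usage = []
        for _ in range(runs):
            func()
            gc.collect()
            current, _peak = tracemalloc.get_traced_memory()
            usage.append(current)
        return usage
    finally:
        tracemalloc.stop()


_retained = []  # hypothetical hidden cache that simulates a leak


def leaky_cell():
    _retained.append(bytearray(1_000_000))  # kept alive by _retained


def clean_cell():
    bytearray(1_000_000)  # freed as soon as the statement finishes


print(traced_memory_after_each_run(leaky_cell))
print(traced_memory_after_each_run(clean_cell))
```

A steadily increasing list for the same call suggests that something is being retained between runs, while a roughly flat list suggests that it is not.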

jmakov commented 1 year ago

@maximlt thanks for the quick response! I can reproduce it with the minimal example below in JupyterLab:

import pandas
pandas.options.plotting.backend = "holoviews"  # `datashade=True` doesn't work without this line

# I'm gonna eat about 5GB of your memory and won't give it back :)
df = pandas.DataFrame({"col1": range(0, 100_000_000)})

# The output includes these warnings:
# WARNING:param.datashade: Parameter(s) [line_width] not consumed by any element rasterizer.
# WARNING:param.datashade: Parameter(s) [line_width] not consumed by any element rasterizer.

OS: Ubuntu 22.04.1 LTS
uname info: Linux 5.15.0-56-generic 62-Ubuntu SMP Tue Nov 22 19:54:14 UTC 2022 x86_64 GNU/Linux
conda_list_output.txt

jmakov commented 1 year ago

I would just like to bump this issue a bit since it's almost a blocker for me. Imagine a long computation, then trying to plot with different parameters: each plot eats memory, and you have to restart the kernel (and wait for the computation again).

tomerroditi commented 1 year ago

I'm experiencing the same issue: RAM piles up after every cell execution. Minimal reproducible code:

import pandas as pd
import numpy as np
import hvplot.pandas

df = pd.DataFrame(np.random.rand(1000000,100), columns=[str(i) for i in range(100)])
plots = df.hvplot.hist('0')

installed packages:

GitPython==3.1.29
holoviews==1.15.3
joblib==1.2.0
mat73==0.60
matplotlib==3.6.2
numpy==1.23.0
pandas==1.5.2
PyAutoGUI==0.9.53
pytest==7.2.0
scikit-learn==1.2.1
scipy==1.9.3
seaborn==0.12.1
shapely==2.0.0
sktime==0.15.0
tqdm==4.64.1
tsfel==0.1.4
tsfresh==0.19.0
xgboost==1.7.2
h5py==3.8.0
lightgbm==3.3.4
sklearn==0.0.post1
bokeh==2.4.3

jmakov commented 8 months ago

What's the status of this after half a year?

maximlt commented 8 months ago

Hi @jmakov, I'll investigate this week and see what I can find. Feel free to investigate too, memory leaks aren't the easiest thing to debug :)