astropy / halotools

Python package for studying large scale structure, cosmology, and galaxy evolution using N-body simulations and halo models
http://halotools.rtfd.org

Severe memory leak in `populate_mock()` #917

Open rainwoodman opened 6 years ago

rainwoodman commented 6 years ago

Running the following test case, we observe that the RSS grows continuously. Reading the code at face value, there is no apparent reason for this to happen.

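import resource

from halotools.empirical_models import PrebuiltHodModelFactory
from halotools.sim_manager import CachedHaloCatalog, FakeSim


def test_hearin15():
    """Repopulate a mock 100 times and print the peak RSS after each call."""
    model = PrebuiltHodModelFactory('hearin15')
    try:
        halocat = CachedHaloCatalog()
    except Exception:
        # Fall back to a fake simulation when no cached catalog is available.
        halocat = FakeSim()
    for i in range(100):
        model.populate_mock(halocat)
        # ru_maxrss is the peak resident set size (kilobytes on Linux).
        maxrss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print(maxrss)

    assert maxrss < 1000000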

Result

../testenv/lib/python3.6/site-packages/halotools/empirical_models/composite_models/tests/test_preloaded_models.py 169816
185612
199336
216228
230180
230180
230180
241580
257364
276064
286356
291108
302724
315828
330084
345660
359916
...

aphearin commented 6 years ago

I cannot reproduce your memory leak results using Python 2.x, @rainwoodman. I have not yet tried Python 3.x but will do so soon.

I can say that the populate_mock function was not intended to be used this way: it is a one-time-per-simulation function. After you have called it once, repeated calls on the same simulation should go through model.mock.populate(). Since I cannot reproduce your memory leak, could you report what happens when you modify your function so that populate_mock is called before the loop and model.mock.populate is called inside it?

It is also worth trying to see whether simply forcing garbage collection resolves the issue, which is what @johannesulf and @surhudm found in the closely related #568.
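
For readers unfamiliar with why a forced collection could help, here is a minimal sketch, not taken from the Halotools code base. The `Model` and `Mock` classes are hypothetical stand-ins, and the assumption (unconfirmed at this point in the thread) is that two objects refer to each other. Reference counting alone cannot free such a pair; the cycle collector can.

```python
import gc
import weakref


class Mock:
    """Hypothetical stand-in for an object that owns a large payload."""

    def __init__(self, model):
        self.model = model                # mock -> model
        self.payload = bytearray(10**6)   # about 1 MB kept alive by the mock


class Model:
    """Hypothetical stand-in for a model that keeps a reference to its mock."""

    def populate(self):
        self.mock = Mock(self)            # model -> mock, closing the cycle


def make_cycle():
    model = Model()
    model.populate()
    # A weak reference lets us observe the mock without keeping it alive.
    return weakref.ref(model.mock)


gc.disable()                              # make the demonstration deterministic
probe = make_cycle()                      # model and mock are now unreachable
alive_before = probe() is not None        # reference counting cannot free the pair
gc.collect()                              # the cycle collector finds and frees it
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)
```

In normal operation the collector runs automatically, but only after a threshold number of container allocations, so a few large objects caught in cycles can accumulate for a while before they are reclaimed.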

rainwoodman commented 6 years ago

Sorry, I was actually testing this with Python 3.6. Here is an updated script that uses mock.populate(); it shows the same behavior unless gc is invoked. This indicates that there are circular references in the data model. I think that if we get rid of the circular references, this can be fixed without having to trigger gc. Circular references used to be considered bugs in the old days.

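import gc
import resource

from halotools.empirical_models import PrebuiltHodModelFactory
from halotools.sim_manager import FakeSim


def test_hearin15():
    """Call mock.populate() 100 times and print the peak RSS after each call."""
    model = PrebuiltHodModelFactory('hearin15')
    halocat = FakeSim()
    model.populate_mock(halocat)
    for i in range(100):
        model.mock.populate()
        # Calling model.populate_mock(halocat) here instead makes no difference.
        # gc.collect()  # uncommenting this cures the growth

        maxrss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print(maxrss)

    assert maxrss < 1000000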

and here is my conda environment list:

(base) [yfeng1@waterfall halotools]$ conda list
# packages in environment at /home/yfeng1/anaconda3/install:
#
# Name                    Version                   Build  Channel
_ipyw_jlab_nb_ext_conf    0.1.0            py36he11e457_0  
abopt                     0.0.13                    <pip>
alabaster                 0.7.10           py36h306e16b_0  
anaconda                  custom           py36hbbc8b67_0  
anaconda-client           1.6.14                   py36_0  
anaconda-navigator        1.8.7                    py36_0  
anaconda-project          0.8.2            py36h44fb852_0  
apptools                  4.4.0                    py36_0  
argh                      0.26.2                    <pip>
asn1crypto                0.24.0                   py36_0  
astroid                   1.6.4                    py36_0  
astropy                   3.0.3            py36h14c3975_2  
asv                       0.2.1                     <pip>
atomicwrites              1.1.5                    py36_0  
attrs                     18.1.0                   py36_0  
autograd                  1.1.13                    <pip>
babel                     2.6.0                    py36_0  
backcall                  0.1.0                    py36_0  
backports                 1.0              py36hfa02d7e_1  
backports.shutil_get_terminal_size 1.0.0            py36hfea85ff_2  
beautifulsoup4            4.6.0            py36h49b8c8c_1  
bigfile                   0.1.40           py36h3010b51_4    bccp
binutils_impl_linux-64    2.28.1               had2808c_3  
binutils_linux-64         7.2.0               had2808c_27  
bitarray                  0.8.2            py36h14c3975_0  
bkcharts                  0.2              py36h735825a_0  
blas                      1.0                         mkl  
blaze                     0.11.3           py36h4e06776_0  
bleach                    2.1.3                    py36_0  
blinker                   1.4                       <pip>
blinker                   1.4                      py36_0  
blosc                     1.14.3               hdbcaa40_0  
bokeh                     0.12.16                  py36_0  
boto                      2.48.0           py36h6e4cd66_1  
bottleneck                1.2.1            py36haac1ea0_0  
bzip2                     1.0.6                h14c3975_5  
ca-certificates           2018.03.07                    0  
cachey                    0.1.1            py36hd6fb11b_0    bccp
cairo                     1.14.12              h7636065_2  
certifi                   2018.4.16                py36_0  
cffi                      1.11.5           py36h9745a5d_0  
chardet                   3.0.4            py36h0f667ec_1  
chealpy                   0.1.0                     <pip>
classylss                 0.2.8            py36h6091dcd_6    bccp
click                     6.7              py36h5253387_0  
cloog                     0.18.0                        0  
cloudpickle               0.5.3                    py36_0  
clyent                    1.2.2            py36h7e57e65_1  
colorama                  0.3.9            py36h489cec4_0  
conda                     4.5.4                    py36_0  
conda-build               3.10.7                   py36_0  
conda-env                 2.6.0                h36134e3_1  
conda-verify              2.0.0            py36h98955d8_0  
configobj                 5.0.6                     <pip>
configobj                 5.0.6                    py36_0  
contextlib2               0.5.5            py36h6c84a62_0  
corrfunc                  2.0.0            py36h3010b51_4    bccp
cosmo4d                   0.0.0                     <pip>
coverage                  4.5.1            py36h14c3975_0  
cryptography              2.2.2            py36h14c3975_0  
curl                      7.60.0               h84994c4_0  
cycler                    0.10.0           py36h93f1223_0  
cython                    0.28.3           py36h14c3975_0  
cytoolz                   0.9.0.1          py36h14c3975_0  
dask                      0.18.0                   py36_0  
dask-core                 0.18.0                   py36_0  
datashape                 0.5.4            py36h3ad6b5c_0  
dbus                      1.13.2               h714fa37_1  
decorator                 4.3.0                    py36_0  
distributed               1.22.0                   py36_0  
docrep                    0.2.3                     <pip>
docutils                  0.14             py36hb0f60f5_0  
doit                      0.31.1                    <pip>
entrypoints               0.2.3            py36h1aec115_2  
envisage                  4.6.0            py36h49a624e_0  
et_xmlfile                1.0.1            py36hd6bccc3_0  
expat                     2.2.5                he0dffb1_0  
fake_spectra              1.2.1            py36h637b7d7_1    bccp
fakepkg                   1.0                  h200af0e_0    local
fastcache                 1.0.2            py36h14c3975_2  
fastpm                    0.0.8                     <pip>
filelock                  3.0.4                    py36_0  
fitsio                    0.9.11           py36h3010b51_4    bccp
flake8                    3.5.0                    py36_1  
flask                     1.0.2                    py36_1  
flask-cors                3.0.4                    py36_0  
fontconfig                2.12.6               h49f89f6_0  
freetype                  2.8                  hab7d2ae_1  
future                    0.16.0                   py36_1  
gaepsi2                   0.1.0                     <pip>
gcc_impl_linux-64         7.2.0                habb00fd_3  
gcc_linux-64              7.2.0               h550dcbe_27  
get_terminal_size         1.0.0                haa9412d_0  
gevent                    1.3.2.post0      py36h14c3975_0  
gfortran_impl_linux-64    7.2.0                hdf63c60_3  
gfortran_linux-64         7.2.0               h550dcbe_27  
ghp-import                0.5.5                     <pip>
gitdb2                    2.0.3                    py36_0  
gitpython                 2.1.10                   py36_0  
glib                      2.56.1               h000015b_0  
glob2                     0.6              py36he249c77_0  
gmp                       6.1.2                h6c8ec71_1  
gmpy2                     2.0.8            py36hc8893dd_2  
graphite2                 1.3.11               h16798f4_2  
graphviz                  0.8.1                     <pip>
graphviz                  2.40.1               h25d223c_0  
greenlet                  0.4.13           py36h14c3975_0  
gsl                       2.4                  h14c3975_4  
gst-plugins-base          1.14.0               hbbd80ab_1  
gstreamer                 1.14.0               hb453b48_1  
gxx_impl_linux-64         7.2.0                hdf63c60_3  
gxx_linux-64              7.2.0               h550dcbe_27  
h5py                      2.8.0            py36hca9c191_0  
halotools                 0.6              py36h637b7d7_4    bccp
harfbuzz                  1.7.6                h5f0a787_1  
hdf5                      1.8.18               h6792536_1  
healpy                    1.11.0                   py36_1    conda-forge
heapdict                  1.0.0                    py36_2  
hfof                      0.0.0                     <pip>
html5lib                  1.0.1            py36h2f9c1c0_0  
hypothesis                3.57.0           py36h24bf2e0_0  
icu                       58.2                 h9c2bf20_1  
idna                      2.6              py36h82fb2a8_1  
imageio                   2.3.0                    py36_0  
imagesize                 1.0.0                    py36_0  
imaginglss                0.1.0rc0                  <pip>
intel-openmp              2018.0.3                      0  
ipykernel                 4.8.2                    py36_0  
ipyparallel               6.2.1                    py36_0  
ipython                   6.4.0                    py36_0  
ipython_genutils          0.2.0            py36hb52b0d5_0  
ipywidgets                7.2.1                    py36_0  
isl                       0.12.2                        0  
isort                     4.3.4                    py36_0  
itsdangerous              0.24             py36h93cc618_1  
jbig                      2.1                  hdba287a_0  
jdcal                     1.4                      py36_0  
jedi                      0.12.0                   py36_1  
jinja2                    2.10             py36ha16c418_0  
jpeg                      9b                   h024ee3a_2  
jsoncpp                   1.8.3                h3a67955_0  
jsonschema                2.6.0            py36h006f8b5_0  
jupyter                   1.0.0                    py36_4  
jupyter_client            5.2.3                    py36_0  
jupyter_console           5.2.0            py36he59e554_1  
jupyter_core              4.4.0            py36h7c827e3_0  
jupyterlab                0.32.1                   py36_0  
jupyterlab_launcher       0.10.5                   py36_0  
kdcount                   0.3.27           py36h3010b51_3    bccp
kiwisolver                1.0.1            py36h764f252_0  
lazy-object-proxy         1.3.1            py36h10fcdad_0  
libcurl                   7.60.0               h1ad7b7a_0  
libedit                   3.1.20170329         h6b74fdf_2  
libffi                    3.2.1                hd88cf55_4  
libgcc                    7.2.0                h69d50b8_2  
libgcc-ng                 7.2.0                hdf63c60_3  
libgfortran-ng            7.2.0                hdf63c60_3  
libiconv                  1.15                 h63c8f33_5  
libopenblas               0.2.20               h9ac9557_7  
libpng                    1.6.34               hb9fc6fc_0  
libsodium                 1.0.16               h1bed415_0  
libssh2                   1.8.0                h9cfc8f7_4  
libstdcxx-ng              7.2.0                hdf63c60_3  
libtiff                   4.0.9                he85c1e1_1  
libtool                   2.4.6                h544aabb_3  
libxcb                    1.13                 h1bed415_1  
libxml2                   2.9.8                h26e45fe_1  
libxslt                   1.1.32               h1312cb7_0  
livereload                2.5.1                     <pip>
llvmlite                  0.23.1           py36hdbcaa40_0  
locket                    0.2.0            py36h787c0ad_1  
Logbook                   1.4.0                     <pip>
lxml                      4.2.1            py36h23eabaa_0  
lzo                       2.10                 h49e0be7_2  
make                      4.2.1                h1bed415_1  
Mako                      1.0.7                     <pip>
mako                      1.0.7            py36h0727276_0  
Markdown                  2.6.11                    <pip>
markupsafe                1.0              py36hd9260cd_1  
matplotlib                2.2.2            py36h0e671d2_1  
mayavi                    4.5.0                     <pip>
mccabe                    0.6.1            py36h5ad9710_1  
mcfit                     0.0.8            py36h24bf2e0_0    bccp
mistune                   0.8.3            py36h14c3975_1  
mkdocs                    0.17.2                    <pip>
mkl                       2018.0.3                      1  
mkl-service               1.1.2            py36h17a0993_4  
mkl_fft                   1.0.1            py36h3010b51_0  
mkl_random                1.0.1            py36h629b387_0  
mock                      2.0.0            py36h3c5bf6c_0  
more-itertools            4.2.0                    py36_0  
mpc                       1.0.3                hec55b23_5  
mpfr                      3.1.5                h11a74b3_2  
mpi4py                    3.0.0           py36_mpich2h49e5514_7  [mpich2]  bccp
mpi4py                    3.0.0                     <pip>
mpich2                    1.4.1p1              h1c2f66e_6    bccp
mpmath                    1.0.0            py36hfeacd6b_2  
mpsort                    0.1.12          py36_mpich2hf3c02ab_4  [mpich2]  bccp
msgpack-python            0.5.6            py36h6bb024c_0  
multipledispatch          0.5.0                    py36_0  
natsort                   5.3.2                     <pip>
navigator-updater         0.2.1                    py36_0  
nbconvert                 5.3.1            py36hb41ffb7_0  
nbformat                  4.4.0            py36h31c9010_0  
nbodykit                  0.3.2.dev0                <pip>
nbodykit                  0.3.3            py36h24bf2e0_3    bccp
nbodykit                  0.3.1.dev0                <pip>
nbodykit                  0.3.4.dev0                <pip>
nbsphinx                  0.3.3                     <pip>
ncurses                   6.1                  hf484d3e_0  
networkx                  2.1                      py36_0  
Nikola                    7.8.15                    <pip>
nltk                      3.3.0                    py36_0  
nose                      1.3.7            py36hcdf7029_2  
notebook                  5.5.0                    py36_0  
numba                     0.38.0           py36h637b7d7_0  
numexpr                   2.6.5            py36h7bf3b9c_0  
numpy                     1.14.3           py36h28100ab_2  
numpy-base                1.14.3           py36hdbf6ddf_2  
numpydoc                  0.8.0                    py36_0  
odo                       0.5.1            py36h90ed295_0  
olefile                   0.45.1                   py36_0  
openpyxl                  2.5.3                    py36_0  
openssl                   1.0.2o               h20670df_0  
packaging                 17.1                     py36_0  
pandas                    0.23.0           py36h637b7d7_0  
pandoc                    1.19.2.1             hea2e7c5_1  
pandocfilters             1.4.2            py36ha6701b7_1  
pango                     1.41.0               hd475d92_0  
parso                     0.2.1                    py36_0  
partd                     0.3.8            py36h36fd896_0  
patchelf                  0.9                  hf79760b_2  
path.py                   11.0.1                   py36_0  
pathlib2                  2.3.2                    py36_0  
pathtools                 0.1.2                     <pip>
patsy                     0.5.0                    py36_0  
pbr                       4.0.3                    py36_0  
pcre                      8.42                 h439df22_0  
pep8                      1.7.1                    py36_0  
pexpect                   4.6.0                    py36_0  
pfft-python               0.1.20          py36_mpich2hf3c02ab_4  [mpich2]  bccp
pickleshare               0.7.4            py36h63277f8_0  
piexif                    1.0.13                    <pip>
pillow                    5.1.0            py36h3deb7b8_0  
pip                       10.0.1                   py36_0  
pixman                    0.34.0               hceecf20_3  
pkginfo                   1.4.2                    py36_1  
pluggy                    0.6.0            py36hb689045_0  
ply                       3.11                     py36_0  
pmesh                     0.1.42                    <pip>
pmesh                     0.1.45                    <pip>
pmesh                     0.1.45           py36h3010b51_3    bccp
pmesh                     0.1.46                    <pip>
prompt_toolkit            1.0.15           py36h17d85b1_0  
psutil                    5.4.5            py36h14c3975_0  
ptyprocess                0.5.2            py36h69acd42_0  
py                        1.5.3                    py36_0  
pycodestyle               2.3.1            py36hf609f19_0  
pycosat                   0.6.3            py36h0a5515d_0  
pycparser                 2.18             py36hf9f622e_1  
pycrypto                  2.6.1            py36h14c3975_8  
pycurl                    7.43.0.2         py36hb7f436b_0  
pyface                    6.0.0                    py36_0  
pyflakes                  1.6.0            py36h7bd6a15_0  
pygments                  2.2.0            py36h0d3125c_0  
pyinotify                 0.9.6                     <pip>
pyinotify                 0.9.6                    py36_0  
pylint                    1.9.1                    py36_0  
pympler                   0.5              py36h6b12e4d_0  
pyodbc                    4.0.23           py36hf484d3e_0  
pyopenssl                 18.0.0                   py36_0  
pyparsing                 2.2.0            py36hee85983_1  
pyqt                      5.9.2            py36h751905a_0  
PyRSS2Gen                 1.1                       <pip>
pysocks                   1.6.8                    py36_0  
pytables                  3.4.3            py36h68a8fdc_2  
pytest                    3.6.0                    py36_1  
pytest-arraydiff          0.2                      py36_0  
pytest-astropy            0.4.0                    py36_0  
pytest-doctestplus        0.1.3                    py36_0  
pytest-openfiles          0.3.0                    py36_0  
pytest-remotedata         0.3.0                    py36_0  
pytest-runner             4.2                      py36_0  
python                    3.6.5                hc3d631a_2  
python-dateutil           2.7.3                    py36_0  
python-markdown-math      0.3                       <pip>
python-mpi-bcast          0.1.2           mpich2h49e5514_1  [mpich2]  bccp
pytorch-cpu               0.4.0           py36_mpich2hf7077ae_1  [mpich2]  local
pytz                      2018.4                   py36_0  
pywavelets                0.5.2            py36he602eb0_0  
pyyaml                    3.12             py36hafb9ca4_1  
pyzmq                     17.0.0           py36h14c3975_0  
qt                        5.9.5                h7e424d6_0  
qtawesome                 0.4.4            py36h609ed8c_0  
qtconsole                 4.3.1            py36h8f73b5b_0  
qtpy                      1.4.2                    py36_0  
readline                  7.0                  ha6073c6_4  
requests                  2.18.4           py36he2e5f8d_1  
requests-toolbelt         0.8.0                     <pip>
rope                      0.10.7           py36h147e2ec_0  
ruamel_yaml               0.15.37          py36h14c3975_2  
runtests                  0.0.24                    <pip>
runtests                  0.0.26                   py36_0    bccp
scikit-image              0.13.1           py36h14c3975_1  
scikit-learn              0.19.1           py36h7aa7ec6_0  
scipy                     1.1.0            py36hfc37229_0  
seaborn                   0.8.1            py36hfad7ec4_0  
send2trash                1.5.0                    py36_0  
setuptools                38.2.3                    <pip>
setuptools                39.2.0                   py36_0  
sharedmem                 0.3.5                     <pip>
simplegeneric             0.8.1                    py36_2  
singledispatch            3.4.0.3          py36h7a266c3_0  
sip                       4.19.8           py36hf484d3e_0  
six                       1.11.0           py36h372c433_1  
smmap2                    2.0.3                    py36_0  
snappy                    1.1.7                hbae5bb6_3  
snowballstemmer           1.2.1            py36h6febd40_0  
sortedcollections         1.0.1                    py36_0  
sortedcontainers          2.0.3                    py36_0  
sphinx                    1.7.5                    py36_0  
sphinx-bootstrap-theme    0.6.5                     <pip>
sphinxcontrib             1.0              py36h6d0f590_1  
sphinxcontrib-websupport  1.0.1            py36hb5cb234_1  
spyder                    3.2.8                    py36_0  
sqlalchemy                1.2.8            py36h14c3975_0  
sqlite                    3.23.1               he433501_0  
statsmodels               0.9.0            py36h3010b51_0  
sympy                     1.1.1            py36hc6d1c1c_0  
tbb                       2018.0.4             h6bb024c_1  
tblib                     1.3.2            py36h34cf8b6_0  
terminado                 0.8.1                    py36_1  
testpath                  0.3.1            py36h8cadb63_0  
tk                        8.6.7                hc745277_3  
toolz                     0.9.0                    py36_0  
tornado                   5.0.2                    py36_0  
tqdm                      4.19.4                    <pip>
traitlets                 4.3.2            py36h674d592_0  
traits                    4.6.0                    py36_0  
traitsui                  6.0.0                    py36_1  
twine                     1.9.1                     <pip>
typing                    3.6.4                    py36_0  
unicodecsv                0.14.1           py36ha668878_0  
Unidecode                 1.0.22                    <pip>
unixodbc                  2.3.6                h1bed415_0  
urllib3                   1.22             py36hbe7ace6_0  
vtk                       8.1.0           py36h9686630_201  
watchdog                  0.8.3                     <pip>
wcwidth                   0.1.7            py36hdf4376a_0  
webassets                 0.12.1                    <pip>
webencodings              0.5.1            py36h800622e_1  
werkzeug                  0.14.1                   py36_0  
wheel                     0.31.1                   py36_0  
widgetsnbextension        3.2.1                    py36_0  
wrapt                     1.10.11          py36h28b7045_0  
ws4py                     0.5.1                    py36_0  
xarray                    0.10.6                   py36_0  
xlrd                      1.1.0            py36h1db9f0c_1  
xlsxwriter                1.0.5                    py36_0  
xlwt                      1.3.0            py36h7b00a1f_0  
xz                        5.2.4                h14c3975_4  
yaml                      0.1.7                had09818_2  
Yapsy                     1.11.223                  <pip>
zeromq                    4.2.5                h439df22_0  
zict                      0.1.3            py36h3a3bf81_0  
zlib                      1.2.11               ha838bed_2  
zope                      1.0                      py36_0  
zope.interface            4.5.0            py36h14c3975_0

aphearin commented 6 years ago

@rainwoodman - since there is a perfectly good workaround, can you suggest how to make this explicit call to the garbage collector more discoverable in the documentation? Resolving this memory leak could be quite involved, and most of my effort on Halotools now is geared towards Issue #825, so that Halotools scales with the number of available nodes.

rainwoodman commented 6 years ago

It's alright. I will give this a go. I may need to introduce some weak references. I'll comment on #825.
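
As an illustration of the weak-reference idea, here is a minimal sketch under the assumption that the cycle is a back-reference from a mock to its owning model. The `Model` and `Mock` classes and the `_model_ref` attribute are hypothetical and are not the Halotools data model. Holding the back-reference weakly means no cycle is formed, so plain reference counting frees both objects without any call to `gc.collect()`.

```python
import gc
import weakref


class Mock:
    """Hypothetical mock that refers back to its model only weakly."""

    def __init__(self, model):
        self._model_ref = weakref.ref(model)   # weak back-reference, no cycle

    @property
    def model(self):
        model = self._model_ref()
        if model is None:
            raise ReferenceError("the owning model no longer exists")
        return model


class Model:
    """Hypothetical model that owns its mock through a strong reference."""

    def populate(self):
        self.mock = Mock(self)


def build():
    model = Model()
    model.populate()
    assert model.mock.model is model           # the back-reference still works
    # Return only weak probes so nothing keeps the objects alive.
    return weakref.ref(model), weakref.ref(model.mock)


gc.disable()                                   # rule out the cycle collector
model_probe, mock_probe = build()
freed_without_gc = model_probe() is None and mock_probe() is None
gc.enable()

print(freed_without_gc)
```

The trade-off is that the mock can no longer keep its model alive, so code that holds only the mock has to handle the case where the model has already been released.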

rainwoodman commented 6 years ago

I can confirm that updating astropy to the master branch (which includes 6277) fixed the bulk of the memory leak.

167496
167656
167656
167656
167656
167732
167732
167732

#918 further reduced this to a more modest growth:

169148
169148
169148
169148
169148
169148
169148
169148
169148

johannesulf commented 6 years ago

I recently ran a test with Python 3.6, numpy 1.14.5, astropy 3.0.4 (which includes 6277), halotools 0.6 and the script above. Even without #918, I don't see a memory leak. However, there seems to be a consistent memory leak with numpy 1.15, even when calling gc.collect() and with #918 applied. With numpy 1.14.5 there doesn't seem to be a problem. It's probably unrelated to the commit here and may not be caused by halotools directly, but I wanted to mention it because it's related.
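
When comparing library versions like this, note that `ru_maxrss` is a high-water mark: it can only stay flat or rise, so it cannot show memory being returned. One possible complement, sketched here with the standard-library `tracemalloc` module, is to record the currently traced memory after each iteration. The helper name `traced_growth` and the two toy step functions are hypothetical; in practice `step` could be something like `model.mock.populate`. Allocations that a C extension makes outside Python's allocators may not appear in these readings.

```python
import gc
import tracemalloc


def traced_growth(step, n_iter=20, collect=True):
    """Call ``step`` repeatedly and return traced memory (bytes) after each call."""
    tracemalloc.start()
    try:
        readings = []
        for _ in range(n_iter):
            step()
            if collect:
                gc.collect()              # rule out garbage that is merely uncollected
            current, _peak = tracemalloc.get_traced_memory()
            readings.append(current)
        return readings
    finally:
        tracemalloc.stop()


_retained = []


def leaky_step():
    _retained.append(bytearray(100_000))  # deliberately kept alive


def clean_step():
    bytearray(100_000)                    # released as soon as the call returns


leaky = traced_growth(leaky_step)
clean = traced_growth(clean_step)
print(leaky[-1] - leaky[0], clean[-1] - clean[0])
```

A steadily rising series with `collect=True` points to objects that are still reachable, whereas growth that disappears once collection is forced points to reference cycles.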