pomalley closed this issue 5 years ago.
bump: @amitsv1 pointed out a numpy issue that may be the cause of the problem, namely a memory leak in numpy.loadtxt: https://github.com/numpy/numpy/issues/651

I have a branch that eliminates the call to numpy.loadtxt, but for now I am unable to repro the issue: https://github.com/martinisgroup/servers/tree/u/maffoo/fix-datavault-leak
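For reference, a minimal repro sketch in the spirit of numpy/numpy#651 (the file name and array size here are illustrative, not taken from the actual report):

import gc
import numpy as np

def rss_kb():
    # Linux-only: read the resident set size (in kB) from /proc/self/status
    with open('/proc/self/status') as f:
        for line in f:
            if line.startswith('VmRSS'):
                return int(line.split()[1])

np.savetxt('array.txt', np.random.rand(1000000, 5))  # write a large text file

before = rss_kb()
arr = np.loadtxt('array.txt')
del arr
gc.collect()
after = rss_kb()
# On affected numpy/Python 2 combinations, `after` stays far above `before`
# even though the array has been freed.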
I'm able to reproduce the results in the top comment of numpy/numpy#651 (RSS is ~2GB after del arr and gc.collect). Replacing the loadtxt with your code in fix-datavault-leak also results in leaked memory (500MB RSS after doing del arr and collect... different dtype?). My desktop is running Python 2.7.6-8ubuntu0.2 and numpy 1:1.8.2-0ubuntu0.1 on Ubuntu 14.04.3 (identical software configuration to skynet). I have not tried reproducing the leak within the data vault code.
I pushed a branch u/ejeffrey/dv_log_memory that adds log statements about the virtual size and resident size every time a dataset is opened/created/closed. This should let us see when the offending memory allocation happens. Until we have better logging (see pylabrad issue https://github.com/labrad/pylabrad/issues/156), just run the datavault from the command line with the --auto parameter and redirect output to a logfile somewhere:
$ data_vault_multihead.py --auto > /log/file.txt
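The branch itself isn't reproduced here, but instrumentation of that kind could look roughly like this (a sketch; the log_memory helper and its call sites are assumptions, not the actual datavault code):

import logging

def memory_kb():
    # Linux-only: return (virtual, resident) sizes in kB from /proc/self/status
    sizes = {}
    with open('/proc/self/status') as f:
        for line in f:
            if line.startswith(('VmSize', 'VmRSS')):
                key, value = line.split()[:2]
                sizes[key.rstrip(':')] = int(value)
    return sizes.get('VmSize'), sizes.get('VmRSS')

def log_memory(event, dataset):
    virt, res = memory_kb()
    logging.info('%s %s: VIRT=%s kB RES=%s kB', event, dataset, virt, res)

# called at the relevant points, e.g.:
# log_memory('opened', name); log_memory('closed', name)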
So I did some tests on matrix-reloaded and was able to reproduce the issue from the numpy bug. Then I tried the following two functions to load the array.txt file created as in that issue (I converted it to floats since that's what we use in the datavault, but it doesn't make much difference either way):
import numpy as np

def load1(fname):
    # each row is a plain list of floats; vstack sees no numpy temporaries
    with open(fname) as f:
        return np.vstack([float(n) for n in line.split(' ')]
                         for line in f.xreadlines())

def load2(fname):
    # same, but wraps each row in a temporary numpy array first
    with open(fname) as f:
        return np.vstack(np.array([float(n) for n in line.split(' ')])
                         for line in f.xreadlines())
Here are the results of some interactive sessions, with results from top in the comments:
# PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
# 22472 maffoo 20 0 190472 25600 6152 S 0.0 0.0 0:27.27 ipython
In [1]: a = load1('array.txt')
# 22472 maffoo 20 0 581120 416248 6152 S 0.0 0.2 0:36.60 ipython
In [2]: del a
# 22472 maffoo 20 0 190492 25620 6152 S 0.0 0.0 0:36.60 ipython
In [3]: a = load2('array.txt')
# 22472 maffoo 20 0 971736 806860 6152 S 0.0 0.3 0:45.65 ipython
In [4]: del a
# 22472 maffoo 20 0 581108 416232 6152 S 0.0 0.2 0:45.65 ipython
In [5]: import gc; gc.collect()
Out[5]: 0
# 22472 maffoo 20 0 190492 25620 6152 S 0.0 0.0 0:45.70 ipython
There's obviously a lot of extra garbage created by load2, presumably due to the temporary numpy arrays created for each row. However, in neither case does there appear to be a leak, as forcing a gc.collect() gets us back to the original memory usage. I will modify the code in the u/maffoo/fix-datavault-leak branch to avoid creating these temporary arrays.
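A sketch of what that change could look like (load3 is illustrative, not the code actually committed to the branch): accumulate plain floats in one flat list and allocate a single array at the end, so no per-row numpy temporaries are created.

import numpy as np

def load3(fname):
    # flat list of floats, one final allocation, no per-row arrays
    data = []
    ncols = None
    with open(fname) as f:
        for line in f:
            row = [float(n) for n in line.split(' ')]
            if ncols is None:
                ncols = len(row)
            data.extend(row)
    return np.array(data).reshape(-1, ncols)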
Closing since this particular leak seems to be fixed.