MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org
Other
1.32k stars 652 forks source link

files handles aren't closed if universe construction fails. #875

Open kain88-de opened 8 years ago

kain88-de commented 8 years ago

Expected behaviour

File handles are closed if universe construction failed.

Actual behaviour

File handles are still open.

Code to reproduce the behaviour

import MDAnalysis as mda
from MDAnalysisTests.datafiles import PSF, XTC
import os
import psutil

p = psutil.Process(os.getpid())
# will throw an exception
u = mda.Universe(PSF, XTC)

# still contains the XTC as open
for f in p.open_files(): print f.path

Currently version of MDAnalysis:

(run python -c "import MDAnalysis as mda; print(mda.__version__)")

0.15.0

kain88-de commented 8 years ago

This should be fixed AFTER #815 or we might accidentally reintroduce the bug

jbarnoud commented 8 years ago

For the record, the discussion started in #853 and #874.

kain88-de commented 7 years ago

Btw this isn't fixed in the current develop branch.

kain88-de commented 7 years ago

Turns out this wasn't solved with the new topology system. The problem is just less apparent.

kain88-de commented 7 years ago

See https://travis-ci.org/MDAnalysis/mdanalysis/jobs/190615341#L1691

jbarnoud commented 7 years ago

Strangely enough, the open files are consistently the same 267 file handler. The number and the files are the same on travis and on my computer.

orbeckst commented 7 years ago

With 0.16.2-dev I tried running gc.collect() and the open file disappeared:

In [1]: import MDAnalysis as mda
   ...: from MDAnalysisTests.datafiles import PSF, XTC
   ...: import os
   ...: import psutil
   ...:
   ...: p = psutil.Process(os.getpid())
   ...:

In [2]: p
Out[2]: <psutil.Process(pid=16963, name='python') at 4548696208>

In [3]: p.open_files()
Out[3]:
[popenfile(path='/Users/oliver/.ipython/profile_default/history.sqlite', fd=4),
 popenfile(path='/Users/oliver/.ipython/profile_default/history.sqlite', fd=5)]

In [4]: u = mda.Universe(PSF, XTC)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-f147e4edeffd> in <module>()
----> 1 u = mda.Universe(PSF, XTC)

/Users/oliver/.virtualenvs/mdanalysis/develop/lib/python2.7/site-packages/MDAnalysis/core/universe.pyc in __init__(self, *args, **kwargs)
    276             else:
    277                 coordinatefile = args[1:]
--> 278             self.load_new(coordinatefile, **kwargs)
    279
    280         # Check for guess_bonds

/Users/oliver/.virtualenvs/mdanalysis/develop/lib/python2.7/site-packages/MDAnalysis/core/universe.pyc in load_new(self, filename, format, in_memory, **kwargs)
    433                                  top_n_atoms=len(self.atoms),
    434                                  fname=filename,
--> 435                                  trj_n_atoms=self.trajectory.n_atoms))
    436
    437         if in_memory:

ValueError: The topology and XTC trajectory files don't have the same number of atoms!
Topology number of atoms 3341
Trajectory: /Users/oliver/.virtualenvs/mdanalysis/develop/lib/python2.7/site-packages/MDAnalysisTests/data/adk_oplsaa.xtc Number of atoms 47681

In [5]: p.open_files()
Out[5]:
[popenfile(path='/Users/oliver/.ipython/profile_default/history.sqlite', fd=4),
 popenfile(path='/Users/oliver/.ipython/profile_default/history.sqlite', fd=5),
 popenfile(path='/Users/oliver/.virtualenvs/mdanalysis/develop/lib/python2.7/site-packages/MDAnalysisTests/data/adk_oplsaa.xtc', fd=11)]

In [6]: u
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-6-e85dde330c34> in <module>()
----> 1 u

NameError: name 'u' is not defined

In [10]: import gc

In [11]: gc.collect()
Out[11]: 459

In [12]: p.open_files()
Out[12]:
[popenfile(path='/Users/oliver/.ipython/profile_default/history.sqlite', fd=4),
 popenfile(path='/Users/oliver/.ipython/profile_default/history.sqlite', fd=5)]

Does this mean that this is not really a bug anymore?

kain88-de commented 7 years ago

We might be better now. I also have a feeling that the way we use nosetests may cause memory leaks. I noticed that when @tylerjereddy wrote the first tests for libdcd we did leak filehandles. After I switched all tests to use the context manager the leaks vanished. This was even though he did use explicit calls to delete in the tests.

jbarnoud commented 7 years ago

On 06/14/2017 09:46 AM, Max Linke wrote:

We might be better now. I also have a feeling that the way we use nosetests may cause memory leaks. I noticed that when @tylerjereddy https://github.com/tylerjereddy wrote the first tests for libdcd we did leak filehandles. After I switched all tests to use the context manager the leaks vanished. This was even though he did use explicit calls to delete in the tests.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/MDAnalysis/mdanalysis/issues/875#issuecomment-308348796, or mute the thread https://github.com/notifications/unsubscribe-auth/ABUWumiG9xYi2vw4kKSOnzsHRH4crVJ2ks5sD4-_gaJpZM4Iz0_E.

When I was investigating the question some time ago, I noticed quite a lot of object kept alive by nose, indeed. Nose creates an object per test, in some cases the object was keeping references to atom groups or similar objects that were themselves keeping references to universes, even when we deleted the universe in the tear down.

orbeckst commented 6 years ago

I tested with develop (pre 0.19.0) and the error still persists (on macOS, Python 3.5):

In [1]: import MDAnalysis as mda
   ...: from MDAnalysisTests.datafiles import PSF, XTC
   ...: import os
   ...: import psutil
   ...:

In [2]: p = psutil.Process(os.getpid())
   ...: # will throw an exception
   ...: u = mda.Universe(PSF, XTC)
   ...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-aade24c0e039> in <module>()
      1 p = psutil.Process(os.getpid())
      2 # will throw an exception
----> 3 u = mda.Universe(PSF, XTC)

~/anaconda3/lib/python3.5/site-packages/MDAnalysis/core/universe.py in __init__(self, *args, **kwargs)
    332                 coordinatefile = None
    333
--> 334             self.load_new(coordinatefile, **kwargs)
    335             # parse transformations
    336             trans_arg = kwargs.pop('transformations', None)

~/anaconda3/lib/python3.5/site-packages/MDAnalysis/core/universe.py in load_new(self, filename, format, in_memory, **kwargs)
    594                                  top_n_atoms=len(self.atoms),
    595                                  fname=filename,
--> 596                                  trj_n_atoms=self.trajectory.n_atoms))
    597
    598         if in_memory:

ValueError: The topology and XTC trajectory files don't have the same number of atoms!
Topology number of atoms 3341
Trajectory: /Users/oliver/anaconda3/lib/python3.5/site-packages/MDAnalysisTests/data/adk_oplsaa.xtc Number of atoms 47681

In [4]: for f in p.open_files(): print(f.path)
/Users/oliver/.ipython/profile_default/history.sqlite
/Users/oliver/.ipython/profile_default/history.sqlite
/Users/oliver/anaconda3/lib/python3.5/site-packages/MDAnalysisTests/data/adk_oplsaa.xtc

In [5]: import gc

In [6]: gc.collect()
Out[6]: 2732

In [7]: for f in p.open_files(): print(f.path)
/Users/oliver/.ipython/profile_default/history.sqlite
/Users/oliver/.ipython/profile_default/history.sqlite
/Users/oliver/anaconda3/lib/python3.5/site-packages/MDAnalysisTests/data/adk_oplsaa.xtc

Running gc.collect() did not clean up the open trajectory.

I also tried Universe(TPR, DCD) and Universe(TPR, XYZ) and this leaves the dcd and xyz file open, so it's likely a generic problem.

Somewhat confusingly, the open XTC file disappeared and the fd=12 of the XTC file was now being reused for the XYZ:

>>> p.open_files()
[popenfile(path='/Users/oliver/.ipython/profile_default/history.sqlite', fd=5),
 popenfile(path='/Users/oliver/.ipython/profile_default/history.sqlite', fd=6),
 popenfile(path='/Users/oliver/anaconda3/lib/python3.5/site-packages/MDAnalysisTests/data/adk_oplsaa.xtc', fd=12),
 popenfile(path='/Users/oliver/anaconda3/lib/python3.5/site-packages/MDAnalysisTests/data/adk_dims.dcd', fd=13)]

>>> u = mda.Universe(TPR, XYZ)
# fails ...

>>> p.open_files()
popenfile(path='/Users/oliver/anaconda3/lib/python3.5/site-packages/MDAnalysisTests/data/2r9r-1b.xyz', fd=12),
 popenfile(path='/Users/oliver/anaconda3/lib/python3.5/site-packages/MDAnalysisTests/data/adk_dims.dcd', fd=13)]

Then running gc.collect() cleaned up even further

In [19]: gc.collect()
Out[19]: 722

In [20]: p.open_files()
Out[20]:
[popenfile(path='/Users/oliver/.ipython/profile_default/history.sqlite', fd=5),
 popenfile(path='/Users/oliver/.ipython/profile_default/history.sqlite', fd=6),
 popenfile(path='/Users/oliver/anaconda3/lib/python3.5/site-packages/MDAnalysisTests/data/2r9r-1b.xyz', fd=12)]

I assume that there are some issues for how we close readers (maybe the __del__ method) and perhaps some circular references that make it difficult for the Reader to be properly disposed of.