markovmodel / PyEMMA

🚂 Python API for Emma's Markov Model Algorithms 🚂
http://pyemma.org
GNU Lesser General Public License v3.0

Using n_jobs>1 in cktest() does not work on (my) OSX #1310

Closed cwehmeyer closed 6 years ago

cwehmeyer commented 6 years ago

I'm using pyemma-2.5.2 on macOS High Sierra (10.13.4) and tried to run `cktest()` for a Bayesian HMM. With the default parameter `n_jobs=1`, everything works fine. With `n_jobs=2`, however, an `OSError: [Errno 24] Too many open files` is raised (see the traceback for details).

Traceback

```Python traceback
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
 in ()
----> 1 pyemma.plots.plot_cktest(hmm_6.cktest(mlags=2, n_jobs=2));

~/Library/miniconda3/lib/python3.6/site-packages/pyemma/msm/estimators/maximum_likelihood_hmsm.py in cktest(self, mlags, conf, err_est, n_jobs, show_progress)
    670                 mlags=mlags, conf=conf, err_est=err_est,
    671                 n_jobs=n_jobs, show_progress=show_progress)
--> 672         ck.estimate(self._dtrajs_full)
    673         return ck

~/Library/miniconda3/lib/python3.6/site-packages/pyemma/_base/estimator.py in estimate(self, X, **params)
    410         if params:
    411             self.set_params(**params)
--> 412         self._model = self._estimate(X)
    413         # ensure _estimate returned something
    414         assert self._model is not None

~/Library/miniconda3/lib/python3.6/site-packages/pyemma/msm/estimators/lagged_model_validators.py in _estimate(self, data)
    142         estimated_models, estimators = \
    143             estimate_param_scan(self.test_estimator, data, pargrid, return_estimators=True, failfast=False,
--> 144                                 progress_reporter=progress_reporter, n_jobs=self.n_jobs)
    145         if include0:
    146             estimated_models = [None] + estimated_models

~/Library/miniconda3/lib/python3.6/site-packages/pyemma/_base/estimator.py in estimate_param_scan(estimator, X, param_sets, evaluate, evaluate_args, failfast, return_estimators, n_jobs, progress_reporter, show_progress, return_exceptions)
    317
    318         from pathos.multiprocessing import Pool as Parallel
--> 319         pool = Parallel(processes=n_jobs)
    320         args = list(task_iter)
    321         if show_progress:

~/Library/miniconda3/lib/python3.6/site-packages/multiprocess/pool.py in __init__(self, processes, initializer, initargs, maxtasksperchild, context)
    148                  maxtasksperchild=None, context=None):
    149         self._ctx = context or get_context()
--> 150         self._setup_queues()
    151         self._taskqueue = queue.Queue()
    152         self._cache = {}

~/Library/miniconda3/lib/python3.6/site-packages/multiprocess/pool.py in _setup_queues(self)
    241
    242     def _setup_queues(self):
--> 243         self._inqueue = self._ctx.SimpleQueue()
    244         self._outqueue = self._ctx.SimpleQueue()
    245         self._quick_put = self._inqueue._writer.send

~/Library/miniconda3/lib/python3.6/site-packages/multiprocess/context.py in SimpleQueue(self)
    109         '''Returns a queue object'''
    110         from .queues import SimpleQueue
--> 111         return SimpleQueue(ctx=self.get_context())
    112
    113     def Pool(self, processes=None, initializer=None, initargs=(),

~/Library/miniconda3/lib/python3.6/site-packages/multiprocess/queues.py in __init__(self, ctx)
    324
    325     def __init__(self, *, ctx):
--> 326         self._reader, self._writer = connection.Pipe(duplex=False)
    327         self._rlock = ctx.Lock()
    328         self._poll = self._reader.poll

~/Library/miniconda3/lib/python3.6/site-packages/multiprocess/connection.py in Pipe(duplex)
    513         c2 = Connection(s2.detach())
    514     else:
--> 515         fd1, fd2 = os.pipe()
    516         c1 = Connection(fd1, writable=False)
    517         c2 = Connection(fd2, readable=False)

OSError: [Errno 24] Too many open files
```
My conda environment

```
# Name Version Build Channel
absl-py 0.1.10 py_0 conda-forge
appnope 0.1.0 py36_0 conda-forge
asn1crypto 0.24.0 py36_0
astor 0.6.2 py_0 conda-forge
backcall 0.1.0 py_0 conda-forge
bhmm 0.6.2 py36_1 conda-forge
bleach 2.1.3 py_0 conda-forge
blosc 1.14.0 1 conda-forge
bzip2 1.0.6 1 conda-forge
ca-certificates 2018.4.16 0 conda-forge
certifi 2018.4.16 py36_0 conda-forge
cffi 1.11.5 py36h342bebf_0
chardet 3.0.4 py36h96c241c_1
conda 4.5.4 py36_0 conda-forge
conda-env 2.6.0 h36134e3_0
cryptography 2.2.2 py36h1de35cc_0
cycler 0.10.0 py36_0 conda-forge
cython 0.28.2 py36_0 conda-forge
decorator 4.3.0 py_0 conda-forge
deeptime 0.1.4.dev15+g1b260f1
dill 0.2.7.1 py36_0 conda-forge
entrypoints 0.2.3 py36_1 conda-forge
freetype 2.8.1 0 conda-forge
gast 0.2.0 py_0 conda-forge
grpcio 1.11.0 py36hd9629dc_0
h5py 2.8.0 py36h470a237_0 conda-forge
hdf5 1.10.1 2 conda-forge
html5lib 1.0.1 py_0 conda-forge
humanfriendly 4.12.1 py_0 conda-forge
icu 58.2 0 conda-forge
idna 2.6 py36h8628d0a_1
intel-openmp 2018.0.0 8
ipykernel 4.8.2 py36_0 conda-forge
ipython 6.4.0 py36_0 conda-forge
ipython_genutils 0.2.0 py36_0 conda-forge
ipywidgets 7.2.1 py36_1 conda-forge
jedi 0.12.0 py36_0 conda-forge
jinja2 2.10 py36_0 conda-forge
jpeg 9b 2 conda-forge
jsonschema 2.6.0 py36_1 conda-forge
jupyter 1.0.0 py_1 conda-forge
jupyter_client 5.2.3 py36_0 conda-forge
jupyter_console 5.2.0 py36_0 conda-forge
jupyter_contrib_core 0.3.3 py36_1 conda-forge
jupyter_contrib_nbextensions 0.5.0 py36_0 conda-forge
jupyter_core 4.4.0 py_0 conda-forge
jupyter_highlight_selected_word 0.2.0 py36_0 conda-forge
jupyter_latex_envs 1.4.4 py36_0 conda-forge
jupyter_nbextensions_configurator 0.4.0 py36_0 conda-forge
kiwisolver 1.0.1 py36_1 conda-forge
libcxx 4.0.1 h579ed51_0
libcxxabi 4.0.1 hebd6815_0
libedit 3.1 hb4e282d_0
libffi 3.2.1 h475c297_4
libgfortran 3.0.1 h93005f0_2
libiconv 1.15 0 conda-forge
libpng 1.6.34 0 conda-forge
libprotobuf 3.5.2 0 conda-forge
libsodium 1.0.16 0 conda-forge
libtiff 4.0.9 0 conda-forge
libxml2 2.9.8 0 conda-forge
libxslt 1.1.32 0 conda-forge
llvmlite 0.23.0 py36_1 conda-forge
lxml 4.2.1 py36_0 conda-forge
markdown 2.6.11 py_0 conda-forge
markupsafe 1.0 py36_0 conda-forge
matplotlib 2.2.2 py36_1 conda-forge
mdshare 0.3.2 py36_0 conda-forge
mdtraj 1.9.1 py36_1 conda-forge
mistune 0.8.3 py36_1 conda-forge
mkl 2018.0.2 1
mkl_fft 1.0.2 py36_0 conda-forge
mkl_random 1.0.1 py36_0 conda-forge
mock 2.0.0 py36_0 conda-forge
msmtools 1.2.1 py36_1 conda-forge
multiprocess 0.70.5 py36_0 conda-forge
nbconvert 5.3.1 py_1 conda-forge
nbformat 4.4.0 py36_0 conda-forge
ncurses 6.0 hd04f020_2
nglview 1.1.3 py_1 conda-forge
ninja 1.8.2 h2d50403_1 conda-forge
notebook 5.5.0 py36_0 conda-forge
numba 0.38.0 py36_0 conda-forge
numexpr 2.6.5 py36_0 conda-forge
numpy 1.13.3 py36ha9ae307_4
olefile 0.45.1 py36_0 conda-forge
openssl 1.0.2o 0 conda-forge
pandas 0.23.0 py36_1 conda-forge
pandoc 2.2.1 hde52d81_0 conda-forge
pandocfilters 1.4.2 py36_0 conda-forge
parso 0.2.1 py_0 conda-forge
pathos 0.2.1 py36_1 conda-forge
patsy 0.5.0 py36_0 conda-forge
pbr 4.0.3 py_0 conda-forge
pexpect 4.5.0 py36_0 conda-forge
pickleshare 0.7.4 py36_0 conda-forge
pillow 5.1.0 py36_0 conda-forge
pip 9.0.3 py36_0 conda-forge
pox 0.2.3 py36_0 conda-forge
ppft 1.6.4.7.1 py36_0 conda-forge
prompt_toolkit 1.0.15 py36_0 conda-forge
protobuf 3.5.2 py36_0 conda-forge
psutil 5.4.5 py36_0 conda-forge
ptyprocess 0.5.2 py36_0 conda-forge
pycosat 0.6.3 py36hee92d8f_0
pycparser 2.18 py36h724b2fc_1
pyemma 2.5.2 py36_1 conda-forge
pygments 2.2.0 py36_0 conda-forge
pyopenssl 17.5.0 py36h51e4350_0
pyparsing 2.2.0 py36_0 conda-forge
pyqt 5.6.0 py36_5 conda-forge
pysocks 1.6.8 py36_0
pytables 3.4.3 py36_8 conda-forge
python 3.6.5 hc167b69_0
python-dateutil 2.7.3 py_0 conda-forge
python-dateutil 2.7.3
python.app 2 py36_8
pytorch 0.4.0 py36_cuda0.0_cudnn0.0_1 pytorch
pytz 2018.4 py_0 conda-forge
pyyaml 3.12 py36_1 conda-forge
pyzmq 17.0.0 py36_4 conda-forge
qt 5.6.2 h9e3eb04_4 conda-forge
qtconsole 4.3.1 py36_0 conda-forge
readline 7.0 hc1231fa_4
requests 2.18.4 py36h4516966_1
ruamel_yaml 0.15.35 py36h1de35cc_1
scipy 1.1.0 py36hcaad992_0
seaborn 0.8.1 py36_0 conda-forge
send2trash 1.5.0 py_0 conda-forge
setuptools 39.0.1 py36_0
simplegeneric 0.8.1 py36_0 conda-forge
sip 4.18 py36_1 conda-forge
six 1.11.0 py36h0e22d5e_1
sqlite 3.23.1 hf1716c9_0
statsmodels 0.9.0 py36_0 conda-forge
tensorboard 1.8.0
tensorflow 1.8.0
termcolor 1.1.0 py36_1 conda-forge
terminado 0.8.1 py36_0 conda-forge
testpath 0.3.1 py36_0 conda-forge
thermotools 0.2.6 py36_2 conda-forge
tk 8.6.7 h35a86e2_3
torchvision 0.2.1 py36_1 pytorch
tornado 5.0.2 py36_0 conda-forge
tqdm 4.22.0 py_0 conda-forge
traitlets 4.3.2 py36_0 conda-forge
urllib3 1.22 py36h68b9469_0
wcwidth 0.1.7 py36_0 conda-forge
webencodings 0.5.1 py36_0 conda-forge
werkzeug 0.14.1 py_0 conda-forge
wheel 0.31.0 py36_0
widgetsnbextension 3.2.1 py36_0 conda-forge
xz 5.2.3 h727817e_4
yaml 0.1.7 hc338f04_2
zeromq 4.2.5 1 conda-forge
zlib 1.2.11 hf3cbc9b_2
```

marscher commented 6 years ago

I assume that you have already restarted the OS and re-run this particular script? Otherwise we have no way of telling whether this is really caused by running multiple jobs. Only two new file descriptors are obviously opened at this point in the code, so that alone should not be too many IMHO.

Or, before rebooting, check how many descriptors are open before and after starting multiple processes.

https://stackoverflow.com/questions/20974438/get-list-of-open-files-descriptors-in-os-x
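The before/after check can also be done from inside Python. The following is only a rough sketch, not part of PyEMMA: the helper names `count_open_fds` and `fd_report` are made up for illustration, and it assumes `/dev/fd` is available (which is the case on macOS and Linux). It counts the descriptors of the current process around a single `os.pipe()` call, the same call that fails at the bottom of the traceback.

```python
import os
import resource


def count_open_fds():
    """Count file descriptors currently open in this process.

    Assumption: /dev/fd exists (macOS and Linux). The listing itself
    briefly opens one descriptor, so compare differences between two
    calls rather than reading the absolute number too literally.
    """
    return len(os.listdir("/dev/fd"))


def fd_report():
    """Return the current descriptor count together with the per-process limits."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return {"open": count_open_fds(), "soft_limit": soft, "hard_limit": hard}


before = count_open_fds()
read_fd, write_fd = os.pipe()  # one pipe costs two descriptors
after = count_open_fds()
print(f"open before: {before}, after one pipe: {after}")
print(fd_report())

# Release the descriptors again so the count returns to its previous value.
os.close(read_fd)
os.close(write_fd)
```

If the "open" number is already close to "soft_limit" before any pool is created, the pool is probably not the culprit and something else in the session is holding descriptors.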

Maybe this is helpful as well (increases max fd): https://superuser.com/questions/302754/increase-the-maximum-number-of-open-file-descriptors-in-snow-leopard
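As an alternative to changing the system-wide setting, the soft limit can be raised for the current Python process only. This is a hedged sketch using the standard-library `resource` module; the helper name `raise_fd_soft_limit` and the target of 4096 are arbitrary choices for illustration, not a PyEMMA recommendation. On macOS very large targets may be rejected by the kernel even when the hard limit is reported as unlimited, so a moderate value is used here.

```python
import resource


def raise_fd_soft_limit(target=4096):
    """Raise the soft limit on open files for this process, if needed.

    Returns the soft limit in effect afterwards. The hard limit is
    left untouched, and the target is capped at the hard limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # Already unlimited: nothing to do.
    if soft == resource.RLIM_INFINITY:
        return soft

    # An unprivileged process cannot exceed the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)

    # Never lower an already sufficient limit.
    if soft >= target:
        return soft

    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


print("soft limit now:", raise_fd_soft_limit())
```

The change applies only to the calling process and the children it starts afterwards, so it would have to run before the worker pool is created.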

cwehmeyer commented 6 years ago

Looks like I used too many resources on my machine. Now it works, thanks!

marscher commented 6 years ago

Thanks for the feedback!