choderalab / msm-pipeline

A pipeline for MSMs.
GNU Lesser General Public License v3.0

Error with parallelization #42

Open sonyahanson opened 6 years ago

sonyahanson commented 6 years ago

Saw this happen on a run using the standard submit script on hal. A job using the same script ran successfully just days before. (Not urgent.)

Running pipeline
Traceback (most recent call last):
  File "/cbio/jclab/home/hansons/opt/anaconda/bin/msm-pipeline", line 9, in <module>
    load_entry_point('msmpipeline==0.0.1', 'console_scripts', 'msm-pipeline')()
  File "/cbio/jclab/home/hansons/opt/anaconda/lib/python2.7/site-packages/msmpipeline-0.0.1-py2.7.egg/msmpipeline/msmpipeline.py", line 246, in main
    run_pipeline(fnames, project_name = options.project_name, n_clusters = options.n_clusters, hmm_iter = options.hmm_iter, feature_selection = options.feature_selection)
  File "/cbio/jclab/home/hansons/opt/anaconda/lib/python2.7/site-packages/msmpipeline-0.0.1-py2.7.egg/msmpipeline/msmpipeline.py", line 81, in run_pipeline
    respairs_that_changed = find_respairs_that_changed(fnames, scheme=scheme)
  File "/cbio/jclab/home/hansons/opt/anaconda/lib/python2.7/site-packages/msmpipeline-0.0.1-py2.7.egg/msmpipeline/contact_features.py", line 69, in find_respairs_that_changed
    distances = pool.map(get_distances_, fnames)
  File "/cbio/jclab/home/hansons/opt/anaconda/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/cbio/jclab/home/hansons/opt/anaconda/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
tables.exceptions.NoSuchNodeError: group ``/`` does not have a child named ``coordinates``

jchodera commented 6 years ago

Can you move that code into a try...except block that (1) prints out the relevant HDF5 file, and (2) drops it from the dataset to continue processing?

Without (1), I can't debug the problem.
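
A minimal sketch of what such a try...except wrapper might look like, assuming the per-file work is a function like the `get_distances_` call in the traceback. The names `_try_one` and `map_dropping_failures` are hypothetical and not part of msm-pipeline. Passing `pool.map` as `map_fn` should keep the parallel behavior, provided the per-file function is defined at module level so it can be pickled.

```python
import functools
import sys
import traceback


def _try_one(func, fname):
    """Call func(fname); on any error, report the offending file instead of raising."""
    try:
        return fname, True, func(fname)
    except Exception:
        # (1) print the relevant file together with the full traceback
        sys.stderr.write("Skipping %s due to error:\n%s\n" % (fname, traceback.format_exc()))
        return fname, False, None


def map_dropping_failures(func, fnames, map_fn=map):
    """Apply func to each file name and (2) drop the files that raised.

    map_fn can be the builtin map or a bound pool.map (hypothetical usage).
    Returns (kept_fnames, results, failed_fnames).
    """
    outcomes = list(map_fn(functools.partial(_try_one, func), fnames))
    kept = [fname for fname, ok, _ in outcomes if ok]
    results = [value for _, ok, value in outcomes if ok]
    failed = [fname for fname, ok, _ in outcomes if not ok]
    return kept, results, failed
```

In `find_respairs_that_changed`, the call might then become something like `fnames, distances, failed = map_dropping_failures(get_distances_, fnames, map_fn=pool.map)`, so that downstream steps only see the files that loaded.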