desihub / desispec

DESI spectral pipeline
BSD 3-Clause "New" or "Revised" License

Remove import of matplotlib when running core pipeline? #494

Open julienguy opened 6 years ago

julienguy commented 6 years ago

Today I am getting the following matplotlib error when running desi_pipe_run_mpi --first redshift --last redshift. We should avoid this, because the pipeline run does not need to produce plots. Importing matplotlib in the pipeline introduces unnecessary fragility. We need it for QA, but QA should run as independent jobs.


Traceback (most recent call last):
  File "/global/common/cori/contrib/desi/desiconda/20170613-1.1.4-spectro/code/desiconda/20170613-1.1.4-spectro_conda/lib/python3.5/site-packages/matplotlib/font_manager.py", line 1429, in <module>
    fontManager = pickle_load(_fmcache)
  File "/global/common/cori/contrib/desi/desiconda/20170613-1.1.4-spectro/code/desiconda/20170613-1.1.4-spectro_conda/lib/python3.5/site-packages/matplotlib/font_manager.py", line 966, in pickle_load
    data = pickle.load(fh)
EOFError: Ran out of input

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/global/common/software/desi/users/jguy/desispec/bin/desi_pipe_run_mpi", line 33, in <module>
    import desispec.scripts.pipe_run as pipe_run
  File "/global/common/software/desi/users/jguy/desispec/py/desispec/scripts/pipe_run.py", line 23, in <module>
    import desispec.pipeline as pipe
  File "/global/common/software/desi/users/jguy/desispec/py/desispec/pipeline/__init__.py", line 19, in <module>
    from .plan import (select_nights, create_prod, load_prod)
  File "/global/common/software/desi/users/jguy/desispec/py/desispec/pipeline/plan.py", line 23, in <module>
    import healpy as hp
  File "/global/common/cori/contrib/desi/desiconda/20170613-1.1.4-spectro/code/desiconda/20170613-1.1.4-spectro_aux/lib/python3.5/site-packages/healpy-1.10.3-py3.5-linux-x86_64.egg/healpy/__init__.py", line 55, in <module>
    from .visufunc import (mollview,graticule,delgraticules,gnomview,
  File "/global/common/cori/contrib/desi/desiconda/20170613-1.1.4-spectro/code/desiconda/20170613-1.1.4-spectro_aux/lib/python3.5/site-packages/healpy-1.10.3-py3.5-linux-x86_64.egg/healpy/visufunc.py", line 55, in <module>
    from . import projaxes as PA
  File "/global/common/cori/contrib/desi/desiconda/20170613-1.1.4-spectro/code/desiconda/20170613-1.1.4-spectro_aux/lib/python3.5/site-packages/healpy-1.10.3-py3.5-linux-x86_64.egg/healpy/projaxes.py", line 24, in <module>
    import matplotlib.axes
  File "/global/common/cori/contrib/desi/desiconda/20170613-1.1.4-spectro/code/desiconda/20170613-1.1.4-spectro_conda/lib/python3.5/site-packages/matplotlib/axes/__init__.py", line 4, in <module>
    from ._subplots import *
  File "/global/common/cori/contrib/desi/desiconda/20170613-1.1.4-spectro/code/desiconda/20170613-1.1.4-spectro_conda/lib/python3.5/site-packages/matplotlib/axes/_subplots.py", line 10, in <module>
    from matplotlib.axes._axes import Axes
  File "/global/common/cori/contrib/desi/desiconda/20170613-1.1.4-spectro/code/desiconda/20170613-1.1.4-spectro_conda/lib/python3.5/site-packages/matplotlib/axes/_axes.py", line 23, in <module>
    import matplotlib.contour as mcontour
  File "/global/common/cori/contrib/desi/desiconda/20170613-1.1.4-spectro/code/desiconda/20170613-1.1.4-spectro_conda/lib/python3.5/site-packages/matplotlib/contour.py", line 22, in <module>
    import matplotlib.font_manager as font_manager
  File "/global/common/cori/contrib/desi/desiconda/20170613-1.1.4-spectro/code/desiconda/20170613-1.1.4-spectro_conda/lib/python3.5/site-packages/matplotlib/font_manager.py", line 1439, in <module>
    _rebuild()
  File "/global/common/cori/contrib/desi/desiconda/20170613-1.1.4-spectro/code/desiconda/20170613-1.1.4-spectro_conda/lib/python3.5/site-packages/matplotlib/font_manager.py", line 1421, in _rebuild
    with cbook.Locked(cachedir):
  File "/global/common/cori/contrib/desi/desiconda/20170613-1.1.4-spectro/code/desiconda/20170613-1.1.4-spectro_conda/lib/python3.5/site-packages/matplotlib/cbook.py", line 2746, in __enter__
    raise self.TimeoutError(err_str)
matplotlib.cbook.TimeoutError: LOCKERROR: matplotlib is trying to acquire the lock
    '/global/homes/j/jguy/.cache/matplotlib/.matplotlib_lock-*'
and has failed.  This maybe due to any other process holding this
lock.  If you are sure no other matplotlib process is running try
removing these folders and trying again.

sbailey commented 6 years ago

+1 for isolating matplotlib imports, but unfortunately this appears to come from healpy, not our code.

We might be able to do something in our desiconda installs to pre-build the font cache so that user code doesn't end up having N>>1 parallel processes all trying to build it and running into each other. Just guessing there.

Another hack would be to have MPI rank 0 import matplotlib followed by an MPI barrier, before proceeding with importing healpy. Ugh, but I think that would at least isolate any font cache building stuff to a single process.
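
A rough sketch of that rank-0-then-barrier idea, in case it helps the discussion. The helper name staged_import is made up for illustration and is not part of desispec; it only assumes a communicator object with Get_rank() and Barrier(), which mpi4py communicators provide. Whether this fully avoids the lock contention on a shared filesystem is untested.

```python
import importlib


def staged_import(module_name, comm=None):
    """Import module_name on rank 0 first, then on every other rank.

    Hypothetical helper: comm is any object with Get_rank() and Barrier(),
    e.g. mpi4py's MPI.COMM_WORLD. With comm=None it is a plain serial import.
    """
    rank = 0 if comm is None else comm.Get_rank()
    if rank == 0:
        # Only this process can trigger one-time work such as cache building.
        importlib.import_module(module_name)
    if comm is not None:
        # All ranks wait here until rank 0 has finished its import.
        comm.Barrier()
    # On rank 0 this is just a lookup in sys.modules.
    return importlib.import_module(module_name)


# Intended use in the MPI entry point (assumes mpi4py is installed):
#   from mpi4py import MPI
#   staged_import("matplotlib.font_manager", MPI.COMM_WORLD)
#   hp = staged_import("healpy", MPI.COMM_WORLD)
```

The warm-up call would have to run before anything imports desispec.pipeline, since the traceback shows that package pulling in healpy at import time.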

tskisner commented 6 years ago

For docker images, I do pre-build the font cache (and also the astropy config stuff). I could easily do that for regular desiconda installs too.
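
For reference, a minimal sketch of what pre-building the font cache could look like at install time. This is a guess, not the actual docker or desiconda recipe: the function names and the choice of cache directory are hypothetical. It relies on two things: importing matplotlib.font_manager is what triggered the cache rebuild in the traceback above, and matplotlib honors the MPLCONFIGDIR environment variable for its config and cache location. Jobs would need to see the same MPLCONFIGDIR for the pre-built cache to be picked up.

```python
import os
import subprocess
import sys


def font_cache_job(cache_dir, python=sys.executable):
    """Return (argv, env) for a one-off process that builds the font cache."""
    env = dict(os.environ)
    # MPLCONFIGDIR tells matplotlib where to keep its config and cache files.
    env["MPLCONFIGDIR"] = str(cache_dir)
    # Importing font_manager loads the font cache, building it if it is missing.
    argv = [python, "-c", "import matplotlib.font_manager"]
    return argv, env


def prebuild_font_cache(cache_dir):
    """Run the job once, serially; raises CalledProcessError on failure."""
    os.makedirs(cache_dir, exist_ok=True)
    argv, env = font_cache_job(cache_dir)
    subprocess.run(argv, env=env, check=True)


# Example (the path is a placeholder, not a real desiconda location):
#   prebuild_font_cache("/path/to/install/mplconfig")
```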