LSSTDESC / desc-help

DESC Computing Requests
BSD 3-Clause "New" or "Revised" License

[NERSC] Reorganize and expand our DESC environments #61

Closed johannct closed 2 weeks ago

johannct commented 3 years ago

To Reproduce

Steps to reproduce the behavior:

  1. Go to jupyter.nersc.gov
  2. Choose latest desc-python (3.8)
  3. execute
    import matplotlib as mpl
    mpl.rcParams['text.usetex'] = True
    from matplotlib import pylab as plt
    plt.plot([0,1],[0,1])
  4. See the traceback: RuntimeError: Failed to process string with tex because latex could not be found
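A possible guard against the failure in step 4, as a minimal sketch: check whether a `latex` executable is on the PATH before switching `text.usetex` on. The helper names `usetex_available` and `rc_overrides` are made up for illustration and are not part of matplotlib or desc-python. Finding `latex` is only a necessary condition, since matplotlib's usetex mode also relies on helpers such as dvipng for the Agg backends.

```python
import shutil


def usetex_available(latex_cmd="latex"):
    """Return True if the named LaTeX executable can be found on PATH."""
    return shutil.which(latex_cmd) is not None


def rc_overrides(latex_cmd="latex"):
    """Build rcParams overrides that enable usetex only when LaTeX exists.

    Hypothetical helper: it only decides the value of 'text.usetex'.
    """
    return {"text.usetex": usetex_available(latex_cmd)}
```

In a notebook this would be used as `mpl.rcParams.update(rc_overrides())` before plotting, so the figure falls back to matplotlib's built-in mathtext rendering when LaTeX is missing instead of raising.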

Screenshots: [Screenshot from 2021-04-17 07-54-31]

So the central question is: how do we deal with latex plotting from within a desc-python session? I presume that no one complained because everybody who needed it installed texlive locally? But it is 5 GB. On the other hand, managing a custom latex setup within the desc-python environment promises to be a major burden. People are going to use desc-python to build publication-ready figures, so they will certainly want latex capabilities.

Issue initially discovered by @sschmidt23 while working at NERSC with qp, which has a plotting module setting up latex and serif in the rcParams.

johannct commented 3 years ago

Nota bene: this is entirely a question of container management, as NERSC does distribute latex:

johannct@cori06:~> which latex 
/usr/bin/latex
johannct@cori06:~> source /global/common/software/lsst/common/miniconda/setup_current_python.sh
Now using /usr/local/py/envs/desc/bin/python
(desc) bash-4.2$ which latex
which: no latex in (/usr/local/py/envs/desc/bin:/usr/local/py/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/udiImage/bin)
(desc) bash-4.2$ conda deactivate
bash-4.2$ which latex
which: no latex in (/usr/local/py/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/udiImage/bin)
bash-4.2$ exit
exit
johannct@cori06:~> which latex
/usr/bin/latex

So the issue is not about installing texlive globally for DESC people, but just about not shielding the host system entirely from the shifter image. This is, by the way, a major change of behavior compared to the previous non-shifter setup, which did not shield the host system, if I trust what I see at CC, which has not moved to the recent desc-python setup.

Also I am wondering why I need to exit from a useless bash shell after a conda deactivate in order to fall back into my normal login. Is that expected and unavoidable?

heather999 commented 3 years ago

Hi Johann, I took some time to think about this and also investigated how NERSC currently handles latex in its own jupyter kernel. While NERSC does not provide latex in its jupyter environment out of the box, it offers a work-around: users can install their own customized kernel that sets up their locally installed python plus latex. I think we want to avoid forcing people to customize in this fashion. I did add texlive into desc-python:bleed, both to demonstrate that it works and to see the effect on the image size (it increased by ~0.5 GB compressed, resulting in an image size of ~3 GB). That's not a show-stopper and could easily be offset by modifying the existing images. Now is this the right thing to do? We need to consider the use cases for our environments, as I feel the requirements vary with their purpose. Maybe there is no one-size-fits-all solution and we need multiple environments to support the use cases. I can imagine:

There are some things to point out

So my proposal would be to reorganize our environments (most of which are only in use at NERSC currently):

  desc-python-nersc - providing what we had before, stack-free, installed locally at NERSC, making the NERSC env fully available
  desc-python-img - Containerized, useful for CI, batch jobs, and those wishing to pull a pretty complete DESC env onto another site or laptop
  desc-python-bleed - will continue as a containerized env, including most recent version of all releases plus master of GCRCatalogs
  desc-stack-nersc - likely using the CVMFS DM stack installations plus DESC requested packages, and has the local NERSC env fully available
  desc-stack-img - What is now called desc-stack, a containerized DM stack plus DESC requested packages

There are some other environments out there, like desc-pyspark and another for desc-python-webapps, which I'll ignore for now.

johannct commented 3 years ago

Thanks, Heather. We can try that. Is it manageable to guarantee that the common stuff is always identical across all of these? It may also be interesting to check that a custom matplotlib config can be distributed within each solution and be made available to users.
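One way the "always identical" question could be checked automatically, as a rough sketch: compare the pinned package lists of two environments and report the packages present in both whose versions differ. The input format (one `name=version` or `name==version` pin per line, optionally followed by a build string or a `#` comment) and the function names are assumptions for illustration, not an existing DESC tool.

```python
def parse_pins(lines):
    """Parse 'name=version' / 'name==version' pins into {name: version}.

    Assumed input format; a trailing '=build' field is ignored and
    entries without a version map to None.
    """
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        name, sep, rest = line.replace("==", "=").partition("=")
        version = rest.split("=", 1)[0].strip() if sep else None
        pins[name.strip().lower()] = version
    return pins


def diff_common(env_a, env_b):
    """Return {package: (version_a, version_b)} for shared packages that differ."""
    a, b = parse_pins(env_a), parse_pins(env_b)
    return {n: (a[n], b[n]) for n in sorted(a.keys() & b.keys()) if a[n] != b[n]}
```

Packages that exist in only one environment (an editor available only in the native install, for example) are deliberately ignored, so the check only covers the common part. An empty result would mean the shared packages agree.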

heather999 commented 3 years ago

Hi Johann, I think we need to guarantee that the "common" stuff is identical when appropriate, so the desc-python-nersc core package versions should be the same as those in desc-python-img. A common yaml file used to set up both environments should take care of it. Concerning the matplotlib config: by default, matplotlib looks in the typical places under $HOME (https://matplotlib.org/stable/faq/troubleshooting_faq.html#matplotlib-configuration-and-cache-directory-locations), and that can be overridden with the MPLCONFIGDIR env variable. So it's possible to have a universal config that experienced users could then decide to ignore by resetting that env variable themselves. There's some balance between making it convenient and offering more control to those who are experienced.
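The MPLCONFIGDIR idea could look roughly like the sketch below, which writes a shared `matplotlibrc` into a directory and points MPLCONFIGDIR at it unless the user has already set that variable. The function name, the directory and the rc values are illustrative assumptions, and this has to run before `matplotlib` is imported, because the rc file is read at import time.

```python
import os
from pathlib import Path


def install_shared_matplotlibrc(config_dir, use_tex=False, overwrite_env=False):
    """Write a matplotlibrc into config_dir and expose it via MPLCONFIGDIR.

    Hypothetical helper with illustrative rc values. A user's own
    MPLCONFIGDIR is left alone unless overwrite_env is True.
    """
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    rc_file = config_dir / "matplotlibrc"
    rc_file.write_text(
        "# Shared DESC defaults (illustrative values)\n"
        f"text.usetex: {use_tex}\n"
        "font.family: serif\n"
    )
    if overwrite_env or "MPLCONFIGDIR" not in os.environ:
        os.environ["MPLCONFIGDIR"] = str(config_dir)
    return rc_file
```

Leaving an existing MPLCONFIGDIR untouched by default keeps the balance described above: newcomers get the shared config, and experienced users keep control by exporting their own value first.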

I'd like to rename this issue and start to pull in a few others to offer their comments before we start down this path of multiple environments. It is easy enough to implement; I just want to make sure we capture a good starting set of use cases and consider naming conventions.

heather999 commented 3 years ago

Hi, I want to ping @yymao and @JoanneBogart to help think this through. Specifically, the idea is to expand our DESC environments to provide both native installations (basically going back to what we had before with desc-python) and shifter images, which might serve additional purposes but may not be full-featured enough to cover every possible use case. See the middle-bottom of a previous comment.

JoanneBogart commented 3 years ago

My understanding was that maintaining native installations was a significant burden compared to maintaining images only, but if it's feasible to do both (and keep package versions the same across both forms of distribution where appropriate) I'm all for it. Speaking for myself, when I'm doing development I want access to productivity tools like emacs, pylint, etc. and might also need variant versions of things. I don't know of any good way to achieve this flexibility with images. But images are certainly the better choice in other circumstances.

yymao commented 3 years ago

My reaction is similar to @JoanneBogart -- more specifically, what were the main reasons for switching from local installation to container? Were those reasons no longer showstoppers?

johannct commented 3 years ago

I am absolutely not an expert in containerization and in all the things you can or can't do with containers, so do whatever you feel like with my gut feelings: containers are great when the code is mature and ready to run in production, because the environment is well defined, reproducible, safe and portable. But containerization adds a significant burden for developers, for onboarding newcomers, and for software admins when the software is in full development, without an obvious immediate gain from the advantages listed above. It is always good practice to anticipate containerization by trying it out periodically on the toolstack; at the current stage, it should not be our prime goal.

Finally, to answer @JoanneBogart's point: the burden is significantly reduced when installing images rather than deploying a complex environment locally, no question. But 1/ we do not really face this issue today (very little is done at CC, nothing elsewhere), so the level of anticipation is too high at this point, and 2/ let's not overstate the complexity of our software at this stage: we are not compiling and linking many C+Fortran+whatever codes that would cause mismatches with local infrastructures, so we do not have, AFAIK, an immediate and well-defined problem. @heather999 correct me if I am wrong here. From what I could see, our immediate problems have consistently been that people pip install --user, or set up their environment in bash initialization scripts, forget about it down the line, and end up with conflicts at the first upgrades. What we need is a consistent and enforceable use of conda, and a global CI system.
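As a small diagnostic for the `pip install --user` problem, something along these lines could list what a user has installed in their per-user site directory, which is where such packages end up and from where they can shadow the shared environment. This is only a sketch using the standard library; the function name is hypothetical.

```python
import site
from importlib import metadata


def user_site_distributions(user_site=None):
    """Return sorted (name, version) pairs found in the per-user site dir.

    Defaults to the directory used by 'pip install --user'; a missing
    directory simply yields an empty list.
    """
    user_site = user_site or site.getusersitepackages()
    found = set()
    for dist in metadata.distributions(path=[user_site]):
        name = dist.metadata.get("Name")
        if name:  # skip broken metadata without a name
            found.add((name, dist.version))
    return sorted(found)
```

A non-empty result is a hint that the session may not match the shared environment. Setting the standard `PYTHONNOUSERSITE=1` environment variable (or running `python -s`) keeps Python from adding that directory to `sys.path` at all.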

yymao commented 3 years ago

Thanks @johannct, but I still have the same question: when we moved to Python 3.8 we switched to containers for the most commonly used DESC environments. What were the main reasons (even if they are historical ones)? While we could argue that we shouldn't have moved to containers, without knowing the reasons for the switch I don't see how to evaluate the different choices here.

johannct commented 3 years ago

I think it was precisely the idea of having a production-level tool sooner rather than later, but Heather would know.

heather999 commented 3 years ago

Here is what I was thinking in a move to containers:

Here is what has changed in my thinking: containers are great to compartmentalize an environment and share it across sites, but there are some costs depending on the use case. Developers want editors, people making pretty pictures want latex, and sometimes users need to submit jobs directly from within the containerized environment, which is handled differently depending on whether you're talking about singularity, shifter, or docker, if it works at all. Some of these applications are already available at the host sites (NERSC, CC-IN2P3, etc). We could balloon our containers and try to accommodate all the possible use cases, or we could step back and agree that containers have a place and that setting up an environment natively on the host machine has its utility as well.

My thinking is that there is a common set of steps to create these environments within the containers, and we can use a similar procedure to install at NERSC, similar to what is already done at CC-IN2P3 to reproduce desc-python. Currently, updating the natively installed conda environment at NERSC is still too manual and not nimble, but we already know how to improve it. I'd like to move to a mode where we use the images for CI, processing, etc. and continue to make them available for use at NERSC or anywhere, but where there is also a native installation in which the typical NERSC applications, editors, etc. are already available and easy to use.

github-actions[bot] commented 4 weeks ago

This issue is stale because it has been 90 days since last activity. If no further activities take place, this issue will be closed in 14 days.
Add label keep to keep this issue.