Closed zonca closed 4 years ago
possibly having a conda environment?
Ideally, if we have a full conda environment which also has the jupyterhub and jupyterlab packages, we wouldn't even need a Python environment on the Docker container for Jetstream; we could run completely off CVMFS.
That has some appeal, especially if it could make jupyterhub deployment easier for sites that already have CVMFS installed.
In terms of the analysis tools, @bloer is working on putting the "python-only" part of them into CVMFS. I'll let him comment on whether or not he's using a conda environment for that.
I have to admit I've never made a conda package before, but I'm a huge fan, it's the only thing I use for python.
The Python environment will be available in the next CVMFS release. Lots of Python packages are already distributed by the central LCG CVMFS repo, including most of the scientific packages that conda provides, so I think conda is an unnecessary step if we're continuing to base the image on CVMFS.
For things not provided centrally, or for users to get the latest packages faster than new CVMFS releases are published, I think we can provide instructions for users to use pip with the --user flag.
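As a rough illustration of where packages installed with `pip install --user` end up, the sketch below asks the running interpreter for its user site directory and whether that directory is actually being searched. The helper name `user_site_report` is made up for this example; it only uses the standard-library `site` module.

```python
import site
import sys


def user_site_report():
    """Describe the per-user site-packages directory of this interpreter.

    `user_site_report` is a hypothetical helper, not part of any CDMS tool.
    """
    user_site = site.getusersitepackages()
    return {
        # Directory that `pip install --user` targets for this interpreter.
        "user_site": user_site,
        # ENABLE_USER_SITE can be True, False or None; only True means "active".
        "enabled": bool(site.ENABLE_USER_SITE),
        # The directory is only added to sys.path if it exists on disk.
        "on_sys_path": user_site in sys.path,
    }


if __name__ == "__main__":
    for key, value in user_site_report().items():
        print(f"{key}: {value}")
```

Running this once from a terminal and once from a notebook kernel is a quick way to see whether both are looking at the same user directory.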
ok, thanks @bloer, so once we get the new release we will resume testing.
If it includes jupyterhub and jupyterlab, better; otherwise I will find a workaround.
Please update this issue when the new release is available.
Not sure if this is the right place for this... I'm using pandas, and get errors like
ImportError: Missing optional dependency 'tables'. Use pip or conda to install tables.
with pandas.read_hdf... @zonca How do we get an enhanced version of pandas?
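For reference, `pandas.read_hdf` relies on the optional PyTables package, which is imported as `tables`. A minimal sketch for checking up front which optional modules are missing from an environment, without importing them, could look like the following; the helper names `has_module` and `missing_modules` are invented for this example.

```python
import importlib.util


def has_module(name):
    """Return True if `name` can be located by the current interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for dotted names with a missing parent package,
        # or for modules with a broken __spec__; treat both as "not available".
        return False


def missing_modules(names):
    """Return the subset of `names` that cannot be located, in input order."""
    return [name for name in names if not has_module(name)]


if __name__ == "__main__":
    # "tables" is the import name of PyTables, needed by pandas.read_hdf.
    print("missing:", missing_modules(["pandas", "tables"]))
```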
@zonca to give some context, there is now a CVMFS release V02-01-04 that contains some of the python modules. @ziqinghong is one of the lucky folks testing to see how broken it is.
@ziqinghong in the short term, the easiest thing to do is install a package with user mode:
python -m pip install --user tables
Medium-term, since this isn't an XSEDE-specific issue, please add issues encountered while testing this release to the JIRA bug tracker https://jira.slac.stanford.edu/browse/CDMSGREL-32
Long-term I worry about the different paces of release cycles and analysis code development. We'll probably have to provide some tools to make it easy for everyone to work with bleeding edge CDMS packages or 3rd party packages that aren't provided with the release
Thanks Ben. Will report missing package in jira.
The SLAC jupyterlab environment has been stable and plenty good for more than 6 months. There might be package updates that are transparent to me, but Amy and her team did beat it into shape pretty rapidly, and once that happened, we have only very rarely requested an upgrade.
@bloer how do we access the python modules on CVMFS?
@ziqinghong what python environment are you using?
Errr. I just spawned a jupyter instance and assumed you guys set up the environment well.... How do I check?
The Jupyter environment is not currently using the Python environment from CVMFS, so please do not report the issue on JIRA.
Once I get pointers from @bloer on how to set it up, I'll modify the Jupyter environment to use that, so we can get all the Python modules from CVMFS on Jupyter as well, immediately after release.
@zonca Hmm OK maybe I'm confusing multiple issues. I thought you were already loading the CVMFS environment. Just do
source /cvmfs/cdms.opensciencegrid.org/setup_cdms.sh V02-01-04
I need details about the Python environment so I can make it compatible with the rest. Do you have any documentation? What version of Python is it using? Is it a full environment or just a modules folder?
The only documentation I can find is this extremely useless page: http://lcginfo.cern.ch/pkg/Python/
Right now it's using python3.6. I hope to move to 3.7 in a near future release.
The environment's pretty messy. There is a full installation (site.getsitepackages() yields /cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-slc6-gcc8-opt/lib/python3.6/site-packages), and also a handful of paths added to PYTHONPATH by CVMFS. I don't know if the CVMFS environment would function if PYTHONPATH were cleared.
The CDMS-specific packages are provided via PYTHONPATH as well, but of course we have more control over that and could change it if necessary (maybe use .pth files instead?)
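To get a picture of how a mixed setup like this is assembled, something along these lines can label each `sys.path` entry by where it came from. This is only a sketch: the function names are invented here, and it assumes `site.getsitepackages()` is available, which is not the case in some older virtualenv-created environments.

```python
import os
import site
import sys


def pythonpath_entries(environ=None):
    """Return the non-empty directories listed in PYTHONPATH."""
    environ = os.environ if environ is None else environ
    return [p for p in environ.get("PYTHONPATH", "").split(os.pathsep) if p]


def classify_sys_path(path=None, environ=None):
    """Label each sys.path entry as 'PYTHONPATH', 'site-packages' or 'other'."""
    path = sys.path if path is None else path
    from_env = {os.path.normpath(p) for p in pythonpath_entries(environ)}
    site_dirs = {os.path.normpath(p) for p in site.getsitepackages()}
    rows = []
    for entry in path:
        # The empty string means "current directory"; leave it untouched.
        norm = os.path.normpath(entry) if entry else entry
        if norm in from_env:
            origin = "PYTHONPATH"
        elif norm in site_dirs:
            origin = "site-packages"
        else:
            origin = "other"
        rows.append((origin, entry))
    return rows


if __name__ == "__main__":
    for origin, entry in classify_sys_path():
        print(f"{origin:14s} {entry!r}")
```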
ok, @bloer, I am testing it, it looks like there is an issue with scdmsPyTools.BatTools.IO:
IPython 7.5.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import sys
In [2]: sys.path
Out[2]:
['/cvmfs/sft.cern.ch/lcg/releases/ipython/7.5.0-c6a48/x86_64-centos7-gcc8-opt/bin',
'',
'/cvmfs/cdms.opensciencegrid.org/releases/centos7/V02-01-04/lib/python3.6/site-packages',
'/cvmfs/sft.cern.ch/lcg/releases/ROOT/6.18.00-885ca/x86_64-centos7-gcc8-opt/lib',
'/cvmfs/sft.cern.ch/lcg/views/LCG_96python3/x86_64-centos7-gcc8-opt/lib',
'/cvmfs/sft.cern.ch/lcg/views/LCG_96python3/x86_64-centos7-gcc8-opt/lib/python3.6/site-packages',
'/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python36.zip',
'/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python3.6',
'/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python3.6/lib-dynload',
'/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python3.6/site-packages',
'/cvmfs/sft.cern.ch/lcg/views/LCG_96python3/x86_64-centos7-gcc8-opt/lib/python3.6/site-packages/IPython/extensions',
'/home/jovyan/.ipython']
In [3]: from scdmsPyTools.BatTools.IO import *
...:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-3-39d806e33aaa> in <module>
----> 1 from scdmsPyTools.BatTools.IO import *
/cvmfs/cdms.opensciencegrid.org/releases/centos7/V02-01-04/lib/python3.6/site-packages/scdmsPyTools/BatTools/__init__.py in <module>
1 import os,sys
2 sys.path.append(os.path.dirname(os.path.realpath(__file__)))
----> 3 from rawdata_reader import *
4 from rawdata_writer import *
ModuleNotFoundError: No module named 'rawdata_reader'
yeah, I should have some release notes. BatTools isn't included because there's a problem with missing boost libraries that I haven't figured out how to solve yet. BatTools is also being deprecated in the near future to separate out that functionality from the rest of scdmsPyTools
from scdmsPyTools.TES.Templates import * is a good test for what we're using in that package.
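A check like this could be turned into a small import smoke test to run after each release. The sketch below is one possible shape, not an existing CDMS tool: `smoke_test` is a made-up name, and the module list in the example is just the two modules mentioned in this thread.

```python
import importlib


def smoke_test(module_names):
    """Try to import each module; map name -> None on success or an error string."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:
            # Broad on purpose: a broken compiled extension can raise more than
            # ImportError (for example OSError for a missing shared library).
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # Module names taken from this thread; BatTools is expected to fail in V02-01-04.
    wanted = ["scdmsPyTools.TES.Templates", "scdmsPyTools.BatTools.IO"]
    for name, error in smoke_test(wanted).items():
        print(f"{name}: {'ok' if error is None else error}")
```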
ok, when I try to run Jupyterlab or the QT console with this environment the kernel keeps dying. I am debugging this issue.
@ziqinghong in the meantime, I think you can test opening a terminal in the JupyterHub environment, then loading the CVMFS environment:
source /cvmfs/cdms.opensciencegrid.org/setup_cdms.sh V02-01-04
and use the console version of ipython.
bash-4.2$ which ipython
/cvmfs/sft.cern.ch/lcg/views/LCG_96python3/x86_64-centos7-gcc8-opt/bin/ipython
bash-4.2$ ipython
In [1]: from scdmsPyTools.TES.Templates import *
works fine there.
Currently I can install a kernel from CVMFS doing (to be automated for all users later on):
source /cvmfs/cdms.opensciencegrid.org/setup_cdms.sh V02-01-04
python -m ipykernel install --user --name cdms_V02-01-04 --display-name "CDMS V02-01-04"
So I have a kernel for CDMS available:
However, this kernel doesn't work.
@bloer I got the error out of JupyterHub:
{"log":"/cvmfs/sft.cern.ch/lcg/views/LCG_96python3/x86_64-centos7-gcc8-opt/bin/python: No module named ipykernel_launcher\n","stream"
it looks like ipykernel is broken.
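Since the kernel is started by a specific Python binary, it can help to ask that exact interpreter whether it can find `ipykernel_launcher`, rather than checking from whatever Python happens to be active in the shell. A minimal sketch, with `interpreter_can_import` as a hypothetical helper name:

```python
import subprocess
import sys

# One-liner run inside the target interpreter: exit 0 if the module is found.
_CHECK = (
    "import importlib.util, sys; "
    "sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)"
)


def interpreter_can_import(python_exe, module):
    """Return True if `python_exe` can locate `module` in its own environment."""
    try:
        proc = subprocess.run(
            [python_exe, "-c", _CHECK, module],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The interpreter itself is missing or not executable.
        return False
    return proc.returncode == 0


if __name__ == "__main__":
    # Defaults to the current interpreter; pass the path from the kernel log
    # (the .../LCG_96python3/.../bin/python binary) to test that one instead.
    python_exe = sys.argv[1] if len(sys.argv) > 1 else sys.executable
    print(python_exe, interpreter_can_import(python_exe, "ipykernel_launcher"))
```

Note that the child process inherits the current environment, so PYTHONPATH should be set the same way as it is for the kernel for the answer to be meaningful.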
I was able to use the same environment to launch a jupyter-lab at centos7.slac.stanford.edu, and get a notebook running.
@ziqinghong can you connect to a notebook and execute code?
Yup, at least I could import scdmsPyTools...
I had to give jupyter lab the "--core-mode" switch. I'll add ipykernel to the list to fix for the next release. Is there any way to install that so we can see what else is broken? Pushing out new releases is not fast.
@bloer I started installing ipykernel in user space, then it requested a lot of its requirements:
ipython, ipython_genutils, traitlets, jupyter, ptyprocess, prompt_toolkit, jupyter_client, jupyter_core, pyzmq
not sure why I needed to reinstall all those packages even if some were available in the base environment. Anyway, after all this it was working fine.
It would be useful to also install virtualenv in the base system, so we can use that instead of messing with PYTHONPATH.
Since Python 3.5(?) the recommended virtual environment tool is the built-in venv (python -m venv).
Thanks I'll try that, I generally only use conda
Also beware: I at least only learned recently that virtual environments don't play nicely with PYTHONPATH. When PYTHONPATH is set it gets added to sys.path before the virtual environment, so you have to go to extra lengths to install newer packages in the venv. PYTHONPATH also preempts user site paths.
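The ordering problem can be made visible with a few lines of Python. The sketch below lists the PYTHONPATH entries that are searched before the environment's own site-packages directory; the function names are invented for this example, and `sysconfig.get_paths()["purelib"]` is used as a stand-in for "the site-packages directory of the active environment".

```python
import os
import sys
import sysconfig


def entries_ahead_of(target, path):
    """Return the entries of `path` that are searched before `target`."""
    norm = [os.path.normpath(p) if p else p for p in path]
    target = os.path.normpath(target)
    if target not in norm:
        # Target is not on the path at all, so everything is ahead of it.
        return list(path)
    return list(path[: norm.index(target)])


def pythonpath_shadowing(path=None, environ=None, target=None):
    """List PYTHONPATH directories that take precedence over `target`."""
    path = sys.path if path is None else path
    environ = os.environ if environ is None else environ
    if target is None:
        target = sysconfig.get_paths()["purelib"]
    raw = environ.get("PYTHONPATH", "")
    from_env = {os.path.normpath(p) for p in raw.split(os.pathsep) if p}
    return [
        p for p in entries_ahead_of(target, path)
        if p and os.path.normpath(p) in from_env
    ]


if __name__ == "__main__":
    in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    print("inside a venv:", in_venv)
    for entry in pythonpath_shadowing():
        print("searched before site-packages:", entry)
```

Any package present in one of the listed directories wins over a newer copy installed into the venv, which matches the behaviour described above.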
@bloer same with venv, I needed to install:
pip install --upgrade --ignore-installed ipykernel ipython traitlets jupyter_client jupyter six ipython_genutils ptyprocess pyzmq prompt_toolkit
to get the kernel working, and after that there were still a lot of issues with PYTHONPATH; it was easier to just use pip install --user.
Anyway, I think for the next release it would be useful if you can test that you can register a kernel:
source /cvmfs/cdms.opensciencegrid.org/setup_cdms.sh V02-01-04
python -m ipykernel install --user --name cdms_V02-01-04 --display-name "CDMS V02-01-04"
and then use it in Jupyterlab
@zonca, I believe @bloer has registered the kernel in the new release. The new command (which I have not yet tested on XSEDE) is
/cvmfs/cdms.opensciencegrid.org/setup_cdms.sh -K V02-03-01 --user
We can test success with
import cdms
cdms.get_global_version()
ok, I tested and it works. Can @pibion or @bloer please explain what -K and --user do?
Next I'll try to deploy this on the notebook environment.
There's a bit of description with /cvmfs/cdms.opensciencegrid.org/setup_cdms.sh -h. The -K switch tells the script to install a kernel. Any arguments following the version number are passed to jupyter kernelspec install, so you could install to some other central location.
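To confirm what actually got installed, one option is to read the `kernel.json` files directly. The sketch below is a plain-stdlib illustration, not part of `setup_cdms.sh`: `list_kernelspecs` is a made-up name, and the default directory `~/.local/share/jupyter/kernels` is an assumption (it is the usual per-user location on Linux, but `JUPYTER_DATA_DIR` or `XDG_DATA_HOME` can change it).

```python
import json
from pathlib import Path


def list_kernelspecs(kernels_dir):
    """Map kernel name -> display name and launch command for each kernel.json."""
    specs = {}
    for spec_file in sorted(Path(kernels_dir).glob("*/kernel.json")):
        with spec_file.open() as fh:
            spec = json.load(fh)
        # The directory name is the kernel name; argv[0] is the Python binary
        # that Jupyter will start for this kernel.
        specs[spec_file.parent.name] = {
            "display_name": spec.get("display_name"),
            "argv": spec.get("argv", []),
        }
    return specs


if __name__ == "__main__":
    # Assumed default per-user location on Linux; adjust if your site differs.
    default_dir = Path.home() / ".local" / "share" / "jupyter" / "kernels"
    for name, info in list_kernelspecs(default_dir).items():
        print(name, "->", info["display_name"], info["argv"][:1])
```

Checking `argv[0]` is a quick way to see whether a kernel points at the CVMFS Python or at the container's own interpreter.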
great job @bloer! The new release works great.
Exactly as @pibion suggested, open a terminal on the JupyterHub deployment, type:
/cvmfs/cdms.opensciencegrid.org/setup_cdms.sh -K V02-03-01 --user
then JupyterHub works with the kernel out of CVMFS (run change kernel from the menu), see:
@pibion @ziqinghong I think you can start to play with the environment and check more deeply if anything is broken.
In the meantime, I'll think about the best way to automate this so that new users do not have to install the kernels.
see https://github.com/zonca/docker-jupyter-cdms-light/pull/1
I wasn't able to run it automatically, but created a script that installs all the kernels, so users can just open a terminal and run install_cdms_kernels
ok, the Python environment works fine. Please open a new issue if anything stops working.
Originally posted by @pibion in https://github.com/det-lab/jupyterhub-deploy-kubernetes-jetstream/issues/8#issuecomment-606782144
I would like some details about the Python packages for CDMS analysis.