recast-hep / recast-atlas

CLI for ATLAS RECAST contributors
https://recast.docs.cern.ch/
Apache License 2.0

LXPLUS example in docs not working #82

Closed matthewfeickert closed 2 years ago

matthewfeickert commented 2 years ago

The current README has an example for using RECAST on LXPLUS:

https://github.com/recast-hep/recast-atlas/blob/4264157ad3d92aa145249f37a3a26a8dd257c27d/README.md#L37-L44

However, lxplus-cloud.cern.ch doesn't seem to be a valid address. I think(?) this might now be lxplus8.cern.ch given https://clouddocs.web.cern.ch/clients/lxplus.html (@lukasheinrich can you confirm?). Even so, if one logs onto lxplus8, sourcing the public setup script fails:

[feickert@lxplus8s05 ~]$ readlink -f ~recast/public/setup.sh
/afs/cern.ch/user/r/recast/public/setup.sh
[feickert@lxplus8s05 ~]$ cat $(readlink -f ~recast/public/setup.sh)
export RECAST_DEFAULT_RUN_BACKEND=local
export RECAST_DEFAULT_BUILD_BACKEND=kubernetes
export PACKTIVITY_CONTAINER_RUNTIME=singularity
export SINGULARITY_CACHEDIR="/tmp/$(whoami)/singularity"
mkdir -p $SINGULARITY_CACHEDIR
# https://twitter.com/lukasheinrich_/status/1021398718996713475
# http://click.pocoo.org/5/python3/
export LC_ALL=en_US.utf-8
export LANG=en_US.utf-8
scl_source enable rh-python36
source ~recast/public/yadage/venv/bin/activate

$(recast catalogue add /eos/project/r/recast/atlas/catalogue)
export KUBECONFIG=/eos/project/r/recast/atlas/cluster/clusterconfig
export PATH=$PATH:~recast/public/bin
[feickert@lxplus8s05 ~]$ command -v scl_source  # no output, so scl_source not found!
[feickert@lxplus8s05 ~]$ . ~recast/public/setup.sh
-bash: scl_source: command not found
/afs/cern.ch/user/r/recast/public/yadage/venv/bin/python3: error while loading shared libraries: libpython3.6m.so.rh-python36-1.0: cannot open shared object file: No such file or directory
(venv) [feickert@lxplus8s05 ~]$ 

So the public RECAST setup script shouldn't rely on scl_source, but it still needs a Python 3 to get its virtual environment set up.
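A minimal sketch of how the script could guard the scl_source call, assuming a system python3 is an acceptable fallback. The function name enable_python3 is hypothetical and not part of the existing setup.sh:

```shell
# Hypothetical helper (not in the current setup.sh): only call scl_source
# when it exists, otherwise fall back to whatever python3 is on PATH.
enable_python3() {
    if command -v scl_source >/dev/null 2>&1; then
        # Hosts that still ship Software Collections: same call as the original script.
        scl_source enable rh-python36
    elif command -v python3 >/dev/null 2>&1; then
        # Hosts like lxplus8: no scl_source, so note which python3 will be used.
        echo "scl_source not found; using $(command -v python3)" >&2
    else
        echo "no Python 3 interpreter found" >&2
        return 1
    fi
}
```

This only removes the "command not found" error. The existing virtual environment would still need to be rebuilt against the fallback interpreter.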

matthewfeickert commented 2 years ago

Okay, so maybe the script needs to be updated, as on lxplus8 there is a Python 3:

[feickert@lxplus8s06 ~]$ python3 --version --version
Python 3.6.8 (default, Oct 19 2021, 05:14:06) 
[GCC 8.5.0 20210514 (Red Hat 8.5.0-3)]

but that's a different Python 3 than the one set up by scl_source enable rh-python36, so the following attempt to reuse the existing ~recast/public/yadage/venv/ (which was created with the rh-python36 Python) won't work:

[feickert@lxplus8s06 ~]$ export RECAST_DEFAULT_RUN_BACKEND=local
[feickert@lxplus8s06 ~]$ export RECAST_DEFAULT_BUILD_BACKEND=kubernetes
[feickert@lxplus8s06 ~]$ export PACKTIVITY_CONTAINER_RUNTIME=singularity
[feickert@lxplus8s06 ~]$ export SINGULARITY_CACHEDIR="/tmp/$(whoami)/singularity"
[feickert@lxplus8s06 ~]$ mkdir -p $SINGULARITY_CACHEDIR
[feickert@lxplus8s06 ~]$ export LC_ALL=en_US.utf-8
[feickert@lxplus8s06 ~]$ export LANG=en_US.utf-8
[feickert@lxplus8s06 ~]$ . ~recast/public/yadage/venv/bin/activate
(venv) [feickert@lxplus8s06 ~]$ python --version
python: error while loading shared libraries: libpython3.6m.so.rh-python36-1.0: cannot open shared object file: No such file or directory
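The failure above can be turned into a quick check. This is only a sketch: the helper name venv_python_ok is hypothetical, and it assumes that a venv whose bin/python cannot even print its version is unusable on the current host:

```shell
# Hypothetical helper: report whether a virtual environment's interpreter
# can start on this host (it cannot if its shared libraries are missing).
venv_python_ok() {
    venv_dir="$1"
    if "$venv_dir/bin/python" --version >/dev/null 2>&1; then
        echo "ok: $venv_dir"
    else
        echo "broken: $venv_dir (interpreter fails to start)"
        return 1
    fi
}
```

For example, `venv_python_ok ~recast/public/yadage/venv` could be run before sourcing the activate script.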

@lukasheinrich If you agree with my summary then I'll log in as recast, delete the existing virtual environment, and create a new one from a requirements.txt that we can place under ~recast/public/.

edit: I guess lxplus_reqs.txt is this requirements.txt already, but maybe we should turn this into a proper pip-tools lock file for application deployment. That's another Issue though.
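Roughly what I have in mind for recreating the environment, as a sketch. The function name rebuild_venv is hypothetical, and it assumes the system python3 ships the venv module with pip:

```shell
# Hypothetical helper: delete a virtual environment and rebuild it from a
# requirements file. Usage: rebuild_venv <venv_dir> <requirements_file>
rebuild_venv() {
    venv_dir="$1"
    reqs="$2"
    # Refuse to touch anything unless both arguments are sane.
    [ -n "$venv_dir" ] || { echo "usage: rebuild_venv <venv_dir> <requirements_file>" >&2; return 1; }
    [ -f "$reqs" ] || { echo "missing requirements file: $reqs" >&2; return 1; }
    rm -rf "$venv_dir"
    python3 -m venv "$venv_dir" || return 1
    # Use the venv's own interpreter so nothing leaks in from the host.
    "$venv_dir/bin/python" -m pip install --upgrade pip || return 1
    "$venv_dir/bin/python" -m pip install -r "$reqs"
}
```

Called as `rebuild_venv ~recast/public/yadage/venv ~recast/public/yadage/requirements.txt`, this would replace the rh-python36 based environment with one tied to the lxplus8 system Python.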

matthewfeickert commented 2 years ago

While temporarily giving up on trying to set up a working environment with the graphviz C library built from source, I tried to set up an LCG view on lxplus8, but realized that very few LCG views support CentOS 8, as LCG views are mostly CentOS 7.

However, there are a few LCG views that are CentOS 8, like dev3 with x86_64-centos8-gcc10-opt. So on lxplus8:

$ . /cvmfs/sft.cern.ch/lcg/views/dev3/latest/x86_64-centos8-gcc10-opt/setup.sh
$ python --version --version
Python 3.9.6 (default, Sep  6 2021, 15:33:32) 
[GCC 10.3.0]

Though using LCG views is probably not the way to go, as trying to set up anything in that environment is a total mess and gives a huge array of pip warnings given all the other things already crammed in there (when trying to install things like pygraphviz).

@lukasheinrich If you have any memory of how you set up the original virtual environment that would be helpful, but looking at it I guess you didn't set up a visualization backend (i.e., no recast-atlas[local]==0.0.20):

[recast@lxplus8s17 yadage]$ readlink -f .
/afs/cern.ch/user/r/recast/public/yadage
[recast@lxplus8s17 yadage]$ ls -lhtra
total 7.0K
drwxr-xr-x. 6 recast zp     2.0K Jul 22  2018 venv
drwxr-xr-x. 3 recast zp     2.0K Feb  4  2019 .
-rw-r--r--. 1 recast zp       93 Nov 13  2019 requirements.txt
drwxr-xr-x. 7 recast def-cg 2.0K Dec 16 07:59 ..
[recast@lxplus8s17 yadage]$ cat requirements.txt 
recast-atlas==0.0.20
packtivity==0.14.20
yadage==0.19.9
yadage-schemas==0.10.6
adage==0.9.0

[recast@lxplus8s17 yadage]$ find venv/lib/python3.6/site-packages/ -type d -iname "pygraphviz"

So if this can be skipped, that makes things easier. :+1:
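The same find check can be wrapped up for reuse. This is a sketch with a hypothetical helper name, has_site_package, and it assumes a find that supports -iname (as the one on lxplus8 does above):

```shell
# Hypothetical helper: succeed if a package directory with the given name
# (case-insensitive) exists anywhere under a site-packages directory.
# Usage: has_site_package <site_packages_dir> <package_name>
has_site_package() {
    site_packages="$1"
    name="$2"
    # grep -q . turns "find printed at least one line" into an exit status.
    find "$site_packages" -type d -iname "$name" 2>/dev/null | grep -q .
}
```

For example, `has_site_package venv/lib/python3.6/site-packages pygraphviz || echo "no pygraphviz"` would print the message for the environment listed above, since the find there returned nothing.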

matthewfeickert commented 2 years ago

> However, lxplus-cloud.cern.ch doesn't seem to be a valid address. I think(?) this might now be lxplus8.cern.ch

Yeah. On the List of LXPLUS Aliases page there's the entry

| DNS Alias Name | Notes |
| --- | --- |
| lxplus8.cern.ch | Set of machines configured with CS8. It includes the supported cloud clients thus replacing lxplus-cloud since 31 March 2021. |