matthewfeickert / cvmfs-venv

Example implementation of getting a Python virtual environment to work with CVMFS LCG views
MIT License

cvmfs-venv

Simple command line utility for getting a Python virtual environment to work with CVMFS LCG views. This is done by adding hooks to the Python virtual environment's bin/activate script.

Install

Either clone the repository or download the cvmfs-venv.sh script and place it on PATH as cvmfs-venv:

$ mkdir -p ~/.local/bin
$ export PATH=~/.local/bin:"${PATH}"  # If ~/.local/bin not on PATH already
$ curl -sL https://raw.githubusercontent.com/matthewfeickert/cvmfs-venv/main/cvmfs-venv.sh -o ~/.local/bin/cvmfs-venv
$ chmod +x ~/.local/bin/cvmfs-venv

Use

Source the script to create a Python 3 virtual environment that can coexist with a CVMFS LCG view. The default name is venv.

$ cvmfs-venv --help
Usage: cvmfs-venv [-s|--setup] [--no-system-site-packages] [--no-update] [--no-uv] <virtual environment name>

Options:
 -h --help      Print this help message
 -s --setup     String of setup options to be parsed
 --no-system-site-packages
                The venv module '--system-site-packages' option is used by
                default. While it is not recommended, this behavior can be
                disabled through use of this flag.
 --no-update    After venv creation don't update pip, setuptools, and wheel
                to the latest releases. Use of this option is not recommended,
                but is faster.
 --no-uv        After venv creation don't install uv and use it to update pip,
                setuptools, and wheel. By default, uv is installed.

Note: cvmfs-venv extends the Python venv module and so requires Python 3.3+.

Examples:

    * Create a Python 3 virtual environment named 'lcg-example' with the Python
    runtime provided by LCG view 105 on AlmaLinux 9.

        setupATLAS -3
        lsetup 'views LCG_105 x86_64-el9-gcc12-opt'
        cvmfs-venv lcg-example
        . lcg-example/bin/activate

    * Create a Python 3 virtual environment named 'atlas-ab-example' with the
    Python runtime provided by ATLAS AnalysisBase release v25.2.15.

        setupATLAS -3
        asetup AnalysisBase,25.2.15
        cvmfs-venv atlas-ab-example
        . atlas-ab-example/bin/activate

    * Create a Python 3 virtual environment named 'venv' with whatever Python
    runtime "$(command -v python3)" evaluates to.

        cvmfs-venv
        . venv/bin/activate

    * Setup LCG view 105 on AlmaLinux 9 and create a Python virtual environment
    named 'lcg-example' using the Python 3.9 runtime it provides.

        . cvmfs-venv --setup "lsetup 'views LCG_105 x86_64-el9-gcc12-opt'" lcg-example

    * Setup ATLAS AnalysisBase release v25.2.15 and create a Python virtual
    environment named 'atlas-ab-example' using the Python 3.9 runtime it
    provides.

        . cvmfs-venv --setup 'asetup AnalysisBase,25.2.15' atlas-ab-example

Example: Virtual environment with LCG view

$ ssh lxplus
[feickert@lxplus924 ~]$ mkdir -p ~/.local/bin
[feickert@lxplus924 ~]$ export PATH=~/.local/bin:"${PATH}"
[feickert@lxplus924 ~]$ curl -sL https://raw.githubusercontent.com/matthewfeickert/cvmfs-venv/main/cvmfs-venv.sh -o ~/.local/bin/cvmfs-venv
[feickert@lxplus924 ~]$ chmod +x ~/.local/bin/cvmfs-venv
[feickert@lxplus924 ~]$ setupATLAS -3 --quiet
[feickert@lxplus924 ~]$ lsetup 'views LCG_105 x86_64-el9-gcc12-opt'
************************************************************************
Requested:  views ...
 Setting up views LCG_105:x86_64-el9-gcc12-opt ...
>>>>>>>>>>>>>>>>>>>>>>>>> Information for user <<<<<<<<<<<<<<<<<<<<<<<<<
************************************************************************
[feickert@lxplus924 ~]$ cvmfs-venv lcg-example
# Creating new Python virtual environment 'lcg-example'
[feickert@lxplus924 ~]$ . lcg-example/bin/activate
(lcg-example) [feickert@lxplus924 ~]$ python -m pip list --local  # Show installed defaults
Package    Version
---------- -------
pip        24.0
setuptools 69.5.1
uv         0.1.42
wheel      0.43.0
(lcg-example) [feickert@lxplus924 ~]$ python -m pip show hepdata-lib  # Still have full LCG view
Name: hepdata-lib
Version: 0.12.0
Summary: Library for getting your data into HEPData
Home-page: https://github.com/HEPData/hepdata_lib
Author: Andreas Albert, Clemens Lange
Author-email: hepdata-lib@cern.ch
License: UNKNOWN
Location: /cvmfs/sft.cern.ch/lcg/views/LCG_105/x86_64-el9-gcc12-opt/lib/python3.9/site-packages
Requires: future, hepdata-validator, numpy, PyYAML
Required-by:
(lcg-example) [feickert@lxplus924 ~]$ uv pip install --upgrade awkward
(lcg-example) [feickert@lxplus924 ~]$ python
Python 3.9.12 (main, Jul 11 2023, 14:44:04)
[GCC 12.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ROOT
>>> import XRootD
>>> import awkward
>>> exit()
(lcg-example) [feickert@lxplus924 ~]$ python -m pip show awkward  # Get version installed in venv
Name: awkward
Version: 2.6.4
Summary: Manipulate JSON-like data with NumPy-like idioms.
Home-page:
Author:
Author-email: Jim Pivarski <pivarski@princeton.edu>
License: BSD-3-Clause
Location: /afs/cern.ch/user/f/feickert/lcg-example/lib/python3.9/site-packages
Requires: awkward-cpp, fsspec, importlib-metadata, numpy, packaging, typing-extensions
Required-by: cabinetry, coffea, servicex, uproot_browser
(lcg-example) [feickert@lxplus924 ~]$ python -m pip list --local  # View of virtual environment controlled packages
Package            Version
------------------ --------
awkward            2.6.4
awkward-cpp        33
fsspec             2024.3.1
importlib_metadata 7.1.0
numpy              1.26.4
packaging          24.0
pip                24.0
setuptools         69.5.1
typing_extensions  4.11.0
uv                 0.1.42
wheel              0.43.0
zipp               3.18.1
(lcg-example) [feickert@lxplus924 ~]$ uv pip list  # uv will show the same view
Package            Version
------------------ --------
awkward            2.6.4
awkward-cpp        33
fsspec             2024.3.1
importlib-metadata 7.1.0
numpy              1.26.4
packaging          24.0
pip                24.0
setuptools         69.5.1
typing-extensions  4.11.0
uv                 0.1.42
wheel              0.43.0
zipp               3.18.1
(lcg-example) [feickert@lxplus924 ~]$ deactivate  # Resets PYTHONPATH given added hooks
[feickert@lxplus924 ~]$ python -m pip show awkward  # Get CVMFS's old version
Name: awkward
Version: 1.10.3
Summary: Manipulate JSON-like data with NumPy-like idioms.
Home-page: https://github.com/scikit-hep/awkward-1.0
Author: Jim Pivarski
Author-email: pivarski@princeton.edu
License: BSD-3-Clause
Location: /cvmfs/sft.cern.ch/lcg/views/LCG_105/x86_64-el9-gcc12-opt/lib/python3.9/site-packages
Requires: numpy, packaging
Required-by: cabinetry, coffea, servicex, uproot_browser

Dependencies

cvmfs-venv has no dependencies beyond what it aims to extend: a Linux operating system with CVMFS installed and a Python 3.3+ runtime with a functioning venv module.

A full listing of all programs used outside of Bash shell builtins is:

Why is this needed?

When an LCG view or an ATLAS computing environment that uses software from CVMFS is set up, it alters the PYTHONPATH environment variable. Because the contents of all the installed software of an LCG view or ATLAS release are placed onto PYTHONPATH for the rest of the shell session, the protections and isolation of a Python virtual environment are broken. This cannot be fixed in a reliable and robust way without breaking access for other software in the LCG view or ATLAS environment that depends on the Python packages in them. The best that can be done is to control the directory tree at the head of PYTHONPATH in a stable manner, which recovers most of the benefits of a Python virtual environment (control of installation and versions of packages, isolation of the directory tree).
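To see what a view has placed on PYTHONPATH, and which directory currently sits at its head, it can help to print the entries one per line in search order. The helper below is a minimal sketch; the function name list_pythonpath_entries is hypothetical and is not part of cvmfs-venv.

```shell
# Hypothetical helper (not part of cvmfs-venv): print each entry of a
# colon-separated search path on its own line, in search order.
# With no argument it falls back to the current PYTHONPATH.
list_pythonpath_entries() {
    printf '%s\n' "${1:-${PYTHONPATH:-}}" | tr ':' '\n'
}
```

Running list_pythonpath_entries | head -n 1 after setting up a view shows the directory that Python will search first among the PYTHONPATH entries.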

While lcgenv allows for package-specific environment building, it still lacks the control to specify arbitrary versions of Python packages and will load additional libraries beyond what is strictly required by the target package's dependency requirements. That said, if you are able to use an LCG view or lcgenv without any additional setup, you may not need a Python virtual environment at all.

While Python's venv module does have the --system-site-packages option to

Give the virtual environment access to the system site-packages dir.

this unfortunately isn't quite enough. It does allow isolation to work, but the manipulation of PYTHONPATH causes a problem: packages can be installed properly in the local virtual environment and will show up in python -m pip list, yet if another version of that package is provided by the already set up environment, that version's location on PYTHONPATH takes precedence at runtime. Using --system-site-packages without cvmfs-venv is arguably even worse, as what pip reports as installed in the user's virtual environment (the pip list view) then differs confusingly from what the runtime environment actually imports.
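The precedence problem can be illustrated with a rough sketch that reports which entry of a colon-separated search path would provide a package first. This is a simplification and an assumption on my part: it only checks for a "<entry>/<name>/" directory or a "<entry>/<name>.py" file, which is far less than what Python's import system actually does (compiled extension modules, namespace packages, and .pth files are ignored). The function name first_provider is hypothetical.

```shell
# Hypothetical helper (not part of cvmfs-venv): print the first entry of a
# colon-separated search path that appears to contain the named package.
# Simplification: only "<entry>/<name>/" and "<entry>/<name>.py" are checked.
# The body runs in a subshell so the IFS and globbing changes do not leak out.
first_provider() (
    name="$1"
    search_path="${2:-${PYTHONPATH:-}}"
    IFS=':'
    set -f  # disable globbing while the path is split on ':'
    for entry in ${search_path}; do
        [ -n "${entry}" ] || continue
        if [ -d "${entry}/${name}" ] || [ -f "${entry}/${name}.py" ]; then
            printf '%s\n' "${entry}"
            exit 0
        fi
    done
    exit 1  # no entry provides the package
)
```

If the view's site-packages directory is listed before the virtual environment's, first_provider awkward reports the view's copy, which mirrors the situation described above where the older CVMFS version is the one imported.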

Caveat: This is mostly an LCG view specific issue. If nothing from LCG is used (for example a pure ATLAS AnalysisBase environment, or an environment in a Linux container), then --system-site-packages by itself should be sufficient.

How things work

cvmfs-venv provides a shim layer to manage activation and use of a Python virtual environment created with LCG view resources. It does this by copying the structure of the activate scripts generated by Python's venv module. When venv creates a virtual environment, it generates a shell script under the virtual environment's directory tree at bin/activate. This activate script controls and edits the shell environment variables PATH and PYTHONHOME: it places the virtual environment's bin/ directory onto PATH and unsets PYTHONHOME upon activation, and restores their original values when deactivate is run. cvmfs-venv simply extends this existing behavior to also place the virtual environment's site-packages/ directory onto PYTHONPATH during activation and to remove it on deactivation. This is done by injecting Bash snippets directly into the bin/activate script generated by venv at positions found relative to the manipulation of PYTHONHOME.
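The save, prepend, and restore pattern described above can be sketched as a pair of shell functions. This is an illustration only and not the snippets that cvmfs-venv actually injects: the names example_activate_hook, example_deactivate_hook, and _EXAMPLE_OLD_PYTHONPATH are hypothetical, and an empty PYTHONPATH is treated the same as an unset one, following the style venv's activate script uses for PYTHONHOME.

```shell
# Hypothetical sketch of an activate/deactivate hook pair for PYTHONPATH.
# These names are illustrative and are not taken from cvmfs-venv itself.

# Prepend the given site-packages directory to PYTHONPATH, saving the
# previous value so that it can be restored later.
example_activate_hook() {
    if [ -n "${PYTHONPATH:-}" ]; then
        _EXAMPLE_OLD_PYTHONPATH="${PYTHONPATH}"
        PYTHONPATH="$1:${PYTHONPATH}"
    else
        # Nothing to save: PYTHONPATH was unset or empty.
        unset _EXAMPLE_OLD_PYTHONPATH
        PYTHONPATH="$1"
    fi
    export PYTHONPATH
}

# Restore PYTHONPATH to the saved value, or unset it if nothing was saved.
example_deactivate_hook() {
    if [ -n "${_EXAMPLE_OLD_PYTHONPATH:-}" ]; then
        PYTHONPATH="${_EXAMPLE_OLD_PYTHONPATH}"
        export PYTHONPATH
        unset _EXAMPLE_OLD_PYTHONPATH
    else
        unset PYTHONPATH
    fi
}
```

Calling example_activate_hook with a virtual environment's site-packages directory puts that directory at the head of PYTHONPATH, ahead of anything an LCG view added, and example_deactivate_hook returns PYTHONPATH to its earlier state.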

Advantages

Disadvantages

Citation

The preferred BibTeX entry for citation of cvmfs-venv is

@software{cvmfs-venv,
  author = {Matthew Feickert},
  title = "{cvmfs-venv: v0.0.7}",
  version = {0.0.7},
  doi = {10.5281/zenodo.7751033},
  url = {https://doi.org/10.5281/zenodo.7751033},
  note = {https://github.com/matthewfeickert/cvmfs-venv/releases/tag/v0.0.7}
}