ctapobep / blog

My personal blog on IT topics in the form of GitHub issues.

Managing environments in Python #17


ctapobep commented 1 year ago

Python's environment and package management is a confusing mess of tools with similar names, so I wrote this post to document the tools and how to work with them. A lot of this is available in the official docs, but 2 resources were especially helpful: Bernat Gabor's "status quo of virtual environments" and this SO post. Maybe this is going to be useful for others as well.

Short story of Poetry

If you have only a vague understanding of virtualenv and venv, skip this part for now and return to it later. Here's a short version of how things work with Poetry:

  1. When Poetry is installed, it creates a virtual env with venv. It places its own package and its dependencies into that env.
  2. It creates an "executable" poetry (actually a shell script) and puts it on PATH. This lets the poetry command run, which in turn runs python from the venv that was set up in step 1.
  3. The script turns commands like poetry install into /path/to/venv/python poetry install
  4. But our project itself (dependencies, installation, tests, build) should be able to use a different Python and a different site-packages. So Poetry has to start yet another Python process, and this time it leverages virtualenv. You specify which Python to use with poetry env use /path/to/python. When initializing the project, Poetry can create the env on its own, but we can easily change that afterwards.
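Steps 3 and 4 both come down to one process launching a specific Python interpreter. Here is a minimal sketch of that idea. It is not Poetry's actual code: run_with_interpreter is a hypothetical helper, and sys.executable only stands in for /path/to/venv/python.

```python
import subprocess
import sys
from typing import Sequence


def run_with_interpreter(python: str, args: Sequence[str]) -> str:
    """Run the given interpreter with the given arguments and return its stdout."""
    result = subprocess.run(
        [python, *args],
        capture_output=True,
        text=True,
        check=True,  # raise if the child process exits with an error
    )
    return result.stdout.strip()


# Hypothetical usage: sys.executable stands in for /path/to/venv/python.
answer = run_with_interpreter(sys.executable, ["-c", "print(6 * 7)"])
print(answer)
```

Pointing the first argument at another interpreter is all it takes to run the same command against a different Python and a different site-packages.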

Probably the least confusing setup is:

  1. Install pyenv, then download the required Python: pyenv install [version]
  2. Install virtualenv and use it to create an environment right inside the project: virtualenv -p $(pyenv root)/versions/[version that you want]/bin/python .venv
  3. Tell Poetry to use the .venv inside the project: poetry config virtualenvs.in-project true

This way you can:

  1. Easily manage (re-create) virtualenvs using virtualenv instead of adding yet another layer of poetry commands
  2. Activate the same environment with the usual Python tools (outside Poetry) via source .venv/bin/activate. Alternatively, you can always run poetry run [same command here]

Tools to manage Python environment

Managing site-packages

virtualenv and venv create an additional site-packages for our project, separate from the system (global) site-packages. When creating an env with virtualenv, you can choose which local Python installation to base it on by passing -p /path/to/python.

These tools can't install different Python versions (though virtualenv may add such a feature at some point) - they can only use the existing ones.

Managing versions of Python

pyenv installs and activates different versions of Python so that you don't have to do it manually (though the standard OS tools aren't too complicated either). It doesn't manage site-packages in any way. It's an analog of nvm from the NodeJS world.

To start with it:

  1. Install pyenv
  2. pyenv install -l to list all Python versions available for installation
  3. pyenv install [version]
  4. pyenv versions to list already installed versions (including the one installed in the previous step)
  5. pyenv local [version] to set the Python version for the current directory/project (it creates a .python-version file inside the current dir), or pyenv global [version] to set it for the whole system
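The .python-version file from the last step is picked up from the current directory or from any of its parents. The lookup can be sketched roughly as follows. This is a simplified illustration, not pyenv's own code (pyenv also consults the PYENV_VERSION variable and its global version file); find_local_version is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def find_local_version(start: Path) -> Optional[str]:
    """Return the version from the nearest .python-version file, or None."""
    start = start.resolve()
    # Check the starting directory first, then walk up towards the root.
    for directory in [start, *start.parents]:
        candidate = directory / ".python-version"
        if candidate.is_file():
            words = candidate.read_text().split()
            return words[0] if words else None
    return None


print(find_local_version(Path.cwd()))
```

This is why running pyenv local once at the project root is enough for every subdirectory of the project.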

How these tools work internally (optional reading)

Types of modules in Python, as seen through the importers that load them:

import sys
for importer in sys.meta_path:
    print(repr(importer))

# plain Python:

<class '_frozen_importlib.BuiltinImporter'>
<class '_frozen_importlib.FrozenImporter'>
<class '_frozen_importlib_external.PathFinder'>

# with virtualenv:
<_distutils_hack.DistutilsMetaFinder object at 0x10a1a3310>
<_virtualenv._Finder object at 0x10a1936a0>
<class '_frozen_importlib.BuiltinImporter'>
<class '_frozen_importlib.FrozenImporter'>
<class '_frozen_importlib_external.PathFinder'>

PathFinder is the most interesting one: it looks into sys.path to find the modules. It goes through the entries from top to bottom ('' means current dir):

import sys
for path in sys.path:
    print(repr(path))

# plain Python

''
'/usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python39.zip'
'/usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9'
'/usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9/lib-dynload'
'/usr/local/lib/python3.9/site-packages'

# with virtualenv
''
'/usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python39.zip'
'/usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9'
'/usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9/lib-dynload'
'/Users/stas/projects/python-sandbox/proj1/lib/python3.9/site-packages'

Note that we can ask virtualenv to append the global site-packages of the system Python (the --system-site-packages option).
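The top-to-bottom search can be observed directly by handing PathFinder a custom list of directories. In the sketch below, demo_mod and the two temporary directories are made up for illustration; the same module name exists in both, and the earlier entry wins.

```python
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def locate(module_name: str, search_path: list) -> str:
    """Return the file that PathFinder would load for module_name."""
    spec = PathFinder.find_spec(module_name, search_path)
    if spec is None:
        raise ModuleNotFoundError(module_name)
    return spec.origin


with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    # The same (hypothetical) module exists in both directories.
    for directory in (first, second):
        Path(directory, "demo_mod.py").write_text("WHERE = %r\n" % directory)
    found = locate("demo_mod", [first, second])
    # True when the module was resolved from the first directory in the list.
    first_wins = Path(found).parent == Path(first)
    print(first_wins)
```

A virtual env relies on exactly this ordering: whichever site-packages ends up in sys.path is the one whose packages get imported.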

sys.prefix defines where to search for standard modules and global site-packages. It defaults to:

So to isolate the environment we need to be able to modify sys.path and (optionally).
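A quick way to see these values from inside an interpreter is to compare sys.prefix with sys.base_prefix. This is a small sketch; in_virtual_env is a hypothetical helper name, and the comparison assumes an env built on pyvenv.cfg (venv or a recent virtualenv).

```python
import sys


def in_virtual_env() -> bool:
    """True when the interpreter runs inside a venv/virtualenv environment."""
    # Inside an env, sys.prefix points at the env folder,
    # while sys.base_prefix still points at the original installation.
    return sys.prefix != sys.base_prefix


print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)
print("isolated:   ", in_virtual_env())
```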

venv

Available since Python 3.3. It's a standard library module, and it uses a pyvenv.cfg file (PEP 405). Here is one written by virtualenv, which relies on the same file (hence the virtualenv key):

home = /usr/local/opt/python@3.9/bin
implementation = CPython
version_info = 3.9.9.final.0
virtualenv = 20.17.1
include-system-site-packages = false
base-prefix = /usr/local/opt/python@3.9/Frameworks/Python.framework/Versions/3.9
base-exec-prefix = /usr/local/opt/python@3.9/Frameworks/Python.framework/Versions/3.9
base-executable = /usr/local/opt/python@3.9/bin/python3.9
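The file is a plain list of key = value lines, so it is easy to inspect from a script. Below is a simplified reader for illustration only; it is not the parsing logic the interpreter itself uses, and parse_pyvenv_cfg is a hypothetical name.

```python
from typing import Dict


def parse_pyvenv_cfg(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines into a dict; lines without '=' are skipped."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip().lower()] = value.strip()
    return settings


# A few lines taken from the sample file above.
sample = """\
home = /usr/local/opt/python@3.9/bin
include-system-site-packages = false
version_info = 3.9.9.final.0
"""
cfg = parse_pyvenv_cfg(sample)
print(cfg["home"])
print(cfg["include-system-site-packages"])
```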

Biggest limitation: it can only create virtual envs for the same Python version as the interpreter that runs it.
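The limitation is visible when venv is driven from Python code: the new env always records the version of the interpreter that created it. A minimal sketch, assuming the venv module is available in your installation; the demo-env folder name is arbitrary.

```python
import tempfile
import venv
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "demo-env"
    # with_pip=False keeps the sketch fast and offline; a real env usually wants pip.
    venv.create(env_dir, with_pip=False)
    # The generated config names the base installation and its version.
    cfg_text = (env_dir / "pyvenv.cfg").read_text()
    print(cfg_text)
```

There is no argument to venv.create that selects another Python; to get an env for a different version you have to run venv with that other interpreter.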

virtualenv

virtualenv is a third-party tool. When creating an env it:

  1. Copies some parts of the existing python installation (the executable)
  2. Soft-links some other parts (like the landmarks)
  3. Creates some folders (like site-packages) & configs
  4. And it has to create its own site.py to modify the default behaviour of Python - e.g. to optionally append the global site-packages to the sys.path

And when you run source my_env/bin/activate, it modifies PATH so that the env's bin folder comes first, which makes python resolve to the env's interpreter.
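The effect of activation on the environment variables can be modelled in a few lines. This is a sketch of the idea rather than a port of the real activate script (which also handles the shell prompt and deactivation); activate here is a hypothetical function, and /projects/my_env is a made-up path.

```python
import os
from typing import Dict, Mapping


def activate(environ: Mapping[str, str], env_dir: str) -> Dict[str, str]:
    """Return a copy of environ with the env's bin folder first on PATH."""
    bin_dir = os.path.join(env_dir, "bin")  # the folder is "Scripts" on Windows
    old_path = environ.get("PATH", "")
    updated = dict(environ)
    updated["VIRTUAL_ENV"] = env_dir
    # Prepending means the env's python is found before the system one.
    updated["PATH"] = os.pathsep.join([bin_dir] + ([old_path] if old_path else []))
    return updated


env = activate({"PATH": "/usr/bin"}, "/projects/my_env")
print(env["PATH"])
```

Nothing else is needed for isolation: once the env's python is the first one found on PATH, it locates its own pyvenv.cfg and site-packages by itself.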