colcon / colcon-core

Command line tool to build sets of software packages
http://colcon.readthedocs.io
Apache License 2.0
99 stars 44 forks source link

Disable sitecustomize paths if in virtual env #623

Closed LukaJuricic closed 1 month ago

LukaJuricic commented 4 months ago

Currently a sitecustomize python module is loaded during ament_python package installs. This impedes the use of a virtual environment during the install process, as is the case while using ament_virtualenv. Due to sitecustomize module path configuration, mimicking the venv one, even if a script is executed using the venv python binary, pyvenv.cfg is not found and the path configuration will not be correct.

With this fix, the sitecustomize path configuration will appy only while not using a virtual environment. In particular in the case of ament_virtualenv, the pip dependencies will be installed in the venv lib, while the modules of the package being built will still end up in the package install lib.

NB: ament_virtualenv is currently being ported to humble, I did not test it in the original targeted distribution (electron).

cottsay commented 4 months ago

This change entirely disables the mechanism by which colcon tells setuptools where to install the package. Without this, colcon will try to install to the interpreter default location, which is completely incorrect. Even when using a virtual environment as you're describing, the installation location will be incorrect. In some simple tests on my machine, this doesn't even solve the problem you're describing, and builds still aren't able to find packages provided by the venv.

Depending on how the virtual environment was created, there is another factor which is likely contributing to the issue you're facing.


Colcon always uses the same Python interpreter to build/install Python packages as the one which was used to invoke colcon itself. If you're using the colcon executable, it will therefore use the interpreter which was used when colcon-core was installed. This means that regardless of your virtual environment or what interpreter python or python3 are resolved to, your packages will probably not be built/installed using the interpreter you want.

This problem is a lot worse if your virtual environment uses a different Python ABI version from the one colcon is using. Even if you were able to force colcon to find the virtual environment's packages, they may not be usable by colcon's interpreter (as would be the case for compiled Python modules like the ones in platlib).

A naive suggestion might be to simply make colcon always use the python3 executable and not necessarily the one it was invoked with. Colcon itself loads setuptools to get information about the package prior to building it, so using a different interpreter (and likely setuptools version) for identification and metadata extraction from the one used to build the package sounds like a recipe for bugs. Additionally, not all systems use python3 as the "default" Python interpreter, namely Windows.

I don't think there's a perfect "solution" to this problem, but I can offer some workarounds that may work for you.

  1. Install colcon in your virtual environment, so that the default interpreter for the colcon executable is the one you want.
  2. When you want colcon to build packages using an alternative interpreter (such as the one provided in a virtual environment), use the module entry point to bypass the colcon executable that implicitly uses a different interpreter. While this is the cleanest solution, it does require that the venv interpreter can see all of the colcon packages and their dependencies, so --system-site-packages may be necessary.
$ which python3
/usr/bin/python3
$ python3 -m venv use_this --system-site-packages
$ . use_this/bin/activate
(use_this) $ which python3
~/use_this/bin/python3
(use_this) $ python3 -m colcon build ...
LukaJuricic commented 4 months ago

Scott, thank for having a look at this!

I'm a bit puzzled at your results, that's not at all what I get. I tried to:

  1. docker run -it ros:rolling bash
  2. install pip
  3. create a user and switch to it
  4. clone the patched colcon-core root and pip install .
  5. clone ros2/demos
  6. ~/.local/bin/colcon build --packages-select demo_nodes_py

I see the patched sitecustomize in build/demo_nodes_py/prefix_override, the modules in install/demo_nodes_py/lib/python3.10/site-packages/demo_nodes_py and the scripts in install/lib/demo_nodes_py.

As for ament_virtualenv, it works by calling colcon build in a shell without any virtual environment activated. Instead, the package to be installed must use a custom install command (provided by ament_virtualenv) in the setup.py, which adds a post-install script to the default install command. Only at this stage the virtual environment is created, the dependencies are installed in it and the shebangs of the scripts in install/lib/<pkg> are patched with the virtual environment python interpreter.

This means that the package is installed as a normal package, with the system python and the usual ROS install locations, while the only stage when the virtual environment python interpreter is used is to install the dependencies, through subprocess.call(["<pkg_install_dir>/venv/bin/python", "-m", "pip", "install", ...]). And this is exactly the stage where it fails without the proposed patch.

cottsay commented 1 month ago

I wasn't able to get ament_virtualenv to work, but I was able to reproduce the problem by attempting to use the venv it created while the sitecustomize was on PYTHONPATH. I see the problem now.

The breakage from the original patch that I was describing was verified by CI, where all of the python build tests were failing to find the package installed to the anticipated location (presumably colcon installed the package under /usr or /usr/local).


Since we know what the prefix is that we want to override, and we know that any subprocesses that activates a venv would change that prefix, we can just make the conditional compare to that instead.

Please verify that this change still resolves the issue for ament_virtualenv.

LukaJuricic commented 1 month ago

Awesome, thanks for taking a second look! I checked with our humble-ported version or ament_virtualenv, and it works. This reminded me that we still have to publish it :sweat_smile: . If anybody stumbles on this in the future, it will be most likely here.