Open paugier opened 3 years ago
I forgot to ask explicitly: I'd like to investigate what happens, but I don't know how to do it, since it works fine with the commands over which I have a bit of control (python setup.py develop and pip install -e . --no-build-isolation).
How can I try to understand/fix this issue?
I used

[build-system]
requires = ["setuptools>=49.5.0", "wheel",
    "setuptools_scm[toml] @ file://localhost//home/pierre/Dev/setuptools_scm#egg=setuptools_scm",
    "setuptools_scm_git_archive"]

to try to understand what happens during the isolated build in setuptools_scm.
setuptools_scm calls Mercurial during the build. Strangely, from the isolated build, mercurial can't see its extension hg-git. Mercurial is installed in its own (conda) environment and should not be influenced by the isolated build.
It seems to be related to the fact that hg-git was installed in the Mercurial environment in editable mode (pip install -e .). If I reinstall hg-git with pip install hg-git -U (of course, in the Mercurial environment), Mercurial can correctly see hg-git from the isolated build.
Changing the behavior of an independent application is indeed a very strange behavior of an isolated build.
This sounds like an issue with hg-git or setuptools. The main difference between installing things with and without -e is that the executable for the latter is (currently) generated by setuptools, not pip.
Actually, there is no need to add -e to get the problem: even when I install snek5000 with pip install . I get the bug (i.e., during the isolated build, Mercurial cannot import an extension installed in its environment with pip install -e .).
I'm going to try to provide a cleaner way to reproduce.
I don't think the problem can be due to anything in hg-git. hg-git is just a simple Python package and Mercurial just detects hg-git by trying to import it. It seems that Mercurial gets an ImportError when the command hg is run from an isolated build.
The system Mercurial might need isolation from the virtualenv used by the build isolation. This may require additional tooling in setuptools_scm.
Here is a simple way to reproduce something similar with Ubuntu 20.04: https://github.com/paugier/reproducer-bug-isolated-build-hg/blob/main/.github/workflows/ci.yml
I used the system Mercurial /usr/bin/hg and installed hg-git with pip2 (since /usr/bin/hg still uses Python 2.7 in Ubuntu 20.04). pip2 automatically installs hg-git in --user mode. It is a very standard way to install Mercurial extensions (see https://foss.heptapod.net/mercurial/hg-git/-/tree/branch/0.10.x and https://foss.heptapod.net/mercurial/evolve).
We see here that the isolated build breaks the application Mercurial, which cannot import its extensions.
That's because pip runs the build in an environment where PYTHONNOUSERSITE is set to 1, to protect the build backend environment from being "polluted" by packages installed outside of the build environment.
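To make the mechanism concrete, here is a minimal sketch (not pip's actual code) showing that any Python interpreter started as a child process inherits the variable and disables its user site-packages directory. The helper name child_no_user_site is hypothetical and only serves this illustration.

```python
import os
import subprocess
import sys


def child_no_user_site(set_variable):
    """Return sys.flags.no_user_site as seen by a freshly started interpreter.

    Hypothetical helper: it starts a child Python with PYTHONNOUSERSITE
    either set to "1" or absent, and reports the resulting flag.
    """
    env = dict(os.environ)
    env.pop("PYTHONNOUSERSITE", None)  # start from a known state
    if set_variable:
        env["PYTHONNOUSERSITE"] = "1"
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.flags.no_user_site)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print("without the variable:", child_no_user_site(False))
    print("with PYTHONNOUSERSITE=1:", child_no_user_site(True))
```

A Python-based tool such as /usr/bin/hg is started the same way, so packages installed with --user become invisible to it whenever the parent process exports the variable.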
I'm not sure what the correct solution is here. IMO, a system tool like /usr/bin/hg should not be affected by the settings of Python environment variables like PYTHONNOUSERSITE, as those are intended to allow the user to control the behaviour of the Python interpreter (which is how pip is using them). I'd argue that the executable wrapper for Mercurial should set the Python environment variables to a "known state", rather than letting the parent process' values leak through. But that's easy for me to say, knowing what pip is trying to do, and there are almost certainly other considerations on the Mercurial side of things.
I don't think this is a pip issue as such (you can get the same problem just by manually setting PYTHONNOUSERSITE in your environment), but maybe we need a standard (essentially an add-on to PEP 517) clarifying a bit further what the environment in which a PEP 517 build backend is run must look like. Agreeing on such a standard would thrash out details like this in a way that all tools can rely on, rather than having it be a pip-specific implementation detail.
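The manual reproduction mentioned above could be scripted roughly as follows. This is a sketch under assumptions: it reuses the hg version -v --config extensions.hggit= invocation that appears later in this thread, and it assumes that a loaded extension shows up as "hggit" in the standard output. The function name extension_listed is hypothetical.

```python
import os
import shutil
import subprocess


def extension_listed(no_user_site, hg="hg"):
    """Return True/False if hg lists hggit, or None if hg is not installed.

    Assumption: a successfully loaded extension is named on stdout by
    "hg version -v"; a failed import is only reported on stderr.
    """
    exe = shutil.which(hg)
    if exe is None:
        return None
    env = dict(os.environ)
    env.pop("PYTHONNOUSERSITE", None)
    if no_user_site:
        env["PYTHONNOUSERSITE"] = "1"  # mimic what the isolated build exports
    result = subprocess.run(
        [exe, "version", "-v", "--config", "extensions.hggit="],
        env=env,
        capture_output=True,
        text=True,
    )
    return "hggit" in result.stdout


if __name__ == "__main__":
    print("normal environment:", extension_listed(False))
    print("PYTHONNOUSERSITE=1:", extension_listed(True))
```

If hg-git was installed in --user mode, the two results would be expected to differ, with no pip involved at all.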
Mercurial uses a custom script for its binary that is all Python.
My basic impression is that any program that's not explicitly isolating its executable from the surrounding context is affected.
I wonder if it's a reasonable workaround to hide the PYTHONNOUSERSITE env var in setuptools_scm, either via opt-in or via opt-out.
Such a workaround in setuptools_scm would quickly fix the issue for users of hg-git. If we wait for a fix in Mercurial, this bug will be there for years, even for not so old distributions.
Note that for hg calls we would also need to remove from PYTHONPATH the path that looks like /tmp/pip-build-env-veix2dp_/site. It contains a sitecustomize.py file which, IMHO, does not make sense for Mercurial.
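A minimal sketch of such a workaround, assuming the build-environment entries can be recognised by the "pip-build-env-" fragment seen in the path above. This is an illustration only, not the code of setuptools_scm, and the function name env_for_hg is hypothetical.

```python
import os


def env_for_hg(environ):
    """Return a copy of environ suitable for calling hg as an application.

    Assumption: PYTHONPATH entries injected by pip's isolated build contain
    the fragment "pip-build-env-"; other entries are kept untouched.
    """
    env = dict(environ)
    # Let Mercurial see packages installed in the user site again.
    env.pop("PYTHONNOUSERSITE", None)
    entries = env.get("PYTHONPATH", "").split(os.pathsep)
    kept = [p for p in entries if p and "pip-build-env-" not in p]
    if kept:
        env["PYTHONPATH"] = os.pathsep.join(kept)
    else:
        env.pop("PYTHONPATH", None)
    return env
```

The result could then be passed as the env argument of subprocess.run when invoking hg, leaving the environment of the build itself unchanged.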
In principle, it's a bit strange to also isolate at the level of applications called during the build. For example, nothing is done to really isolate Git or compilers. If the Python API of Mercurial was used during the build, isolation would clearly be good, but here, it's used as an application. Passing to hg the environment variables used to isolate the build environment is actually weird.
IMO, if an application is delivered as a standalone utility, you shouldn't be able to tell what language it's written in. So ignoring general Python environment variables should be the norm. And certainly, the application shouldn't stop working just because the user sets language-specific variables. Having said that, I accept this is not general practice. And I understand that the practicalities mean that it may be necessary to work around things at a higher level.
But if we want to make this robust (by which I mean, something that works in all tools - build as well as pip, for example) we need to agree on a set of expectations for how tools set up the build environment, and that needs standardising.
It's a common problem that Python applications typically do not protect against the environment.
The only distro that I'm aware of that manages it is NixOS, and that one doesn't share build isolation behaviour with any of the other well-known systems.
Build tools will have to be unnecessarily smart about this.
[Offtopic] I wonder whether pip's entry point wrappers should explicitly unset all Python-specific environment variables before running the Python interpreter? It would probably break too many applications...
That would need a new entry point and a PEP.
Would switching to use venv (which does not require us to set PYTHONNOUSERSITE) fix this? See the discussion in #6264 as well.
Yes, it seems that using venv would fix this issue. venv sets neither PYTHONPATH nor PYTHONNOUSERSITE.
$ time python -m venv tmp_venv_call_hg --without-pip
real 0m0.330s
user 0m0.260s
sys 0m0.078s
$ . tmp_venv_call_hg/bin/activate
$ python -c "from subprocess import run; run('hg version -v --config extensions.hggit='.split())"
Mercurial Distributed SCM (version 5.6.1)
(see https://mercurial-scm.org for more information)
[...]
Enabled extensions:
hggit external 0.10.2 (dulwich 0.20.25)
[...]
Let me try to summarize. I think this issue is now quite well understood.
It is related to the custom isolation used by pip and to the sensitivity of (at least some) Mercurial installations to environment variables like PYTHONPATH and PYTHONNOUSERSITE.
The solutions could be:
1. Improve Mercurial so that it ignores Python-specific environment variables. I don't know at which level this should be done; I guess it depends on the installation method. Even if the next version of Mercurial is improved, the issue will persist for most users, since people tend to use quite old versions of hg.
2. Implement the Mercurial isolation at the setuptools_scm level, i.e. remove PYTHONNOUSERSITE and clean PYTHONPATH in the environment used to call hg. It's technically very simple (I can even submit a PR) and would also work with other installation tools when the package uses setuptools_scm.
3. Use internally in pip a proper virtual environment created with venv --without-pip (no need to set or change PYTHONNOUSERSITE and PYTHONPATH). It would fix other similar problems, in particular for other Python applications used during the build. It would not fix the problem for other install tools, unless there is also a PEP on "expectations for how tools set up the build environment".
One thing I want to be sure of is whether a proper virtual environment indeed correctly ignores user-site packages without PYTHONNOUSERSITE (this is the reason why we need to set that flag right now). I'm about 99.9% certain it does, but someone should make sure.
After that, we can wait on pypa/build#361 and transplant that to pip to make everything work properly.
❯ py -m pip list -v
Package Version Location Installer
---------- ------------ ------------------------------------------------------------------------- ---------
pip 21.3.1 c:\users\pfm\appdata\local\programs\python\python39\lib\site-packages pip
setuptools 58.5.3 c:\users\pfm\appdata\local\programs\python\python39\lib\site-packages pip
Spans 1.1.1 c:\users\pfm\appdata\roaming\python\python39\site-packages pip
tzdata 2021.2.post0 c:\users\pfm\appdata\local\programs\python\python39\lib\site-packages pip
wheel 0.37.0 c:\users\pfm\appdata\local\programs\python\python39\lib\site-packages pip
PS 11:57 00:01.480 C:\Work\Support
❯ py -m venv xx
PS 11:58 00:03.809 C:\Work\Support
❯ .\xx\Scripts\pip.exe list -v
Package Version Location Installer
---------- ------- ------------------------------------ ---------
pip 20.2.3 c:\work\support\xx\lib\site-packages pip
setuptools 49.2.1 c:\work\support\xx\lib\site-packages pip
WARNING: You are using pip version 20.2.3; however, version 21.3.1 is available.
You should consider upgrading via the 'c:\work\support\xx\scripts\python.exe -m pip install --upgrade pip' command.
PS 12:02 00:01.631 C:\Work\Support
❯ dir env:PYTHON*
PS 12:02 00:00.005 C:\Work\Support
❯
Is that a sufficient check? Note that spans is in user site-packages in the system environment.
Yeah looks right to me!
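The same check can be scripted in a platform-independent way. The following is a rough sketch, not part of pip: it creates a throwaway virtual environment without pip and asks its interpreter whether the user site is enabled, with PYTHONNOUSERSITE and PYTHONPATH removed from the child environment. The function name venv_user_site_enabled is hypothetical.

```python
import os
import subprocess
import tempfile
import venv


def venv_user_site_enabled():
    """Return the printed value of site.ENABLE_USER_SITE inside a fresh venv."""
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = os.path.join(tmp, "probe")
        # Equivalent to "python -m venv --without-pip probe".
        venv.EnvBuilder(with_pip=False).create(env_dir)
        if os.name == "nt":
            python = os.path.join(env_dir, "Scripts", "python.exe")
        else:
            python = os.path.join(env_dir, "bin", "python")
        # Make sure the result is due to the venv, not to inherited variables.
        env = {
            key: value
            for key, value in os.environ.items()
            if key not in ("PYTHONNOUSERSITE", "PYTHONPATH")
        }
        result = subprocess.run(
            [python, "-c", "import site; print(site.ENABLE_USER_SITE)"],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print("site.ENABLE_USER_SITE in venv:", venv_user_site_enabled())
```

A printed value of False means the virtual environment ignores user site-packages on its own, without any environment variable being set.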
Description
I am trying to use pip install -e . to install a package (https://github.com/exabl/snek5000) cloned with Mercurial and hg-git. This package uses setuptools_scm to detect its version. setuptools_scm now supports using Mercurial as a Git client and provides the right version when using the commands python setup.py develop and pip install -e . --no-build-isolation. However, with a plain pip install -e . the package is installed correctly, but the detected version is completely wrong.
Expected behavior
No response
pip version
pip 21.3.1
Python version
CPython 3.9
OS
Linux
How to Reproduce
With Mercurial set up to work with hg-git
Output
No response