pypa / pip

The Python package installer
https://pip.pypa.io/
MIT License

Add option and support to install and compile packages for a different Python interpreter #5472

Open avylove opened 6 years ago

avylove commented 6 years ago

What's the problem this feature will solve? pip doesn't work with older Python interpreters, yet it ties the interpreter of the install environment to the Python interpreter running pip.

Describe the solution you'd like Add a command-line option '--python' which identifies an alternative Python interpreter. This alternative interpreter is only used to identify dependencies, determine wheel versions, and compile installed packages, while everything else is handled by the interpreter used to call pip.

Python 2.6 is currently not supported by pip and older versions of pip which do support Python 2.6 raise TLS/SSL warnings. This would allow a developer to pip install packages to unsupported environments like Python 2.6 using a newer version of Python and pip. A similar fate could be faced by Python 2.7 in the future or other Python interpreters unsupported by pip. This allows pip to continue supporting these interpreters without being stuck with maintaining compatibility with them.

Additionally, this would allow simplification of virtual environment creation such as those created with virtualenv or other bootstrapping tools.

Alternative Solutions

This behavior can be roughly approximated by using the '--target' option and then compiling the code with the target interpreter on the first run, but this approach doesn't handle requirements specific to that interpreter's version.

Additional context pip was downloaded from PyPI 98,814 times for Python 2.6 in the last 30 days. There is still significant demand for unsupported platforms.

pradyunsg commented 6 years ago

Hi @avylove! Thanks for filing this feature request. :)

I don't see the need for the latest version of pip to support EOLed versions of Python. #5172 is the place to discuss a possible pip 9.0.4 with better SSL support on Python 2.6.

pradyunsg commented 6 years ago

If there's any motivation for this other than indirectly supporting Py2.6, please do mention the same. Otherwise, feel free to close this issue. :)

avylove commented 6 years ago

Sorry, I think using Python 2.6 as the example use case gave the misleading impression that supporting it was the sole goal.

There are two main goals that I can identify, perhaps someone else has others.

  1. Support for unsupported or non-optimally supported platforms.

    This includes older versions of CPython, but could also include alternative interpreters. I believe upip, included with micropython-lib, only supports packages without custom setup.py code. This feature could allow creation of micropython environments using standard pip.

  2. More streamlined virtual environments.

    Typically, pip is installed into virtual environments to facilitate installation of other packages. This would remove the need to do this and result in smaller virtual environments, which would be faster to create and smaller to distribute.

Both of these would have downstream benefits for virtualenv by allowing it to be more flexible in terms of which interpreters are supported, to create smaller environments, and to remove the wheels that are shipped with it.

Essentially, this feature makes pip a much more flexible tool.

pfmoore commented 6 years ago

My view:

  1. "Support for unsupported platforms" - pretty much by definition, we don't want to support those platforms, so that's not a particularly good motivating example.
  2. "More streamlined virtual environments" - it's not obvious what the benefit is of removing the need to have pip installed in virtual environments. I'm not aware of any pressing issues caused by the fact that pip is preinstalled in virtualenv and venv, so while I can see it might have been nice if things had been that way from the start, I'm not sure it's that helpful from where we are now.

I can understand the idea that it would be nice if pip could run as a standalone utility, and treat its target environment as a black box that it can query for the details it needs (platform, API, directory structure, etc). But there are some pretty strong reasons why pip doesn't do that, which basically all boil down to "there's no robust and reliable way to get the target Python to tell us the information we need". If you have a solution to that problem, then it just might be worth considering this proposal, although I suspect it would still struggle for acceptance because of the limited benefits and the significant disruption to the code base.

If you do want to look at what would be involved in querying a target Python for information, be prepared to address the following questions:

  1. Reliable inter-process communication of the data (that works across platforms, across Python versions, and taking into account the need to losslessly transfer the full range of Unicode characters, as they may be used in pathnames).
  2. Configuration of the environment in which the target interpreter gets run when it's being queried. You don't want a PYTHONHOME environment variable to fiddle with what the interpreter considers to be site-packages - or do you? You definitely don't want your carefully crafted Unicode transport mechanism (see the previous point) to fail because the user has PYTHONIOENCODING set.
  3. Supporting all versions of Python (you were the one who mentioned Python 2.6!) even those that don't have the modern stdlib features that pip relies on currently (there's a reason why we don't support Python 2.6!)
  4. Performance - starting up a new process is a non-trivial cost, at least on some platforms, so a subprocess call every time you want to know the target's site-packages isn't likely to be acceptable.

There are probably quite a few others; these are just the major ones I've hit. And yes, I've done work in the past on introspecting Python installations "from the outside" and it's always been a much more painful process to do robustly than I'd have imagined.
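To make questions 1, 2 and 4 concrete, here is a minimal sketch of one possible query mechanism. It is an illustration under assumptions, not pip's code: the names `query_target`, `_clean_env` and `_QUERY` are hypothetical, only three fields are collected, and it assumes the target interpreter ships a `json` module. It relies on `json.dumps` escaping non-ASCII characters by default, so the bytes on the pipe are plain ASCII; it drops every `PYTHON*` environment variable before launching the target; and it caches the result so the subprocess cost is paid once per interpreter.

```python
import functools
import json
import os
import subprocess
import sys

# Code executed inside the target interpreter. It sticks to syntax and
# modules that old interpreters accept. json.dumps escapes non-ASCII
# characters by default, so the output is plain ASCII.
_QUERY = (
    "import json, sys\n"
    "try:\n"
    "    import sysconfig\n"
    "    purelib = sysconfig.get_paths()['purelib']\n"
    "except ImportError:\n"
    "    from distutils.sysconfig import get_python_lib\n"
    "    purelib = get_python_lib()\n"
    "info = {'version': list(sys.version_info[:3]),\n"
    "        'prefix': sys.prefix, 'purelib': purelib}\n"
    "sys.stdout.write(json.dumps(info))\n"
)


def _clean_env():
    """Return a copy of os.environ without any PYTHON* variables."""
    return {
        key: value
        for key, value in os.environ.items()
        if not key.upper().startswith("PYTHON")
    }


@functools.lru_cache(maxsize=None)
def query_target(python_exe):
    """Ask python_exe about itself once; repeated calls hit the cache."""
    raw = subprocess.check_output([python_exe, "-c", _QUERY], env=_clean_env())
    return json.loads(raw.decode("ascii"))
```

Calling `query_target(sys.executable)` twice starts only one subprocess. Whether dropping all `PYTHON*` variables is the right policy is exactly the open question raised in point 2.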

I've added a "deferred until PR" label, as I don't think there's much that can be productively added here until someone comes up with at least a proof of concept demonstrating how a feature like this would be implemented in practice.

avylove commented 6 years ago

@pfmoore, I added a couple of files at avylove/soccerfield@49652396d5cf552beb96585d388bfbc8546e3035. Not sure if I'm completely on the wrong track or not, but here's what I was thinking for introspection. I didn't have enough for a pull request and really just want to make sure this made sense before going further.

Essentially, inspect_target.py calls target_report.py with the supplied Python executable. target_report.py should be relatively simple and targeted to be the most compatible piece of code.

Assuming you're in the directory containing both files.

$ python3.6 inspect_target.py /usr/bin/python2.6
{'python_implementation': 'CPython',
 'python_version': '2.6.9',
 'site_mod_dir': '/usr/lib64/python2.6',
 'sys': {'base_prefix': None, 'prefix': '/usr', 'real_prefix': None},
 'sysconfig': {'include': '/usr/include/python2.6',
               'platinclude': '/usr/include/python2.6',
               'platlib': '/usr/lib64/python2.6/site-packages',
               'platstdlib': '/usr/lib64/python2.6',
               'purelib': '/usr/lib/python2.6/site-packages',
               'stdlib': '/usr/lib64/python2.6'},
 'user_site': '/home/guido/.local/lib/python2.6/site-packages'}
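For illustration, a report script producing output of this shape might look roughly like the following. This is a hypothetical reconstruction, not the target_report.py from the linked commit: the names `collect`, `write_report`, `_get_paths` and `PATH_KEYS` are assumptions, and the `distutils` fallback is a guess at how an interpreter without a top-level `sysconfig` module (such as Python 2.6) could be handled.

```python
# Hypothetical sketch of a report script; not the file from the linked commit.
import json
import os
import platform
import site
import sys

try:
    import sysconfig  # present on Python 2.7 and 3.2+

    def _get_paths():
        return sysconfig.get_paths()
except ImportError:  # e.g. Python 2.6, which only has distutils.sysconfig
    from distutils import sysconfig as _dsysconfig

    def _get_paths():
        return {
            "purelib": _dsysconfig.get_python_lib(),
            "platlib": _dsysconfig.get_python_lib(plat_specific=True),
            "stdlib": _dsysconfig.get_python_lib(standard_lib=True),
            "platstdlib": _dsysconfig.get_python_lib(
                plat_specific=True, standard_lib=True),
            "include": _dsysconfig.get_python_inc(),
            "platinclude": _dsysconfig.get_python_inc(plat_specific=True),
        }

# The install-scheme keys that appear in the example output.
PATH_KEYS = ("include", "platinclude", "platlib",
             "platstdlib", "purelib", "stdlib")


def collect():
    """Gather facts only; interpreting them is left to the caller."""
    paths = _get_paths()
    site_file = getattr(site, "__file__", None)
    if site_file:
        site_mod_dir = os.path.dirname(os.path.abspath(site_file))
    else:
        site_mod_dir = None
    return {
        "python_implementation": platform.python_implementation(),
        "python_version": platform.python_version(),
        "site_mod_dir": site_mod_dir,
        "sys": {
            "prefix": sys.prefix,
            "base_prefix": getattr(sys, "base_prefix", None),
            "real_prefix": getattr(sys, "real_prefix", None),
        },
        "sysconfig": dict((key, paths.get(key)) for key in PATH_KEYS),
        "user_site": getattr(site, "USER_SITE", None),
    }


def write_report(path):
    """Dump the report as JSON; the default ensure_ascii keeps the file ASCII."""
    with open(path, "w") as handle:
        json.dump(collect(), handle)
```

In a setup like the one described, the script would presumably end by calling `write_report` with a file path supplied by the calling process.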

Key points

  1. The introspection in pip seems to be called at various points. One goal here would be to consolidate that so it is all done at one time. If --python is given, the information is retrieved using a subprocess. If no --python option is given, the function would be called directly with the interpreter running pip. Either way, the same dictionary is returned and can be referenced where it is needed. This also allows for failing early in case the required paths can't be obtained.
  2. PYTHONPATH and PYTHONHOME are ignored when introspection is done in a subprocess. I think using them would lead to outcomes the user is not expecting and --target is already available for specifying alternative locations.
  3. JSON is used as transport format since it's supported in the standard library (since 2.6), easily converted to standard datatypes, and UTF-8 is the default encoding.
  4. In this example, I opened the file in text mode rather than binary since that's what the json module wants, but perhaps the JSON should be dumped as a string and written as UTF-8 encoded bytes? Not sure if that is required for portability or not.
  5. I used tempfile here, mainly to get around any console encoding. subprocess pipes could also be used. I assumed tempfile was less fragile and more portable.
  6. I skimmed the code to look for what information about the Python environment was being looked for. I'm not sure if I caught everything that was needed. I left out logic covering how to interpret the data since target_report.py should solely focus on collection.
  7. This approach won't work for every Python interpreter, but if it works for a good number without breaking current functionality, it's a win that can be improved upon over time.
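
The key points above could be combined roughly as follows. This is a sketch under assumptions, not pip's code or the code from the linked commit: `describe_interpreter`, `local_report` and `REPORT_SOURCE` are hypothetical names, and the report is reduced to two fields to keep the example short. The same dictionary shape comes back whether or not a target interpreter is given, PYTHONPATH and PYTHONHOME are removed from the subprocess environment, and the data travels as JSON through a temporary file rather than the console. Because `json.dump` escapes non-ASCII characters by default, the file contains only ASCII, which sidesteps the text-versus-binary question in point 4.

```python
import json
import os
import subprocess
import sys
import tempfile

# Source run by the target interpreter; sys.argv[1] is the output file.
REPORT_SOURCE = """
import json, sys
report = {'prefix': sys.prefix, 'version': list(sys.version_info[:3])}
with open(sys.argv[1], 'w') as handle:
    json.dump(report, handle)
"""


def local_report():
    """Same fields as REPORT_SOURCE, for the interpreter running this code."""
    return {"prefix": sys.prefix, "version": list(sys.version_info[:3])}


def describe_interpreter(python_exe=None):
    """Return the report dict, using a subprocess only when python_exe is set."""
    if python_exe is None:
        return local_report()

    # Ignore variables that would redirect the target's idea of its paths.
    env = {
        key: value
        for key, value in os.environ.items()
        if key not in ("PYTHONPATH", "PYTHONHOME")
    }

    # mkstemp plus an explicit close lets another process open the file,
    # including on platforms that lock files which are still open.
    fd, path = tempfile.mkstemp(suffix=".json")
    os.close(fd)
    try:
        # check_call raises CalledProcessError, so a broken target fails early.
        subprocess.check_call([python_exe, "-c", REPORT_SOURCE, path], env=env)
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    finally:
        os.remove(path)
```

Calling `describe_interpreter()` and `describe_interpreter(sys.executable)` should yield dictionaries with the same keys, which is the property point 1 asks for.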

pfmoore commented 6 years ago

Some very brief comments (I don't have much time at the moment):

Sorry, no time to add more, but this is what I meant when I said "it's complicated" :-) All of these cases have come up in real bug reports (obviously in different situations, as this is new code) so the weird cases I'm quoting really do exist...

avylove commented 6 years ago

@pfmoore Thanks for the feedback. Completely understand not having much time.

Just a few followups

Aside from implementation details, do you see any reason why this general approach won't work?

pfmoore commented 6 years ago

Thanks for the clarifications. As I said, I had little time, so apologies for missing the details you pointed out!

Generally, I don't think this is a worthwhile change (I'm not interested in any of the benefits you suggest, and I'd probably be against it even if you did provide an implementation), so I'm reluctant to mislead you by sounding too encouraging. But I guess you have a reasonable approach to asking the target interpreter for its details, and any specific issues can be tidied up as they get spotted - so no, I don't see any problems with this as a general approach.

avylove commented 6 years ago

I have some other projects that I need to wrap up, but I think this is a worthwhile effort. It will be particularly important when Python 2 support is dropped completely, but I imagine there are use cases we haven't even thought of yet. Once I get some cycles free, I'll start putting a pull request together. Hopefully, I can find some other people to help with the effort.

pradyunsg commented 5 years ago

As @avylove predicted, this is one of the strategies that came up when talking about pip and Python 2.7's EOL. :)

https://discuss.python.org/t/packaging-and-python-2/662

pradyunsg commented 5 years ago

There's clearly reignited interest in this. We've had some discussion about this on the thread linked above.

I think it's clear now, that there are multiple benefits to doing this.