Support `uv` as an alternative package installer

pfmoore commented 5 months ago

uv is much faster than pip when installing packages, and therefore makes creating transient environments much quicker. For example, `pip-run scipy -- -c pass` takes 16 seconds using pip, with all the relevant packages cached, but takes under 1 second with uv (also with everything cached).

A proof of concept is as simple as changing the command in pip_run.deps.load, from

cmd = (sys.executable, '-m', 'pip', 'install', '-t', sp(target)) + args

to

cmd = ('uv', 'pip', 'install', '-q', '--target', sp(target)) + args

With this change, only one test fails:

FAILED tests/test_scripts.py::test_pkg_loaded_from_alternate_index - AssertionError: assert 'devpi.net' in 'Import su...

This seems to be shallow, as it's failing because the uv installation output differs from pip's, rather than because of a functionality issue.

A more robust implementation would likely need to include a configuration option to choose uv or pip, and maybe bundle a private copy of uv.

I'd be happy to discuss details of the best UI for this, and provide a PR, if you think this is a reasonable feature to add.

jaraco commented 4 months ago

Thanks for proposing this idea! I haven't spent a lot of time yet thinking about how this tool could/should interface with uv, so your exploration is very much welcome.

I have been thinking about uv and what the implications are. Specifically, because pip-run is most useful when it's available in a given environment, I'd like for it (or something like it) to be available to users (e.g. something like pip run or uv run).

One big difference I've noticed and still need to understand more fully is that uv takes a different perspective on its installation demands. uv doesn't have to be installed in any particular environment, meaning there's no implicit environment to target. By contrast, when I run `py -3.10 -m pip-run ...`, I expect the command to run in the context of the Python 3.10 interpreter and its site-packages.

I need to learn more about how uv handles those scenarios, and maybe shift the strategy of pip-run to do something similar. If pip-run could be installed to a single location but then apply to any Python environment, that would be very much preferable to the current strategy of requiring pip-run to be present in each environment.

pfmoore commented 4 months ago

It's also worth noting that pip doesn't need to be installed in the environment these days (and personally I prefer not to do so).

I agree that tools that "act on the environment" are in a somewhat odd position here. Jupyter/IPython is another example I have trouble with - I want to have it available everywhere, without having to install it into the environment (especially as it's a big, complex set of packages - I was going to say "pip-run doesn't have that issue" but with 25 packages in a pip-run installation, I'm not sure that's true...)

I'd like to see a standard behaviour for tools when they need to discover the appropriate Python environment to work on. I think uv has a fairly reasonable approach although I'm sure that if we tried to standardise this, people would have differing views 🙂 For Python-based tools, there's historically been a tendency to use "the interpreter I'm running with" as a default, but that (a) doesn't translate well to non-Python tools, and (b) pollutes the user's environment with the tool's code and dependencies.

pfmoore commented 4 months ago

Also, regarding discovering the interpreter to use, having a --python option (short form -p) to explicitly specify the venv or executable to use is probably worthwhile (both pip and uv have this).
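
As a rough illustration of what such an option could look like, here is a minimal sketch using `argparse`. The helper name `resolve_python` and the parser layout are assumptions for illustration only, not pip-run's actual argument handling; the venv layout (`bin/python`, or `Scripts\python.exe` on Windows) is the standard one created by `venv`.

```python
import argparse
import os
import pathlib


def resolve_python(value):
    """Turn a --python value (venv directory or executable) into an executable path."""
    path = pathlib.Path(value)
    if path.is_dir():
        # Assumption: a directory is a venv with the standard layout.
        sub = ('Scripts', 'python.exe') if os.name == 'nt' else ('bin', 'python')
        return path.joinpath(*sub)
    # Anything else is treated as the path to an interpreter.
    return path


def build_parser():
    """Hypothetical parser that only knows about -p/--python."""
    parser = argparse.ArgumentParser(prog='pip-run', add_help=False)
    parser.add_argument('-p', '--python', type=resolve_python, default=None)
    return parser


# Unrecognised arguments are left alone so they can be handed to the installer.
options, remaining = build_parser().parse_known_args(['-p', '/usr/bin/python3', 'requests'])
```

Using `parse_known_args` keeps the remaining arguments untouched, which matters for a tool that forwards most of its command line to an installer.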

jaraco commented 4 months ago

> with 25 packages in a pip-run installation

That number seemed high to me, so I checked and for a recent Python, the number of packages installed is 13 (14 if you include pip). On Python 3.8, it's 18 (due to tomli, importlib_metadata, importlib_resources, zipp, and backports.tarfile). It's quite possible when you last checked, there were 25 dependencies.

I don't see a large number of (maintained) dependencies as a problem as much as a sign of a healthy project. uv, for example, has 169 dependencies. I'm hoping we can get to a place in the Python world where (like Node and Rust) having dependencies is seen as healthy re-use and not tech debt.

> I'd like to see a standard behaviour for tools when they need to discover the appropriate Python environment to work on. I think uv has a fairly reasonable approach although I'm sure that if we tried to standardise this, people would have differing views 🙂 For Python-based tools, there's historically been a tendency to use "the interpreter I'm running with" as a default, but that (a) doesn't translate well to non-Python tools, and (b) pollutes the user's environment with the tool's code and dependencies.
>
> Also, regarding discovering the interpreter to use, having a --python option (short form -p) to explicitly specify the venv or executable to use is probably worthwhile (both pip and uv have this).

All good suggestions. I'll consider them in #104 and use this ticket to focus on uv support.

jaraco commented 4 months ago

> I'd be happy to discuss details of the best UI for this, and provide a PR, if you think this is a reasonable feature to add.

Yes, please proceed with discussions about the UI and a PR. Here are a couple of considerations that come to mind.

Let's consider if we could somehow autodetect which installer the user would prefer to use. I'm not sure we can, but I'd like to have friendly defaults.
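
One possible shape for such a default, sketched under stated assumptions: honour an explicit preference if one is given, otherwise use uv only when it can be found on PATH. The environment variable `PIP_RUN_INSTALLER` and the function `choose_installer` are hypothetical names invented for this sketch; neither exists in pip-run.

```python
import os
import shutil


def choose_installer(environ=os.environ, which=shutil.which):
    """Pick 'uv' or 'pip': explicit preference first, then whatever is on PATH."""
    # PIP_RUN_INSTALLER is a hypothetical setting, not an existing pip-run option.
    preferred = environ.get('PIP_RUN_INSTALLER', '').strip().lower()
    if preferred in ('uv', 'pip'):
        return preferred
    # Fall back to uv only when its executable can actually be found.
    return 'uv' if which('uv') else 'pip'
```

Whether "uv is on PATH" is a good enough signal of what the user prefers is exactly the open question; the sketch only shows that the fallback itself is cheap to implement.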

Another thing to consider - how does uv know for which Python to resolve dependencies? In the proof-of-concept, uv will install dependencies based on the default Python, which might be 3.13, but then pip-run could be under Python 3.7 and fail to run due to unresolved dependencies on that older Python. Probably any solution will need to pass --python {sys.executable} to uv (and maybe do the same with pip for symmetry).
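
A minimal sketch of what passing the interpreter explicitly might look like, building on the two commands from the proof of concept. The function name `install_command` is hypothetical, `str(target)` stands in for the `sp(target)` helper used in the original snippet, and the exact combination of `--python` with `--target` has not been verified here against every pip or uv release.

```python
import sys


def install_command(installer, target, args, python=sys.executable):
    """Build an install command that always names the interpreter to resolve for."""
    target = str(target)
    if installer == 'uv':
        return (
            'uv', 'pip', 'install', '-q',
            '--python', python,
            '--target', target,
        ) + tuple(args)
    # pip's --python is a general option, so it goes before the 'install' subcommand.
    return (
        sys.executable, '-m', 'pip',
        '--python', python,
        'install', '-t', target,
    ) + tuple(args)


uv_cmd = install_command('uv', '/tmp/target', ['requests'], python='/usr/bin/python3.7')
pip_cmd = install_command('pip', '/tmp/target', ['requests'], python='/usr/bin/python3.7')
```

Keeping both branches in one function makes the symmetry explicit: neither installer is ever left to pick its own default interpreter.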

pfmoore commented 4 months ago

Just a quick comment about which Python to use.

My expectation (which doesn't match the reality right now) is that if I type `pip-run requests` I will get a REPL with requests available, using the same Python I'd normally expect. Which might well not be the one pip-run is using. This is exactly the same expectation as I have when running `pip install` or `uv pip install`, which will be discussed in #104.

jaraco commented 4 months ago

> I will get a REPL with requests available, using the same Python I'd normally expect.

I'm not sure "the same Python I'd normally expect" is a well-defined construct. There are a number of factors that affect which interpreter a user might expect:

- The command they type. A few possible commands include python, python3, and py.
- Which commands are available on the platform they're on. Windows users typically get py by default but not python3 (last I checked). Unix users typically get python3 by default but not py.

uv solves this issue by defining a process that's variable by platform but tries to honor that platform's conventions.

In my opinion, having ever-divergent processes across Windows and Linux is a bug, and ideally the tooling should work toward convergence rather than perpetuating the platform-specific behaviors.

pip-run helps converge these behaviors by providing pip-run on all platforms as well as -m pip-run on all platforms (regardless of how one invokes Python), but that all relies on (a) pip-run being available in the target environment and (b) pip being installed to the target environment.

If we introduce uv as an alternative installer, that brings in even more divergence, as now pip-run needs to either provide a convergent behavior across pip and uv or allow behavior to vary depending on which installer is present/elected. I'd lean toward the former, but I can easily imagine users wanting to have their installer's idiosyncrasies honored.

Given the entanglement between this issue and #104, I'm not confident we can come up with a durable solution to adopt uv without first addressing #104.

Please consider articulating your design/approach before spending too much time engineering. I think we may want to consider several different approaches before adopting one.

pfmoore commented 4 months ago

> Please consider articulating your design/approach before spending too much time engineering. I think we may want to consider several different approaches before adopting one.

No problem. It may take some time before I get round to this as life is a bit busy at the moment. I'll make sure I discuss design before implementing anything.

While I understand where you're coming from with regard to #104[^1], I don't think it's the job of this tool to resolve the problem of how people invoke Python. That's very much the core developers' issue, and if they don't resolve it, IMO we should work with existing conventions, no matter how much we may disagree with them, rather than fighting the ecosystem.

With that in mind, my view is:

Platform differences are inevitable, simply because the relevant PEPs (394 for Unix, 397 for Windows) mandate different behaviours across platforms.
That means that, in the absence of an activated virtual environment, the "system environment" for Windows users is whatever the py launcher picks, or if the launcher isn't available, whatever usable python command is available on PATH[^2]. For Unix, the system environment should be whatever python3 executable is on PATH, and if none is available, whatever python executable is present (I'd be willing to prefer python over python3 if people more familiar with Unix thought that was a reasonable thing to do nowadays).

If there is an activated virtual environment, that is what should be used. Detecting an activated virtual environment, given that pip-run could be running in its own, different, environment, could be problematic, though. A practical solution would be to check the VIRTUAL_ENV environment variable, as the standard activation scripts set that, and entry point wrappers, such as the ones that pipx uses, do not.

To make this work, we should always run the installer (whether pip or uv) with the --python option explicitly specifying the interpreter we want to use. We should never expose the installer's default behaviour.

We could consider extensions to this behaviour, such as uv's approach of automatically detecting a .venv directory and using it even when it's not activated. I'm not sure that's necessary for a tool like pip-run, which doesn't modify the environment except temporarily. But if it becomes a common and popular behaviour, we could adopt it at a later date.
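
The discovery order described above (activated virtual environment first, then the platform's conventional command) could be sketched roughly as follows. The function names are hypothetical, and the sketch deliberately skips two things raised in this comment: the Windows Store `python` stub, and `.venv` auto-detection.

```python
import os
import pathlib
import shutil


def venv_python(venv, windows):
    """Interpreter path inside a venv, assuming the standard venv layout."""
    sub = ('Scripts', 'python.exe') if windows else ('bin', 'python')
    return str(pathlib.Path(venv).joinpath(*sub))


def target_python(environ=os.environ, which=shutil.which, windows=(os.name == 'nt')):
    """Return the interpreter (or launcher) to act on, or None if nothing is found."""
    # An activated virtual environment wins; the standard activation scripts set VIRTUAL_ENV.
    venv = environ.get('VIRTUAL_ENV')
    if venv:
        return venv_python(venv, windows)
    # Otherwise follow platform conventions: py launcher first on Windows, python3 first on Unix.
    names = ('py', 'python') if windows else ('python3', 'python')
    for name in names:
        found = which(name)
        if found:
            return found
    return None
```

On Windows the `py` result is the launcher rather than an interpreter, so a real implementation would still need to ask the launcher which Python it selects before passing a path to the installer's `--python` option.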

This comment turned out to be sufficiently long and detailed that I'm going to copy it to #104, for completeness...

[^1]: This comment may be better on that issue, but I'd rather keep my response next to your comment, I hope that's OK.

[^2]: There's a potential issue with this as Microsoft provide the python wrapper that opens the Windows Store. I'll ignore that for now, in the interests of simplicity. I'd be inclined to ask Steve Dower for advice on dealing with that, if necessary.

jaraco / pip-run

Support `uv` as an alternative package installer #100