Open qwhelan opened 4 years ago
Thanks for a super detailed bug report @qwhelan!
pip download does not store the build dependencies for the package, which IIUC is what this issue is about.
I'm personally not sure about how pip download should handle build dependencies, so I'll think a bit more about this before coming back.
@pradyunsg I think downloading the build dependencies of anything that uses PEP 518 and isn't satisfied by a wheel would be a reasonable choice. If the main idea is that pip download is used for caching everything you need to build and install the projects you're downloading, then the build-time dependencies need to be satisfied as well if they're needed.
(Obviously this won't work for projects that don't use PEP 518, but that seems like a good incentive for those projects to opt in.)
@uranusjr I tried to fiddle with pip a bit, and saw that when downloading numpy, for example with --no-binary, we get:
python3 src/pip download numpy --no-binary numpy -d /tmp
Collecting numpy
Downloading numpy-1.18.2.zip (5.4 MB)
|████████████████████████████████| 5.4 MB 466 kB/s
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
Saved /tmp/numpy-1.18.2.zip
Successfully downloaded numpy
Seems to me like we install the build deps and the build-wheel requirements even though we should download them instead of installing them, or am I just missing something? I am probably just a noob with all the resolving and reqs code, but maybe you could explain to me why installing deps is needed in that case.
@pradyunsg I tried to look at the relevant code and ultimately failed :( I do not know pip internal resolving so much, but this problem interested me. Do you have any ideas on this maybe? :smile:
@NoahGorny The issue here is that there are 2 kinds of dependencies: build-time dependencies and run-time dependencies. pip download only downloads the install-time dependencies to the relevant directory... since for a long time, the Python ecosystem did not have any good way to specify build-time dependencies.
That changed with PEP 518 and PEP 517, which added the ability to specify build-time dependencies, and a very compelling reason to use them. However, the pip download command did not start adding the build dependencies to the output folder. This gap is what we'd want to fix as part of this issue -- to store the build-time dependencies used and dump those distributions into the download destination, in addition to the run-time dependencies. The main thing to do for this is to keep track of what we download to build any package, and then move those files to the download directory.
However, this tracking could be very difficult given how pip's implementation works. pip's current approach for implementing the build isolation logic is dependent on (recursive) subprocesses [see req_tracker, build_env and stuff that they interact with] with no real inter-process communication. I'm personally not able to think of a "quick to do and not-ridiculously-difficult-to-implement-and-maintain" approach for actually implementing the tracking-of-build-dependencies for pip download.
FWIW, once we get a certain amount of cleanup done (i.e. stop spawning a subprocess and do the installations in the build environment in-process w/ a stack, and all the supporting refactoring), it should be possible to do so relatively easily. This is however a non-trivial task and certainly not "quick to do". :)
Hello! I want to share another use case for this. Flatpaks require a fully offline build environment, where all dependencies are pre-fetched. They created scripts to do this; however, due to this issue there is currently no way to do this for build dependencies.
The more packages migrate to isolated builds, the harder it becomes to maintain Flatpaks.
Just ran into this problem myself.
Might it be possible to have some way of running pip download with a flag pointing just to whichever build dependencies are stated in the pyproject.toml file? This doesn't necessarily directly solve the issue that build dependencies are not included in plain old pip download, but at least would offer a quick and potentially more straightforward way to grab them all for offline installs.
@pradyunsg, since you said the recursive calls of the build isolation logic subprocesses might be tricky to finesse, I'm thinking that this might instead just copy the process that pip download -r requirements.txt uses... Just instead of -r requirements.txt we'd have something like -b pyproject.toml that will simply install whatever the package lists as
[build-system]
requires = ["..."]
(and maybe the pyproject.toml isn't necessary after the -b flag, since that seems almost implicit)
This wouldn't reproduce all the logic—just like requirements files don't reproduce logic to my knowledge— but maybe it's sufficient for most simple sets of build dependencies?
I'm not sure exactly how pip implements pip download -r requirements.txt so I may be grossly misunderstanding something, but I'm happy to do more digging if this option sounds potentially appealing.
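To make the proposal above concrete, here is a minimal sketch of what such a helper could do outside of pip itself. It is not how pip works internally, and the -b flag is hypothetical: the function names (read_build_requires, build_download_command, download_build_requires) are invented for illustration. It only reads the static requires list from [build-system] and hands it to an ordinary pip download -d call.

```python
"""Hypothetical sketch: fetch the static build requirements listed in pyproject.toml."""
import subprocess
import sys
from pathlib import Path
from typing import List

try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    # Assumption: on older interpreters the third-party "tomli" backport is installed.
    import tomli as tomllib


def read_build_requires(pyproject_text: str) -> List[str]:
    """Return the [build-system] requires list, or [] if the table or key is absent."""
    data = tomllib.loads(pyproject_text)
    return list(data.get("build-system", {}).get("requires", []))


def build_download_command(requires: List[str], dest: str) -> List[str]:
    """Build the argv for a plain `pip download -d <dest> <requirements...>` call."""
    return [sys.executable, "-m", "pip", "download", "-d", str(dest), *requires]


def download_build_requires(pyproject_path: str, dest: str) -> List[str]:
    """Download the static build requirements of one project into dest."""
    text = Path(pyproject_path).read_text(encoding="utf-8")
    requires = read_build_requires(text)
    if requires:  # nothing to fetch when no build requirements are declared
        subprocess.run(build_download_command(requires, dest), check=True)
    return requires
```

Calling download_build_requires("pyproject.toml", "/tmp") from an unpacked sdist would place the declared build requirements next to the other downloads. Like requirements files, this does not reproduce the full logic: a PEP 517 backend can report additional requirements through its get_requires_for_build_wheel hook, and any build dependencies of the downloaded packages themselves are not followed.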
Hi! Is there an update on the state of the rewrites required to fix pip download?
This issue and https://github.com/pypa/pip/issues/1884 are severely impacting Flatpak builds for Python apps. With the fact that now even some "core" Python packages have switched to using build dependencies and there not being any solution at the moment, this is getting worse by the day sadly.
Sadly, no. If there was one, it'd be reported here.
Environment
Description
An issue was reported in my repo https://github.com/pydata/bottleneck/issues/333 by a user utilizing pip download to create a local cache prior to installation, but was seeing the following error despite having an up-to-date copy of setuptools:
I was able to reproduce the behavior via a Dockerfile and identify that the root issue is that pip download is not fetching the PEP 517 dependencies. This error message appears to be reproducible for other packages (such as numpy) provided that pip download fetches a source package that requires a PEP 517 build step.
My findings can be found here: https://github.com/pydata/bottleneck/issues/333#issuecomment-599177688
Expected behavior
As far as expected behavior, I would expect pip download to fetch all packages needed to successfully install the target package(s), including build-only dependencies.
How to Reproduce
Dockerfile:
Output: