pypa / pip

The Python package installer
https://pip.pypa.io/
MIT License

Build dependencies are not downloaded by `pip download` #7863

Open qwhelan opened 4 years ago

qwhelan commented 4 years ago

Environment

Description

A user reported an issue in my repo (https://github.com/pydata/bottleneck/issues/333): they were using pip download to create a local cache prior to installation, but saw the following error despite having an up-to-date copy of setuptools:

  ERROR: Could not find a version that satisfies the requirement setuptools (from versions: none)
  ERROR: No matching distribution found for setuptools

I was able to reproduce the behavior via a Dockerfile and identify that the root issue is that pip download is not fetching the PEP 517 dependencies. This error message appears to be reproducible for other packages (such as numpy) provided that pip download fetches a source package that requires a PEP 517 build step.

My findings can be found here: https://github.com/pydata/bottleneck/issues/333#issuecomment-599177688

Expected behavior

As far as expected behavior, I would expect pip download to fetch all packages needed to successfully install the target package(s), including build-only dependencies.

How to Reproduce

Dockerfile:

FROM nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04
RUN apt-get update
RUN apt-get install -y python3 python3-pip
RUN pip3 install -U pip
RUN cd /tmp && python3 -m pip download numpy --no-binary numpy -d ./
RUN ls -alh /tmp
RUN python3 -m pip install --no-binary numpy numpy --find-links /tmp --no-index
RUN pip3 list

Output:

Step 7/8 : RUN python3 -m pip install --no-binary numpy numpy --find-links /tmp --no-index
 ---> Running in b34d3e92b764
Looking in links: /tmp
Processing /tmp/numpy-1.18.1.zip
  Installing build dependencies: started
  Installing build dependencies: finished with status 'error'
  ERROR: Command errored out with exit status 1:
   command: /usr/bin/python3 /usr/local/lib/python3.6/dist-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-6p0zdywu/overlay --no-warn-script-location --no-binary numpy --only-binary :none: --no-index --find-links /tmp -- setuptools wheel 'Cython>=0.29.14'
       cwd: None
  Complete output (3 lines):
  Looking in links: /tmp
  ERROR: Could not find a version that satisfies the requirement setuptools (from versions: none)
  ERROR: No matching distribution found for setuptools
  ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python3 /usr/local/lib/python3.6/dist-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-6p0zdywu/overlay --no-warn-script-location --no-binary numpy --only-binary :none: --no-index --find-links /tmp -- setuptools wheel 'Cython>=0.29.14' Check the logs for full command output.
The command '/bin/sh -c python3 -m pip install --no-binary numpy numpy --find-links /tmp --no-index' returned a non-zero code: 1

pradyunsg commented 4 years ago

Thanks for a super detailed bug report @qwhelan!

pip download does not store the build dependencies for the package, which, IIUC, is what this issue is about.

I'm personally not sure about how pip download should handle build dependencies, so I'll think a bit more about this before coming back.

pganssle commented 4 years ago

@pradyunsg I think downloading the build dependencies of anything that uses PEP 518 and isn't satisfied by a wheel would be a reasonable choice. If the main idea is that pip download is used for caching everything you need to build and install the projects you're downloading, then the build-time dependencies need to be satisfied as well if they're needed.

(Obviously this won't work for projects that don't use PEP 518, but that seems like a good incentive for those projects to opt in.)
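
As an illustration of which downloads would need build dependencies under this rule, here is a minimal Python sketch (not pip's implementation). It scans a download directory, skips wheels, and reads the `[build-system] requires` list from the `pyproject.toml` inside each sdist. The function names `build_requires` and `scan` are hypothetical, `tomllib` needs Python 3.11+ (the `tomli` fallback is an assumption), and the sketch assumes each sdist unpacks to a single top-level directory.

```python
import tarfile
import zipfile
from pathlib import Path

try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the tomli backport is installed


def _pyproject_text(sdist):
    """Return the text of the top-level pyproject.toml in an sdist, or None."""
    if sdist.name.endswith(".zip"):
        with zipfile.ZipFile(sdist) as archive:
            for name in archive.namelist():
                # Assumed layout: <name>-<version>/pyproject.toml
                if name.count("/") == 1 and name.endswith("/pyproject.toml"):
                    return archive.read(name).decode("utf-8")
    elif sdist.name.endswith((".tar.gz", ".tgz")):
        with tarfile.open(sdist) as archive:
            for member in archive.getmembers():
                if (
                    member.isfile()
                    and member.name.count("/") == 1
                    and member.name.endswith("/pyproject.toml")
                ):
                    return archive.extractfile(member).read().decode("utf-8")
    return None


def build_requires(sdist):
    """Return the declared build requirements of an sdist, or None if absent."""
    text = _pyproject_text(sdist)
    if text is None:
        return None
    return tomllib.loads(text).get("build-system", {}).get("requires")


def scan(download_dir):
    """Map each sdist in download_dir to its declared build requirements."""
    result = {}
    for path in sorted(Path(download_dir).iterdir()):
        if path.suffix == ".whl":
            continue  # wheels are already built, so no build step is needed
        if path.name.endswith((".zip", ".tar.gz", ".tgz")):
            result[path.name] = build_requires(path)
    return result
```

A `None` value marks an sdist with no `[build-system] requires` declared, i.e. a project that has not opted in to PEP 518. The sketch only reads static metadata; requirements returned dynamically by the build backend, and the build dependencies of the build dependencies themselves, are not covered.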

NoahGorny commented 4 years ago

@uranusjr I tried to fiddle with pip a bit, and saw that when downloading numpy, for example, with --no-binary, we get:

python3 src/pip download numpy --no-binary numpy -d /tmp
Collecting numpy
  Downloading numpy-1.18.2.zip (5.4 MB)
     |████████████████████████████████| 5.4 MB 466 kB/s
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
  Saved /tmp/numpy-1.18.2.zip
Successfully downloaded numpy

It seems to me that we install the build deps and the wheel build requirements even though we should download them instead of installing them. Or am I just missing something? I am probably just a noob with all the resolving and reqs code, but maybe you could explain to me why installing the deps is needed in that case.

NoahGorny commented 4 years ago

@pradyunsg I tried to look at the relevant code and ultimately failed :( I do not know pip internal resolving so much, but this problem interested me. Do you have any ideas on this maybe? :smile:

pradyunsg commented 4 years ago

@NoahGorny The issue here is that there are two kinds of dependencies: build-time dependencies and run-time dependencies. pip download only downloads the run-time dependencies to the relevant directory, since for a long time the Python ecosystem did not have any good way to specify build-time dependencies.

That changed with PEP 518 and PEP 517, which added the ability to specify build-time dependencies, and a very compelling reason to use them. However, the pip download command did not start adding the build dependencies to the output folder. This gap is what we'd want to fix as part of this issue -- to store the build-time dependencies used and dump those distributions into the download destination, in addition to the run-time dependencies. The main thing to do for this is to keep track of what we download to build any package, and then move those downloads to the download directory.

However, this tracking could be very difficult given how pip's implementation works. pip's current approach for implementing the build isolation logic is dependent on (recursive) subprocesses [see req_tracker, build_env and stuff that they interact with] with no real inter-process communication. I'm personally not able to think of a "quick to do and not-ridiculously-difficult-to-implement-and-maintain" approach for actually implementing the tracking of build dependencies for pip download.

FWIW, once we get a certain amount of cleanup done (i.e. stop spawning a subprocess and do the installations in the build environment in-process w/ a stack, plus all the supporting refactoring), it should be possible to do so relatively easily. This is, however, a non-trivial task and certainly not "quick to do". :)

theCapypara commented 1 year ago

Hello! I want to share another use case for this. Flatpaks require a fully offline build environment, where all dependencies are pre-fetched. They created scripts to do this; however, due to this issue there is currently no way to do this for build dependencies.

The more packages migrate to isolated builds, the harder it becomes to maintain Flatpaks.

mitchnegus commented 11 months ago

Just ran into this problem myself.

Might it be possible to have some way of running pip download with a flag pointing just to whichever build dependencies are stated in the pyproject.toml file? This doesn't necessarily directly solve the issue that build dependencies are not included in plain old pip download, but at least would offer a quick and potentially more straightforward way to grab them all for offline installs.

@pradyunsg, since you said the recursive calls of the build isolation logic subprocesses might be tricky to finesse, I'm thinking that this might instead just copy the process that pip download -r requirements.txt uses... Just instead of -r requirements.txt we'd have something like -b pyproject.toml that will simply download whatever the package lists as

[build-system]
requires = ["..."]

(and maybe the pyproject.toml isn't necessary after the -b flag, since that seems almost implicit)

This wouldn't reproduce all the logic (just like requirements files don't reproduce logic, to my knowledge), but maybe it's sufficient for most simple sets of build dependencies?

I'm not sure exactly how pip implements pip download -r requirements.txt so I may be grossly misunderstanding something, but I'm happy to do more digging if this option sounds potentially appealing.
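
For illustration, the idea can be approximated outside pip today. The following is a minimal Python sketch under stated assumptions: the `-b` flag above is only a proposal and does not exist in pip, the helper names `build_requirements` and `pip_download_command` are hypothetical, and `tomllib` needs Python 3.11+ (the `tomli` fallback is an assumption). It reads `[build-system] requires` from a `pyproject.toml` and builds an ordinary `pip download --dest` command for those requirements.

```python
import sys
from pathlib import Path

try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the tomli backport is installed


def build_requirements(pyproject_path):
    """Return the [build-system] requires list from a pyproject.toml file."""
    with open(pyproject_path, "rb") as handle:
        data = tomllib.load(handle)
    return list(data.get("build-system", {}).get("requires", []))


def pip_download_command(pyproject_path, dest):
    """Build the argv that downloads a project's declared build requirements.

    Returns None when the file declares no build requirements.
    """
    requirements = build_requirements(pyproject_path)
    if not requirements:
        return None
    # Hypothetical stand-in for the proposed `-b pyproject.toml` behaviour.
    return [
        sys.executable, "-m", "pip", "download",
        "--dest", str(Path(dest)),
        *requirements,
    ]
```

Passing the returned list to `subprocess.run(command, check=True)` would perform the download. Like the proposal, this covers only the statically declared requirements of one project; it does not follow the build dependencies of any sdists that the download itself fetches.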

theCapypara commented 3 months ago

Hi! Is there an update on the state of the rewrites required to fix pip download?

This issue and https://github.com/pypa/pip/issues/1884 are severely impacting Flatpak builds for Python apps. Now that even some "core" Python packages have switched to using build dependencies, and with no solution available at the moment, this is sadly getting worse by the day.

pradyunsg commented 3 months ago

Sadly, no. If there was one, it'd be reported here.