pypa / setuptools

Official project repository for the Setuptools build system
https://pypi.org/project/setuptools/
MIT License
2.36k stars 1.15k forks source link

build_meta doesn't use a fresh dist directory, which causes ValueError unpacking tuple #1671

Closed pganssle closed 5 years ago

pganssle commented 5 years ago

If you use pip install . on a source directory that already has a dist, the existing directory structure is used when preparing the wheel. This causes an exception if an existing wheel with a different name exists, because build_meta assumes there's only one wheel in dist.

Here's a script to create a MWE repo:

#!/usr/bin/bash

mkdir /tmp/demo_dist_517
cd /tmp/demo_dist_517
echo "from setuptools import setup; setup()" > setup.py
echo "0.0.1" > VERSION
cat > setup.cfg << EOF
[metadata]
name = foo
version = file: VERSION
EOF
cat > pyproject.toml << EOF
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"
EOF

At this point your repo looks like this:

$ tree
├── pyproject.toml
├── setup.cfg
├── setup.py
└── VERSION

Create a wheel in dist, then change the version:

pip wheel . --no-deps -w dist
echo "0.0.2" > VERSION

Now try to create a wheel from the repo:

pip wheel . -w dist

This will trigger an error in build_meta: File "/tmp/pip-build-env-plomixa1/overlay/.../setuptools/build_meta.py", line 157, in _file_with_extension file, = matching ValueError: too many values to unpack (expected 1) Building wheel for foo (PEP 517) ... error Failed building wheel for foo

This is pretty easy to work around, obviously, since the user can just remove dist before invoking any pip commands, but it might be better to do the wheel build in a clean directory if possible.

pganssle commented 5 years ago

I could be wrong, but I think we can somewhat trivially fix this by changing build_meta.build_wheel to directly use --dist-dir=wheel_directory, so:

    sys.argv = sys.argv[:1] + ['bdist_wheel'] + \
        config_settings["--global-option"]
    _run_setup()
    if wheel_directory != 'dist':
        shutil.rmtree(wheel_directory)
        shutil.copytree('dist', wheel_directory)

Would turn to:

    sys.argv = sys.argv[:1] + ['bdist_wheel', '--dist-dir=%s' % wheel_directory] + \
        config_settings["--global-option"]
    _run_setup()

I'm not sure what happens if --dist-dir is specified twice, so maybe we want to detect if --dist-dir or -d is already in config_settings['--global-option'] and if so use the old behavior. Though that might not matter, since right now using --dist-dir will fail in some other weird way.

pradyunsg commented 5 years ago

FYI https://github.com/pypa/pip/commit/51086b371808c3ce093ef28a317df7b7880efffa

pganssle commented 5 years ago

@pradyunsg Is this causing major problems? I put it in my mental backlog and left as a possible sprint item just because it seemed like it would only affect maintainers who leave build artifacts around in their source directories. If it's causing real problems for other people, I'll at least bump it up in priority from "minor".

pradyunsg commented 5 years ago

Nope -- this is pretty minor IMO.

I wanted to note here that this showed up while testing pip wheel when that broke due to PEP 517 changes on pip's side. Should've been less terse, sorry!

florisla commented 5 years ago

Hi, can the title of this issue be changed to include "ValueError: too many values to unpack" ? That would make it easier to Google when you encounter the bug -- like I did recently.

This bug is easily triggered in this (not very far-fetched) scenario:

pganssle commented 5 years ago

Not sure we need to change the title of the issue, that's a fairly generic title for the issue. I've changed the issue title to mention ValueError in case that helps.

Hopefully this won't persist very long as it's a fairly easy issue to fix, the hardest part of the issue will be writing a test for it, but even that shouldn't be too tough since it just requires translating my reproduction steps into Python.

florisla commented 5 years ago

If it can help, I've implemented a test here: https://github.com/florisla/setuptools/commit/3c3c6e568feaad654efda4bfe1275643f0f4872d .

It produces 'ValueError: too many values to unpack' as expected.

pganssle commented 5 years ago

@florisla Yes that would be very useful. Can you mark it as xfail and make a PR?

The name can also be shortened pretty considerably - the build_meta tests all assume a pyproject.toml exists, since build_meta is our PEP 517 backend.

ariciputi commented 5 years ago

Hi, I have a slightly different use case that seems to trigger the same issue reported here.

I'm using pipenv with a Python project of mine. The project is structured with the src/-based layout, so that in order to easily run the tests I pipenv install -e . it.

Upon release I want to create the wheels for my application and all its dependencies in order to push them on a local PyPI repo. To do that pip wheel --wheel-dir dist --requirement requirements.txt is issued, where the requirements.txt file is automatically generated via pipenv lock --requirements.

Because of the pipenv install -e . the requirements.txt file includes also a -e . entry, and I think this is the reason why I get the error reported here.

Is this a valid use case or I'm just doing it wrong?

Thanks.

pganssle commented 5 years ago

@ariciputi This is definitely a bug in setuptools, but it seems you can work around it pretty easily with pip wheel --wheel-dir wheels --requirement requirements.txt.

ariciputi commented 5 years ago

@pganssle thanks for the hint.

webknjaz commented 5 years ago

@pganssle FTR this is affecting building a bunch of manylinux1 wheels since OS-specific wheels require the same source build against multiple Pythons. All current recommendations share a script looping all pythons with pip wheel...

webknjaz commented 5 years ago

@pganssle I've spent almost an entire day today with this and turns out that your advice from https://github.com/pypa/setuptools/issues/1671#issuecomment-482754164 doesn't exactly work in case if there was ./dist pre-existing.

So the problem is that it's hard-coded.

In my manylinux1 generation scripts I tried smth like pip wheel --no-deps . -w /tmp/orig_dists.xxxx/py35 to have dedicated dist dir for each invocation per-python.

Then, I auditwheel them.

Finally, I copy stuff back to dist/.

The problem was that I was running two containers (x86_64 and i686). So the first one creates ./dist and the second one picks it up and magically fails with this bug...

Now I have ideas on how to work around this properly but I think that this issue is important enough to be solved ASAP because I think that may manylinux1 wheels maintainers will hit it sooner or later. And it's quite tricky to debug.

The bug originates here (in particular): https://github.com/pypa/setuptools/blob/64e60fc32981a1615c35962a60297d264bf16734/setuptools/build_meta.py#L168-L170

pganssle commented 5 years ago

@webknjaz Yes, I believe I explained how to solve this issue in this comment, that's why it's tagged "good first issue".

You can work around it by never using dist/ in your own workflow, which seems pretty easy to do. Feel free to submit a PR, I don't have time to fix it myself before PyCon, and I figured I would try to entice someone to fix it during the sprints, but there's plenty of stuff to fix around here that we don't need to be hoarding "easy" issues.

Most of the work of the PR would be properly designing a test.

webknjaz commented 5 years ago

Yeah... It was just not immediately obvious to me why it was still failing after changing -w and lead in a misleading direction. That's why I thought I'd better document my problem for anyone else facing it :)

shashanksingh28 commented 5 years ago

I had a look at this and as @pganssle mentioned, the issue is that build_meta.build_wheel (and also build_meta.build_sdist) assume that there will only be 1 .whl or .tar.gzin the wheel or sdist directory.

If we agree that a user could possibly want to populate a dist or equivalent directory to hold multiple .whl versions, then the fix is slightly more complex.

self.run_setup() is the method that creates the wheel / tar on the directory that was provided to it. If this method could return the path of the final .whl or .tar.gz that was built as a result of its invocation, it would be trivial to just return that and not use the whole _file_with_extension method with its one file assumption. However, I am sure it is non-trivial (or a much more involved discussion) because it is essentially running a user defined setup.py script.

That leaves us with an option of heuristically determining, amongst a list of possible .whl files, the one that might have been the result of the current setup invocation, which again, I am not sure is the ideal solution.

I don't think this would work since the issue is that the build_directory provided by the caller could have existing things and the problem here is determining what in that dir was just built. But I could be wrong given my limited knowledge. So we could fall back to the original idea in the main issue:

This is pretty easy to work around, obviously, since the user can just remove dist before invoking any pip commands, but it might be better to do the wheel build in a clean directory if possible.

Implement this by checking if the wheel_directory already has a .whl extension file, we create a fresh temp-dir, pass that to build_meta.build_wheel, find the 1 .whl file there, copy it to the given wheel_directory (and store this path), cleanup the temp-dir and return the stored path.

I could raise a PR to do this if it sounds sensible?

benoit-pierre commented 5 years ago

That's what I was thinking:

 setuptools/build_meta.py | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git i/setuptools/build_meta.py w/setuptools/build_meta.py
index 47cbcbf6..51113856 100644
--- i/setuptools/build_meta.py
+++ w/setuptools/build_meta.py
@@ -32,6 +32,7 @@
 import tokenize
 import shutil
 import contextlib
+import tempfile

 import setuptools
 import distutils
@@ -182,14 +183,15 @@ def build_wheel(self, wheel_directory, config_settings=None,
                     metadata_directory=None):
         config_settings = self._fix_config(config_settings)
         wheel_directory = os.path.abspath(wheel_directory)
-        sys.argv = sys.argv[:1] + ['bdist_wheel'] + \
-            config_settings["--global-option"]
-        self.run_setup()
-        if wheel_directory != 'dist':
-            shutil.rmtree(wheel_directory)
-            shutil.copytree('dist', wheel_directory)
-
-        return _file_with_extension(wheel_directory, '.whl')
+        with tempfile.TemporaryDirectory(dir=wheel_directory) as tmp_dist_dir:
+            sys.argv = sys.argv[:1] + [
+                'bdist_wheel', '--dist-dir', tmp_dist_dir
+            ] + config_settings["--global-option"]
+            self.run_setup()
+            wheel_basename = _file_with_extension(tmp_dist_dir, '.whl')
+            os.rename(os.path.join(tmp_dist_dir, wheel_basename),
+                      os.path.join(wheel_directory, wheel_basename))
+            return wheel_basename

     def build_sdist(self, sdist_directory, config_settings=None):
         config_settings = self._fix_config(config_settings)
pganssle commented 5 years ago

@shashanksingh28 No need to do anything quite so complicated, we can just use the temporary directory provided to us by the PEP 517 front-end, as in the solution I described in this comment.

The most difficult part of making this PR will be creating a test for it. One trivial way to do that is to do something like this test, but using build_wheel instead of build_sdist and populating files like this:

        files = {
            'setup.py': DALS("""
                __import__('setuptools').setup(
                    name='foo',
                    version='0.1.0',
                )"""),
        'dist/foo-0.0.0-py2.py3-none-any.whl'
        }

I'd add the test first to make sure it triggers the bug. Another option is to basically implement the MWE from the original post as a test, and run build_wheel once, change the version, then run it again. Feel free to add both tests.

pganssle commented 5 years ago

@benoit-pierre Why? I'm pretty sure wheel_directory is already a temporary directory created as part of the PEP 517 build.

I think it's fine to add a temporary directory on top of it, I guess. That will be helpful if someone configures their frontend to not build the wheels in a clean directory for whatever reason.

benoit-pierre commented 5 years ago

It's not explicitly mentioned in the PEP that it's a temporary directory, at least not in the build_wheel section.

pganssle commented 5 years ago

OK, I don't think there's any harm in creating our own temporary directory for this. In that case we'll probably also want to add a test (we can just parametrize the first test) for the situation where wheel_directory already has a wheel in it.

webknjaz commented 5 years ago

I like @benoit-pierre's proposal with a tmpdir. It's clean and easy to understand. One minor suggestion: move the return instruction outside of with-block.

pganssle commented 5 years ago

Er, hold up, I just remembered that #1726 already implements a test for this and was just blocked on a bug that has since been fixed. Let me merge that.

pganssle commented 5 years ago

1726 should merge as soon as CI passes. @shashanksingh28 Feel free to submit a PR based on it that:

  1. Adds the fix
  2. Adds a changelog entry
  3. Removes the xfail decorator.
shashanksingh28 commented 5 years ago

Just realized that the tempdir idea won't cut it.

Turns out run_setup() always creates the .whl in the 'dist' folder. So if the user provides a wheel_directory that is not dist, run_setup will put the latest .whl in the dist folder and then do a copytree as per this section. This will copy both .whl files to the wheel_directory and we again have the same problem.

A possible solution would be to get the latest .whl file generated in the dist dir?

Something like:

def _file_with_extension(directory, extension):
    "return the latest file with given extension in the directory"
    matching = (os.path.join(directory, f) for f in os.listdir(directory) if f.endswith(extension))
    matching = sorted(matching, key=os.path.getctime)
    return os.path.basename(matching[-1])

Is it necessary for a given wheel_directory to completely match dist? I believe if the caller is providing one they only want the current wheel to be there...

This approach will still have inconsistencies.. If one does : pip wheel . -w non_dist_dir then the /dist could contain multiple wheels but if they do pip wheel . -w dist or just pip wheel . then dist will always have only the latest wheel.

shashanksingh28 commented 5 years ago

I wonder if really the simplest solution is to ensure dist never has any .whl apart from the latest version :thinking:

benoit-pierre commented 5 years ago

Turns out run_setup() always creates the .whl in the 'dist' folder.

I'm not seeing any issue with my patch above, and the test pass.

benoit-pierre commented 5 years ago

run_setup() behave as if python setup.py was used, and you can test for yourself, python setup.py bdist_wheel -d dir work as expected. At least for me. I'm using wheel==0.33.1, what's your version?

shashanksingh28 commented 5 years ago

I'm not seeing any issue with my patch above, and the test pass.

Ah I see why. Because you explicitly pass the tempdir as the --dist-dir argument (as Paul suggested and hence you don't need to copy from the default dist dir as the code currently does). I was missing that in my implementation.

Thanks :+1:, will put something that is compatible with python2 as well that mimics your patch

webknjaz commented 5 years ago

@pganssle now that it's fixed. May I ask for a new bugfix release, please?

pganssle commented 5 years ago

@webknjaz I have no objections, but I won't have much time until this weekend at the earliest. @benoit-pierre or @jaraco feel free to cut a release.

benoit-pierre commented 5 years ago

I'll do it.

webknjaz commented 5 years ago

@benoit-pierre thank you!