pypa / setuptools

Official project repository for the Setuptools build system
https://pypi.org/project/setuptools/
MIT License
2.36k stars 1.15k forks source link

setuptools pkg_resources fails to detect version properly #419

Closed ghost closed 8 years ago

ghost commented 8 years ago

Originally reported by: Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL)


When installing a package:

  File "/home/larsoner/.local/lib/python2.7/site-packages/setuptools/dist.py", line 268, in __init__
    self.fetch_build_eggs(attrs['setup_requires'])
  File "/home/larsoner/.local/lib/python2.7/site-packages/setuptools/dist.py", line 313, in fetch_build_eggs
    replace_conflicting=True,
  File "/home/larsoner/.local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 836, in resolve
    dist = best[req.key] = env.best_match(req, ws, installer)
  File "/home/larsoner/.local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1074, in best_match
    dist = working_set.find(req)
  File "/home/larsoner/.local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 711, in find
    raise VersionConflict(dist, req)
pkg_resources.VersionConflict: (numpy 1.11.0.dev0-2329eae (/home/larsoner/.local/lib/python2.7/site-packages), Requirement.parse('numpy>=1.6.1'))

In other words, my numpy 1.11.0.dev0 is not seen as satisfying a >= 1.6.1 version requirement. It seems like using disutils.version.LooseVersion or so should solve this. I'm happy to look into making a PR if this seems like the right solution.


ghost commented 8 years ago

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Certainly. I was only awaiting confirmation on this fix.

ghost commented 8 years ago

Original comment by reinout (Bitbucket: reinout, GitHub: reinout):


I checked it and yes, it works like a charm. Thanks for the quick fix!

In the mean time I also got some more error reports like this from colleagues, so would a 18.7.2 be possible?

ghost commented 8 years ago

Original comment by reinout (Bitbucket: reinout, GitHub: reinout):


I'll look at it tomorrow. I had an accident yesterday and spend a day quietly at home :-)

ghost commented 8 years ago

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


The more I think about the situation that @reinout has encountered, the more I think setuptools is doing the right thing here, because there is a distutils package there that is presumed to have a .egg-info directory, but that directory is improperly configured. In that case, raising an exception seems like the right thing to do.

I do recognize that this new error is a regression from the 18.7 and 18.7.1 behavior, which did have an explicit check for 'isfile'. I've updated the FileMetadata class to perform this check in has_metadata, which seems right to me.

@reinout, can you test against the latest commit and confirm that this fix addresses the issue?

ghost commented 8 years ago

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Avoid errors when metadata directory is broken. Ref #419.

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


Ahh, cool didn't notice the change to ifilter. Makes sense to be complete until a performance issue arises I suppose. Does that mean that @reinout's issue is fixed now?

ghost commented 8 years ago

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


@Eric89GXL

First, I did update 'filter' to use ifilter on Python 2 to avoid unnecessary iteration in _version_from_file.

However, in investigating #469, I decided that there shouldn't be many distinct methods of parsing the metadata, which is why _reload_version now re-uses self._get_metadata. If it turns out there's value in parsing only the head of metadata, we can address that in the implementation for loading the metadata.

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


Although looking at the code, the opening of the file is a bit far from where it's used now, so I'm not actually sure how I'd get it back to using for line in fid i.e. without reading the entire file, so it might be better if Jason has a look.

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


@reinout reading the contents of the file is actually not a fallback, it's a first choice, and using the filename is now the fallback. It should be more robust to dev version numbering which uses +. But a try/except seems like it should be okay.

I'll wait to hear back from Jason about if a try/except or some conditional introspection with os or os.path would be the way to go, and if for line in fid is a worthwhile revert to make given the speed consideration (which I hadn't thought of until you brought it up). Different repos have different ideas for how to do these things, and I don't know if setuptools prefers try/except or conditionals more.

FWIW the traceback actually points to what seems like the correct offending line:

https://bitbucket.org/pypa/setuptools/src/a3e72c93835f0fe392732dd3a26f2071d8fcfe82/pkg_resources/__init__.py?at=default&fileviewer=file-view-default#__init__.py-2024

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


I'm not sure about the performance problems, but IIRC only the distutils-created .egg-info files need to be checked. Hopefully there won't be too many of them. My guess is it would need to number in the thousands to matter, since e.g. my numpy-1.11.0.dev0_99583aa.egg-info file is only 2 kB, and the line it actually needs is on the second line.

Jason, I think that's one advantage of the approach that I originally had with for line in fid: instead of the final functionalized-version using filter: filter will construct a list (at least on Python 2.*) by going through all lines in the file instead of just the first two.

ghost commented 8 years ago

Original comment by reinout (Bitbucket: reinout, GitHub: reinout):


Eric, could you add the fix? You know exactly where to look in the code, so... :-)

ghost commented 8 years ago

Original comment by reinout (Bitbucket: reinout, GitHub: reinout):


If I understand correctly, opening and reading the file is intended as a fall-back mechanism, right?

If so, a try... except IOError should not hurt. Perhaps even more liberal than IOError, but I'm always a bit anxious about being too free with exceptions :-)

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


Yeah it looks like the fix causes this problem, because it added looking at the file for more information. I guess the offending line(s) could be put in a try/except, or some smarter checking of whether or not the file can actually be opened could be done instead.

ghost commented 8 years ago

Original comment by reinout (Bitbucket: reinout, GitHub: reinout):


If I read it correctly, this feature was added between 18.7 and 18.7.1, right? I'm seeing an error (on a colleague's laptop) related to numpy and it smells like the 18.7.1 fix could be the cause.

It is probably a corner case. And it probably doesn't even have anything to do with numpy specifically, I suspect that is only a funny coincidence :-)

It occurs when I run buildout's bootstrap script (the very latest), which builds a setuptools egg in the tmp directory. The corner case is that there is a /usr/lib/python2.7/dist-packages/numpy-1.6.1.egg-info symlink to ../...../pyshared/numpy-1.6.1.egg-info that is broken. That is, the pyshared/ version is gone. What I then get with 18.7.1:

vagrant@vagrant-base-precise-amd64:/vagrant$ python buildout/bootstrap.py --setuptools-version=18.7.1
Downloading https://pypi.python.org/packages/source/s/setuptools/setuptools-18.7.1.zip
Extracting in /tmp/tmp2uJfKI
Now working in /tmp/tmp2uJfKI/setuptools-18.7.1
Building a Setuptools egg in /tmp/bootstrap-MpCmD8
Traceback (most recent call last):
  File "setup.py", line 21, in <module>
    exec(init_file.read(), command_ns)
  File "<string>", line 11, in <module>
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/setuptools/__init__.py", line 12, in <module>
    from setuptools.extension import Extension
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/setuptools/extension.py", line 8, in <module>
    from .dist import _get_unpatched
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/setuptools/dist.py", line 19, in <module>
    import pkg_resources
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 3138, in <module>
    @_call_aside
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 3124, in _call_aside
    f(*args, **kwargs)
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 3151, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 652, in _build_master
    ws = cls()
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 645, in __init__
    self.add_entry(entry)
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 701, in add_entry
    for dist in find_distributions(entry, True):
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2139, in find_on_path
    path_item, entry, metadata, precedence=DEVELOP_DIST
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2521, in from_location
    py_version=py_version, platform=platform, **kw
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2835, in _reload_version
    md_version = _version_from_file(self._get_metadata(self.PKG_INFO))
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2486, in _version_from_file
    line = next(iter(version_lines), '')
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2654, in _get_metadata
    for line in self.get_metadata_lines(name):
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2030, in get_metadata_lines
    return yield_lines(self.get_metadata(name))
  File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2024, in get_metadata
    with io.open(self.path, 'rU', encoding='utf-8') as f:
IOError: [Errno 2] No such file or directory: '/usr/lib/python2.7/dist-packages/numpy-1.6.1.egg-info'
/tmp/bootstrap-MpCmD8/setuptools-18.7.1-py2.7.egg
Traceback (most recent call last):
  File "buildout/bootstrap.py", line 117, in <module>
    ez['use_setuptools'](**setup_args)
  File "<string>", line 170, in use_setuptools
  File "<string>", line 122, in _do_download
  File "<string>", line 69, in _build_egg
IOError: Could not build the egg.

It works just fine with 18.5 and 18.7:

vagrant@vagrant-base-precise-amd64:/vagrant$ python buildout/bootstrap.py --setuptools-version=18.5
Downloading https://pypi.python.org/packages/source/s/setuptools/setuptools-18.5.zip
Extracting in /tmp/tmp3UfjqI
Now working in /tmp/tmp3UfjqI/setuptools-18.5
Building a Setuptools egg in /tmp/bootstrap-HL_EHS
/tmp/bootstrap-HL_EHS/setuptools-18.5-py2.7.egg
Setting socket time out to 1 seconds.
Creating directory '/vagrant/develop-eggs'.
Generated script '/vagrant/bin/buildout'.
vagrant@vagrant-base-precise-amd64:/vagrant$ python buildout/bootstrap.py --setuptools-version=18.7
Downloading https://pypi.python.org/packages/source/s/setuptools/setuptools-18.7.zip
Extracting in /tmp/tmpH_PWEX
Now working in /tmp/tmpH_PWEX/setuptools-18.7
Building a Setuptools egg in /tmp/bootstrap-aNmdvG
/tmp/bootstrap-aNmdvG/setuptools-18.7-py2.7.egg
Setting socket time out to 1 seconds.
Creating directory '/vagrant/develop-eggs'.
Generated script '/vagrant/bin/buildout'.

Symlinks that are broken are tricky in python. They exist, but cannot be opened. I've had problems with them, too.

Two questions:

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


Cool, makes sense to me! Thanks for working with me on this issue, too. Now I just need to work on convincing the numpy/scipy folks to go with setuptools full time :)

ghost commented 8 years ago

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Well, I prefer that style personally. It's not a pypa preference, AFAIK. The reason I prefer it is because it's more functional in style and more robust from an analysis point of view. I strive to keep code as flat as possible, as each indented block creates a logical branch, each branch which needs to be analyzed/tested. For example, it wasn't clear to me originally that a return of None when no Version was found was important. Using a functional style means the return value is clearer.

I do see, though, how the approach might strike some as convoluted, but in my experience, formulating code in that functional programming way is more robust, and worth recognizing as a useful pattern.

Ideally, I'd probably re-write it again to (a) have key/value pairs returned by the iterator, (b) filter on the key, and (c) construct a dictionary whose .get result is the desired value.

Good feedback, though.

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


Yep, works over here.

A point of curiosity from my end -- IMO the subsequent changes to _version_from_file made it a bit less easily understood than the iteration over lines with if comparison. Was that a stylistic preference (e.g., fitting better with pypa style to use filter/iter/next), or did it fix some bug with the code? I noticed it took a few commits to get that version working properly, so I'm curious if the original version had some missing functionality or disadvantages.

ghost commented 8 years ago

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


@Eric89GXL Can you please test with the latest code and confirm it behaves as you expect?

ghost commented 8 years ago

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Update changelog. Fixes #419.

ghost commented 8 years ago

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Add tests ref #419.

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


Done, I think -- https://bitbucket.org/pypa/setuptools/pull-requests/144/fix-for-issue-419-read-egg-info-files-to/diff

ghost commented 8 years ago

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Sounds great. I look forward to it.

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


I have a limited knowledge of setuptools and Python packaging, but I'd expect the content of the file to be more reliable than what the file itself is named, for various reasons (e.g., OSs put restrictions on filenames that will not exist on the contents of files). So to me it seems like a more principled solution to read the file contents than rely on the filename itself.

I could open a PR to open and read the file, perhaps with a try/except to fall back to using the filename itself in cases of error. Does that sound good?

ghost commented 8 years ago

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


I don't understand the system well enough to say what the implications might be to reading out the file. I'd be open to someone exploring that technique. If the implementation is straightforward and works well enough in a few isolated environments, I'd be willing to push it out to the larger community (as a beta) for comment.

ghost commented 8 years ago

Original comment by dstufft (Bitbucket: dstufft, GitHub: dstufft):


Getting the version number from the file is something @jaraco would need to comment on :D

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


Wouldn't it be more robust to parse the file itself for the version information? Or is that expected to do some bad things / be unreliable / be slow?

In any case, I've opened an issue for numpy to discuss using setuptools:

https://github.com/numpy/numpy/issues/6257

It looks like back in 2013 it was tried and reverted, due to setuptools changing the test script permissions:

https://github.com/scipy/scipy/commit/a4e93fbd1c1e192b699e1ff545236dab7cffc579#diff-2eeaed663bd0d25b7e608891384b7298L185

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


Or perhaps setuptools could actually read the version information from within the file instead of just using the filename...?

ghost commented 8 years ago

Original comment by dstufft (Bitbucket: dstufft, GitHub: dstufft):


We considered doing that in the PEP, however the problem was a version like 1.0-1 is valid as an alternative spelling of 1.0.post1 (1.0-1 was supported in setuptools previously and had use on PyPI). When you create a wheel the - character is munged to _ because wheel uses - to separate different parts of the filename so that munging makes a - in the filename unambiguously the separator in wheel, but it means that we need to consider _ equivalent to -. So 1.0-1 and 1.0_1 are both already in use to mean something in particular. We toyed with the idea of making it so that 1.0-<anything that's not an integer> (and thus 1.0_<anything that's not an integer>) as an alias for 1.0+<anything that's not an integer>, however we determined that it was far too liberal and too magical, making strange things that are obviously not versions into "valid" versions.

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


Turns out numpy and scipy use distutils to do some of their building, including the part where the filename is constructed. I'll bother them about this issue, hopefully they'll have some solution at their end. But given the lengths they've gone to in order to use distutils instead of setuptools in their setup script, I'm not too optimistic. Is changing the parsing of _ in filenames in setuptools off the table?

ghost commented 8 years ago

Original comment by dstufft (Bitbucket: dstufft, GitHub: dstufft):


It works when installed too: https://caremad.io/s/SsluEIrRRO/

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


I figured that the filename was being changed to be safe... I'll try to track down who is actually changing the filename, but I suspect it is their desired (and not unreasonable) behavior.

I'm not sure I quite follow your example, though. The problem I'm having here is that the egg-info filename is being parsed for the version, the version in that filename has replaced + with _, and then the _ is being changed to - instead of +. In your example, your filename doesn't have a + in it. Your egg info does have the + in the version within the file, but then again so does mine:

larsoner@bunk:~/python/scitran$ cat ~/.local/lib/python2.7/site-packages/numpy-1.11.0.dev0_2329eae.egg-info 
Metadata-Version: 1.1
Name: numpy
Version: 1.11.0.dev0+2329eae
...

But that is a bit immaterial I think, since that file isn't actually parsed in pkg_resources (just the filename is), at least in the debugging I did. Did you expect that the version actually in that file would come into play?

FWIW I'm currently running 18.2 (most recent release) of setuptools, but it's possible somebody is using an older version or something.

ghost commented 8 years ago

Original comment by dstufft (Bitbucket: dstufft, GitHub: dstufft):


Yes (see https://caremad.io/s/IaPtgPLQVE/). I think 8.0 might have been when PEP 440 was implemented in setuptools (and thus when + started to be supported in versions) but my memory might be off.

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


You mean older versions of setuptools replace the + with a _ in the filename...?

ghost commented 8 years ago

Original comment by dstufft (Bitbucket: dstufft, GitHub: dstufft):


It sounds like somewhere along the lines someone is using an older version of setuptools. Only older versions mangle the + in that way:

>>> import pkg_resources._vendor.packaging as packaging
>>> packaging.version.parse('1.11.0.dev0+2329eae')
<Version('1.11.0.dev0+2329eae')>
>>> import pkg_resources
>>> pkg_resources.safe_version('1.11.0.dev0+2329eae')
'1.11.0.dev0+2329eae'
ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


Or alternatively, Version could be improved to handle the _ being in place of + in the filename somehow.

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


So I looked, and numpy correctly sets its numpy.__version__ using +. The problem is that the numpy .egg_info is written out to be safe as 1.11.0.dev0_2329eae, i.e. the + has been replaced by _. Eventually (down a bit of a rabbit hole) in the pkg_resources module, the version extracted from this filename gets put through safe_version, which replaces the _ with a -, thus incorrectly reconstructing the module name. Is there a better/safer way to parse the version number, or could safe_version be improved to deal with this case somehow?

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


...maybe not NumPy's problem

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


Fixing in numpy

ghost commented 8 years ago

Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):


Great, I'll work with the numpy folks to update it. Thanks!

ghost commented 8 years ago

Original comment by dstufft (Bitbucket: dstufft, GitHub: dstufft):


Use a + instead of a -.

>>> import pkg_resources._vendor.packaging as packaging
>>> packaging.version.parse('1.11.0.dev0+2329eae')
<Version('1.11.0.dev0+2329eae')>
ghost commented 8 years ago

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


The issue is that with PEP 440, your local version is parsed as a LegacyVersion:

>>> import pkg_resources.packaging as packaging
>>> packaging.version.parse('1.11.0.dev0-2329eae')
<LegacyVersion('1.11.0.dev0-2329eae')>

And all LegacyVersions compare less than a proper Version of 0.

It's actually the post release tag with the dash in it that's forcing the LegacyVersion. If you can omit that tag, then the version will parse as a proper Version and the requirement will be satisfied.

@dstufft: Do you have any recommendations for this issue?