Closed ghost closed 8 years ago
Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):
The more I think about the situation that @reinout has encountered, the more I think setuptools is doing the right thing here, because there is a distutils package there that is presumed to have a .egg-info directory, but that directory is improperly configured. In that case, raising an exception seems like the right thing to do.
I do recognize that this new error is a regression from the 18.7 and 18.7.1 behavior, which did have an explicit check for 'isfile'. I've updated the FileMetadata class to perform this check in has_metadata, which seems right to me.
@reinout, can you test against the latest commit and confirm that this fix addresses the issue?
Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):
@Eric89GXL
First, I did update 'filter' to use ifilter on Python 2 to avoid unnecessary iteration in _version_from_file
.
However, in investigating #469, I decided that there shouldn't be many distinct methods of parsing the metadata, which is why _reload_version
now re-uses self._get_metadata
. If it turns out there's value in parsing only the head of metadata, we can address that in the implementation for loading the metadata.
Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):
Although looking at the code, the opening of the file is a bit far from where it's used now, so I'm not actually sure how I'd get it back to using for line in fid
i.e. without reading the entire file, so it might be better if Jason has a look.
Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):
@reinout reading the contents of the file is actually not a fallback, it's a first choice, and using the filename is now the fallback. It should be more robust to dev
version numbering which uses +
. But a try/except
seems like it should be okay.
I'll wait to hear back from Jason about if a try/except
or some conditional introspection with os
or os.path
would be the way to go, and if for line in fid
is a worthwhile revert to make given the speed consideration (which I hadn't thought of until you brought it up). Different repos have different ideas for how to do these things, and I don't know if setuptools
prefers try/except
or conditionals more.
FWIW the traceback actually points to what seems like the correct offending line:
Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):
I'm not sure about the performance problems, but IIRC only the distutils
-created .egg-info
files need to be checked. Hopefully there won't be too many of them. My guess is it would need to number in the thousands to matter, since e.g. my numpy-1.11.0.dev0_99583aa.egg-info
file is only 2 kB, and the line it actually needs is on the second line.
Jason, I think that's one advantage of the approach that I originally had with for line in fid:
instead of the final functionalized-version using filter
: filter
will construct a list (at least on Python 2.*) by going through all lines in the file instead of just the first two.
Original comment by reinout (Bitbucket: reinout, GitHub: reinout):
If I understand correctly, opening and reading the file is intended as a fall-back mechanism, right?
If so, a try... except IOError
should not hurt. Perhaps even more liberal than IOError
, but I'm always a bit anxious about being too free with exceptions :-)
Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):
Yeah it looks like the fix causes this problem, because it added looking at the file for more information. I guess the offending line(s) could be put in a try/except
, or some smarter checking of whether or not the file can actually be opened could be done instead.
Original comment by reinout (Bitbucket: reinout, GitHub: reinout):
If I read it correctly, this feature was added between 18.7 and 18.7.1, right? I'm seeing an error (on a colleague's laptop) related to numpy and it smells like the 18.7.1 fix could be the cause.
It is probably a corner case. And it probably doesn't even have anything to do with numpy specifically, I suspect that is only a funny coincidence :-)
It occurs when I run buildout's bootstrap script (the very latest), which builds a setuptools egg in the tmp directory. The corner case is that there is a /usr/lib/python2.7/dist-packages/numpy-1.6.1.egg-info
symlink to ../...../pyshared/numpy-1.6.1.egg-info
that is broken. That is, the pyshared/
version is gone. What I then get with 18.7.1:
vagrant@vagrant-base-precise-amd64:/vagrant$ python buildout/bootstrap.py --setuptools-version=18.7.1
Downloading https://pypi.python.org/packages/source/s/setuptools/setuptools-18.7.1.zip
Extracting in /tmp/tmp2uJfKI
Now working in /tmp/tmp2uJfKI/setuptools-18.7.1
Building a Setuptools egg in /tmp/bootstrap-MpCmD8
Traceback (most recent call last):
File "setup.py", line 21, in <module>
exec(init_file.read(), command_ns)
File "<string>", line 11, in <module>
File "/tmp/tmp2uJfKI/setuptools-18.7.1/setuptools/__init__.py", line 12, in <module>
from setuptools.extension import Extension
File "/tmp/tmp2uJfKI/setuptools-18.7.1/setuptools/extension.py", line 8, in <module>
from .dist import _get_unpatched
File "/tmp/tmp2uJfKI/setuptools-18.7.1/setuptools/dist.py", line 19, in <module>
import pkg_resources
File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 3138, in <module>
@_call_aside
File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 3124, in _call_aside
f(*args, **kwargs)
File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 3151, in _initialize_master_working_set
working_set = WorkingSet._build_master()
File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 652, in _build_master
ws = cls()
File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 645, in __init__
self.add_entry(entry)
File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 701, in add_entry
for dist in find_distributions(entry, True):
File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2139, in find_on_path
path_item, entry, metadata, precedence=DEVELOP_DIST
File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2521, in from_location
py_version=py_version, platform=platform, **kw
File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2835, in _reload_version
md_version = _version_from_file(self._get_metadata(self.PKG_INFO))
File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2486, in _version_from_file
line = next(iter(version_lines), '')
File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2654, in _get_metadata
for line in self.get_metadata_lines(name):
File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2030, in get_metadata_lines
return yield_lines(self.get_metadata(name))
File "/tmp/tmp2uJfKI/setuptools-18.7.1/pkg_resources/__init__.py", line 2024, in get_metadata
with io.open(self.path, 'rU', encoding='utf-8') as f:
IOError: [Errno 2] No such file or directory: '/usr/lib/python2.7/dist-packages/numpy-1.6.1.egg-info'
/tmp/bootstrap-MpCmD8/setuptools-18.7.1-py2.7.egg
Traceback (most recent call last):
File "buildout/bootstrap.py", line 117, in <module>
ez['use_setuptools'](**setup_args)
File "<string>", line 170, in use_setuptools
File "<string>", line 122, in _do_download
File "<string>", line 69, in _build_egg
IOError: Could not build the egg.
It works just fine with 18.5 and 18.7:
vagrant@vagrant-base-precise-amd64:/vagrant$ python buildout/bootstrap.py --setuptools-version=18.5
Downloading https://pypi.python.org/packages/source/s/setuptools/setuptools-18.5.zip
Extracting in /tmp/tmp3UfjqI
Now working in /tmp/tmp3UfjqI/setuptools-18.5
Building a Setuptools egg in /tmp/bootstrap-HL_EHS
/tmp/bootstrap-HL_EHS/setuptools-18.5-py2.7.egg
Setting socket time out to 1 seconds.
Creating directory '/vagrant/develop-eggs'.
Generated script '/vagrant/bin/buildout'.
vagrant@vagrant-base-precise-amd64:/vagrant$ python buildout/bootstrap.py --setuptools-version=18.7
Downloading https://pypi.python.org/packages/source/s/setuptools/setuptools-18.7.zip
Extracting in /tmp/tmpH_PWEX
Now working in /tmp/tmpH_PWEX/setuptools-18.7
Building a Setuptools egg in /tmp/bootstrap-aNmdvG
/tmp/bootstrap-aNmdvG/setuptools-18.7-py2.7.egg
Setting socket time out to 1 seconds.
Creating directory '/vagrant/develop-eggs'.
Generated script '/vagrant/bin/buildout'.
Symlinks that are broken are tricky in python. They exist, but cannot be opened. I've had problems with them, too.
Two questions:
sys.path
being scanned? So: could this fix lead to performance problems? Might not be the case, but it probably can't hurt to take a quick look.Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):
Well, I prefer that style personally. It's not a pypa preference, AFAIK. The reason I prefer it is because it's more functional in style and more robust from an analysis point of view. I strive to keep code as flat as possible, as each indented block creates a logical branch, each branch which needs to be analyzed/tested. For example, it wasn't clear to me originally that a return of None when no Version was found was important. Using a functional style means the return value is clearer.
I do see, though, how the approach might strike some as convoluted, but in my experience, formulating code in that functional programming way is more robust, and worth recognizing as a useful pattern.
Ideally, I'd probably re-write it again to (a) have key/value pairs returned by the iterator, (b) filter on the key, and (c) construct a dictionary whose .get result is the desired value.
Good feedback, though.
Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):
Yep, works over here.
A point of curiosity from my end -- IMO the subsequent changes to _version_from_file
made it a bit less easily understood than the iteration over lines with if
comparison. Was that a stylistic preference (e.g., fitting better with pypa
style to use filter
/iter
/next
), or did it fix some bug with the code? I noticed it took a few commits to get that version working properly, so I'm curious if the original version had some missing functionality or disadvantages.
Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):
Done, I think -- https://bitbucket.org/pypa/setuptools/pull-requests/144/fix-for-issue-419-read-egg-info-files-to/diff
Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):
I have a limited knowledge of setuptools
and Python packaging, but I'd expect the content of the file to be more reliable than what the file itself is named, for various reasons (e.g., OSs put restrictions on filenames that will not exist on the contents of files). So to me it seems like a more principled solution to read the file contents than rely on the filename itself.
I could open a PR to open and read the file, perhaps with a try/except
to fall back to using the filename itself in cases of error. Does that sound good?
Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):
I don't understand the system well enough to say what the implications might be to reading out the file. I'd be open to someone exploring that technique. If the implementation is straightforward and works well enough in a few isolated environments, I'd be willing to push it out to the larger community (as a beta) for comment.
Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):
Wouldn't it be more robust to parse the file itself for the version information? Or is that expected to do some bad things / be unreliable / be slow?
In any case, I've opened an issue for numpy
to discuss using setuptools
:
https://github.com/numpy/numpy/issues/6257
It looks like back in 2013 it was tried and reverted, due to setuptools
changing the test script permissions:
Original comment by dstufft (Bitbucket: dstufft, GitHub: dstufft):
We considered doing that in the PEP, however the problem was a version like 1.0-1 is valid as an alternative spelling of 1.0.post1 (1.0-1 was supported in setuptools previously and had use on PyPI). When you create a wheel the -
character is munged to _
because wheel uses -
to separate different parts of the filename so that munging makes a -
in the filename unambiguously the separator in wheel, but it means that we need to consider _
equivalent to -
. So 1.0-1
and 1.0_1
are both already in use to mean something in particular. We toyed with the idea of making it so that 1.0-<anything that's not an integer>
(and thus 1.0_<anything that's not an integer>
) as an alias for 1.0+<anything that's not an integer>
, however we determined that it was far too liberal and too magical, making strange things that are obviously not versions into "valid" versions.
Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):
Turns out numpy
and scipy
use distutils
to do some of their building, including the part where the filename is constructed. I'll bother them about this issue, hopefully they'll have some solution at their end. But given the lengths they've gone to in order to use distutils
instead of setuptools
in their setup script, I'm not too optimistic. Is changing the parsing of _
in filenames in setuptools
off the table?
Original comment by dstufft (Bitbucket: dstufft, GitHub: dstufft):
It works when installed too: https://caremad.io/s/SsluEIrRRO/
Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):
I figured that the filename was being changed to be safe... I'll try to track down who is actually changing the filename, but I suspect it is their desired (and not unreasonable) behavior.
I'm not sure I quite follow your example, though. The problem I'm having here is that the egg-info
filename is being parsed for the version, the version in that filename has replaced +
with _
, and then the _
is being changed to -
instead of +
. In your example, your filename doesn't have a + in it. Your egg info does have the +
in the version within the file, but then again so does mine:
larsoner@bunk:~/python/scitran$ cat ~/.local/lib/python2.7/site-packages/numpy-1.11.0.dev0_2329eae.egg-info
Metadata-Version: 1.1
Name: numpy
Version: 1.11.0.dev0+2329eae
...
But that is a bit immaterial I think, since that file isn't actually parsed in pkg_resources
(just the filename is), at least in the debugging I did. Did you expect that the version actually in that file would come into play?
FWIW I'm currently running 18.2
(most recent release) of setuptools
, but it's possible somebody is using an older version or something.
Original comment by dstufft (Bitbucket: dstufft, GitHub: dstufft):
Yes (see https://caremad.io/s/IaPtgPLQVE/). I think 8.0 might have been when PEP 440 was implemented in setuptools (and thus when +
started to be supported in versions) but my memory might be off.
Original comment by dstufft (Bitbucket: dstufft, GitHub: dstufft):
It sounds like somewhere along the lines someone is using an older version of setuptools. Only older versions mangle the +
in that way:
>>> import pkg_resources._vendor.packaging as packaging
>>> packaging.version.parse('1.11.0.dev0+2329eae')
<Version('1.11.0.dev0+2329eae')>
>>> import pkg_resources
>>> pkg_resources.safe_version('1.11.0.dev0+2329eae')
'1.11.0.dev0+2329eae'
Original comment by Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL):
So I looked, and numpy
correctly sets its numpy.__version__
using +
. The problem is that the numpy
.egg_info
is written out to be safe as 1.11.0.dev0_2329eae
, i.e. the +
has been replaced by _
. Eventually (down a bit of a rabbit hole) in the pkg_resources
module, the version extracted from this filename gets put through safe_version
, which replaces the _
with a -
, thus incorrectly reconstructing the module name. Is there a better/safer way to parse the version number, or could safe_version
be improved to deal with this case somehow?
Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):
The issue is that with PEP 440, your local version is parsed as a LegacyVersion:
>>> import pkg_resources.packaging as packaging
>>> packaging.version.parse('1.11.0.dev0-2329eae')
<LegacyVersion('1.11.0.dev0-2329eae')>
And all LegacyVersions compare less than a proper Version of 0.
It's actually the post release tag with the dash in it that's forcing the LegacyVersion. If you can omit that tag, then the version will parse as a proper Version and the requirement will be satisfied.
@dstufft: Do you have any recommendations for this issue?
Originally reported by: Eric89GXL (Bitbucket: Eric89GXL, GitHub: Eric89GXL)
When installing a package:
In other words, my numpy 1.11.0.dev0 is not seen as satisfying a >= 1.6.1 version requirement. It seems like using
disutils.version.LooseVersion
or so should solve this. I'm happy to look into making a PR if this seems like the right solution.