rocketDuck / folivora

Make package version hunting easy -- Waiting for a revival, time should be cheaper ;)

Be lenient with package names #7

Closed jaap3 closed 12 years ago

jaap3 commented 12 years ago

Currently, if you include a package like django-compressor it cannot be found, because the actual name is django_compressor.

However, setuptools (easy_install) can still find the package even if it's written with a hyphen instead of an underscore. PyPI also fixes the typo by just redirecting http://pypi.python.org/simple/django-compressor/ to http://pypi.python.org/simple/django_compressor/

apollo13 commented 12 years ago

Sounds sensible, although we probably should display a warning and tell users to use the actual package names ("Django" vs "django" is also a great example). Will have to check out the redirect rules of pypi I think.

jaap3 commented 12 years ago

Yes, I was trying to find some documentation about how PyPI handles these things but haven't found anything yet.

jaap3 commented 12 years ago

The pkgtools lib has a real_name method: http://pkgtools.readthedocs.org/en/latest/pypi.html#pkgtools.pypi.real_name

Its implementation is rather funny. It does a request to PyPI, following any redirects, and extracts the real package name from the final URL:

r = urllib2.Request('http://pypi.python.org/simple/{0}'.format(package_name))
return urllib2.urlopen(r, timeout=timeout).geturl().split('/')[-2]

https://github.com/rubik/pkgtools/blob/4212eb27a5ba77b37c43e06ca2db3540acbdfa0f/pkgtools/pypi.py#L28
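For reference, a minimal Python 3 sketch of the same redirect-following idea. This is not pkgtools' code: `name_from_simple_url` and `SIMPLE_INDEX` are hypothetical names introduced here, the URL is the one quoted in this thread, and the result depends entirely on how the index redirects at the time of the request.

```python
from urllib.parse import urlsplit
from urllib.request import Request, urlopen

# URL form quoted in this thread (assumption: the index still answers here).
SIMPLE_INDEX = "http://pypi.python.org/simple/{0}"


def name_from_simple_url(url):
    """Return the last non-empty path segment of a simple-index URL."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return segments[-1]


def real_name(package_name, timeout=10):
    """Request the package page and read the name off the final URL.

    Needs network access; redirects are followed by urlopen itself.
    """
    request = Request(SIMPLE_INDEX.format(package_name))
    with urlopen(request, timeout=timeout) as response:
        return name_from_simple_url(response.geturl())
```

Taking the last non-empty path segment, rather than `split('/')[-2]`, gives the same answer whether or not the final URL ends in a slash.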

apollo13 commented 12 years ago

@EnTeQuAk is working on normalizing the names, I'd rather not query pypi all the time ;)

EnTeQuAk commented 12 years ago

@jaap3 I am working on that, but just to clarify pkgtools.real_name basically just does the same as pkg_resources.safe_name(...).lower() so I prefer the offline version :-D

jaap3 commented 12 years ago

@EnTeQuAk That's not entirely true. For the hairy cases safe_name even does the wrong thing:

>>> pkg_resources.safe_name('django_compressor')
'django-compressor'  # wrong!
>>> pkg_resources.safe_name('django-compressor')
'django-compressor'  # wrong!

>>> pkgtools.pypi.real_name('django-compressor')
'django_compressor'  # correct!
>>> pkgtools.pypi.real_name('django_compressor')
'django_compressor'  # correct!

Calling .lower() on the safe_name seems counterproductive to me, as package names are case sensitive:

>>> xmlrpclib.Server('http://pypi.python.org/pypi/').package_releases('django')
[]
>>> xmlrpclib.Server('http://pypi.python.org/pypi/').package_releases('Django')
['1.4.1', '1.4', '1.3.3', '1.3.2', '1.3.1', '1.3', '1.2.7', '1.2.6', '1.2.5', '1.2.4', '1.2.3', '1.2.2', '1.2.1', '1.2', '1.1.4', '1.1.3', '1.1.2', '1.0.4']

apollo13 commented 12 years ago

@jaap3: We will be storing both forms: the correct one as retrieved by list_packages(), and a normalized one. For lookups from requirements.txt we will normalize to safe_name.lower() and compare against that.

This way we have the correct name for display and APIs and a normalized version which we use internally.
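A rough sketch of that two-form idea, not the actual folivora code. `normalize` is assumed to mirror `pkg_resources.safe_name(name).lower()` using only the standard library, and `PackageIndex` is a hypothetical in-memory stand-in for the database table.

```python
import re


def normalize(name):
    """Replace runs of characters other than letters, digits and '.'
    with '-', then lowercase.

    Assumption: this mirrors pkg_resources.safe_name(name).lower().
    """
    return re.sub(r"[^A-Za-z0-9.]+", "-", name).lower()


class PackageIndex:
    """Hypothetical in-memory stand-in for the package table."""

    def __init__(self, real_names):
        # normalized key -> name exactly as the index reports it
        self._by_key = {normalize(n): n for n in real_names}

    def lookup(self, requirement_name):
        """Return the stored real name, or None if nothing matches."""
        return self._by_key.get(normalize(requirement_name))


index = PackageIndex(["Django", "django_compressor"])
print(index.lookup("django"))             # Django
print(index.lookup("django-compressor"))  # django_compressor
print(index.lookup("no-such-package"))    # None
```

The normalized key is only ever used for matching; anything shown to the user or sent to PyPI comes from the stored real name.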

EnTeQuAk commented 12 years ago

@jaap3 yea, we are using safe_name only for the database matching, to be lenient with broken requirements.txt files. We always store the original name for the PyPI communication.

Still, even though package names are case sensitive, take buildout recipes as an example. They even require a package name to be a valid ConfigParser key so we need the normalization to support multiple platforms.
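To illustrate the ConfigParser point with the standard library only: Python's `configparser` lowercases option names by default, so a key written as `Django` comes back as `django`. This is a sketch of the stdlib behaviour, not of buildout itself, whose own config handling may differ; the `[versions]` section name is just an example.

```python
import configparser

parser = configparser.ConfigParser()
# Example pin written with the case-sensitive PyPI name.
parser.read_string("[versions]\nDjango = 1.4.1\n")

# Option names pass through optionxform, which lowercases by default.
print(list(parser["versions"]))      # ['django']
print(parser["versions"]["DJANGO"])  # 1.4.1
```

Any tool that reads names through such a parser has already lost the original case, which is why matching on a normalized form is the safer choice.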

jaap3 commented 12 years ago

That seems sane. I was just thinking of an alternative solution, but yours is more elegant :)

EnTeQuAk commented 12 years ago

Please review: https://github.com/rocketDuck/folivora/pull/11

EnTeQuAk commented 12 years ago

merged to master