meshy / pythonwheels

Adoption analysis of Python Wheels: https://pythonwheels.com/
BSD 2-Clause "Simplified" License

PyPI API removal of top_packages #94

Closed (hugovk closed 6 years ago)

hugovk commented 7 years ago

Download counts are being removed from PyPI; BigQuery needs to be used instead.

See https://github.com/pypa/warehouse/pull/2480 which removes the top_packages from the API.

Right now, PyPI is running from https://github.com/pypa/pypi-legacy but will be switching to https://github.com/pypa/warehouse soon. (Their milestones show they're 95% complete to launch, and 38% complete to shut down legacy PyPI.)

See https://github.com/badges/shields/issues/716 and https://github.com/zhmcclient/python-zhmcclient/pull/73 for some more info on BigQuery.


Perhaps in the short term Python Wheels could use a hardcoded list of the top 360 packages.

meshy commented 7 years ago

@hugovk that's valuable information, thank you very much for sharing this.

I'm not keen on hard-coding the top packages, as long as the current system works.

While I don't relish the idea of making this change, BigQuery does allow us to do some interesting things.

For example, looking at this query to fetch the most downloaded packages, it looks as though we may be able to set a shorter date range (say, a year), so that legacy projects drop down the rankings faster. That's probably worth tackling at another point, though. We can stick with totals for the moment.
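
To make the date-range idea concrete, here is a rough sketch of how such a query might be built for a configurable window. The table name `bigquery-public-data.pypi.file_downloads`, the `file.project` and `timestamp` columns, and the helper `build_top_packages_query` are assumptions for illustration; none of them come from this thread, so check them against the dataset you actually use.

```python
from datetime import date, timedelta

# Assumption: public PyPI downloads table; not named anywhere in this thread.
TABLE = "bigquery-public-data.pypi.file_downloads"


def build_top_packages_query(days=365, limit=360, today=None):
    """Return SQL for the most-downloaded projects over the last `days` days.

    Hypothetical helper: it only builds the query string and does not
    contact BigQuery.
    """
    if days <= 0 or limit <= 0:
        raise ValueError("days and limit must be positive")
    today = today or date.today()
    start = today - timedelta(days=days)
    return (
        "SELECT file.project AS project, COUNT(*) AS download_count\n"
        f"FROM `{TABLE}`\n"
        # Half-open range: from `start` up to, but not including, `today`.
        f"WHERE DATE(timestamp) >= '{start.isoformat()}'\n"
        f"  AND DATE(timestamp) < '{today.isoformat()}'\n"
        "GROUP BY project\n"
        "ORDER BY download_count DESC\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(build_top_packages_query(days=365, limit=360))
```

Keeping the window as a parameter would let the site start with a long range (close to totals) and shorten it later without changing anything else.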

hugovk commented 7 years ago

A shorter time range sounds good.

Attached is the output of running pypinfo --json --days 365 --limit 360 "" project > 365.json, using the useful pypinfo tool: https://github.com/ofek/pypinfo.

365.json.zip
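
A file like the one attached could be turned into an ordered package list roughly as follows. This is a sketch under an assumption: that the JSON has a top-level "rows" list whose entries carry "project" and "download_count" keys, mirroring the column names pypinfo prints. The function name `top_projects` is hypothetical.

```python
import json


def top_projects(json_text, limit=360):
    """Return project names ordered by download count, highest first.

    Assumes a {"rows": [{"project": ..., "download_count": ...}, ...]}
    layout; adjust the keys if the real file differs.
    """
    data = json.loads(json_text)
    rows = sorted(data["rows"], key=lambda r: r["download_count"], reverse=True)
    return [row["project"] for row in rows[:limit]]


if __name__ == "__main__":
    # Two rows taken from the figures quoted in this thread.
    sample = json.dumps({
        "rows": [
            {"project": "six", "download_count": 214930152},
            {"project": "simplejson", "download_count": 327946463},
        ]
    })
    print(top_projects(sample, limit=2))  # ['simplejson', 'six']
```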

Here's how the top 10 looks:

$ pypinfo -th --days 365 --limit 10 "" project
project         download_count
--------------- --------------
simplejson      327,946,463
six             214,930,152
python-dateutil 152,089,489
setuptools      149,294,971
botocore        146,935,887
pip             140,216,305
requests        137,229,399
pyasn1          134,867,638
docutils        126,916,467
jmespath        117,212,884

meshy commented 7 years ago

pypinfo looks fantastic -- good find, thanks.