ofek / pypinfo

Easily view PyPI download statistics via Google's BigQuery.
MIT License

Support for "last updated" dates by package #161

Open jaraco opened 2 months ago

jaraco commented 2 months ago

I know this project focuses on "downloads", but I wonder if the BigQuery dataset also has info on when a package was last uploaded. I'd like to be able to query for all packages that were updated in a given time horizon in order to efficiently refresh a database mapping import names to packages.

Is that in the scope of this project? Is there a good way to explore the data schema to see what data might already be available?

Thanks for any advice.

hugovk commented 2 months ago

You can get this directly from the PyPI JSON API, without setting up BigQuery or using any of its quota:

https://pypi.org/pypi/norwegianblue/json

For example:

>>> import requests
>>> package = "norwegianblue"
>>> url = f"https://pypi.org/pypi/{package}/json"
>>> response = requests.get(url, timeout=10)
>>> response.raise_for_status()
>>> data = response.json()
>>> for file_info in data["urls"]:  # one entry per file in the latest release
...     print(file_info["upload_time"])
...
2024-02-14T21:34:37
2024-02-14T21:34:39
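
To apply this to the "updated in a given time horizon" question, one option is to reduce each package's JSON document to a single "last upload" timestamp and filter on it. The following is only a rough sketch under a few assumptions: the helper names (`fetch_package_json`, `latest_upload_time`, `updated_since`) are made up for illustration, the naive `upload_time` strings are treated as UTC, and you already have a list of candidate package names to check, since this endpoint is queried one package at a time.

```python
import json
from datetime import datetime, timezone
from urllib.request import urlopen


def fetch_package_json(package):
    """Fetch the PyPI JSON API document for one package (needs network)."""
    with urlopen(f"https://pypi.org/pypi/{package}/json", timeout=10) as resp:
        return json.load(resp)


def latest_upload_time(data):
    """Return the newest upload_time among the latest release's files, or None."""
    times = [
        # Assumption: the naive timestamps in "upload_time" are UTC.
        datetime.fromisoformat(f["upload_time"]).replace(tzinfo=timezone.utc)
        for f in data.get("urls", [])
    ]
    return max(times, default=None)


def updated_since(payloads, cutoff):
    """Names of packages whose latest upload is at or after cutoff.

    payloads maps package name -> already-fetched JSON document.
    """
    result = []
    for name, data in payloads.items():
        latest = latest_upload_time(data)
        if latest is not None and latest >= cutoff:
            result.append(name)
    return result
```

Usage would look roughly like `updated_since({p: fetch_package_json(p) for p in names}, cutoff)`, where `names` is your own list of packages and `cutoff` is a timezone-aware `datetime`. Keeping the fetch separate from the filtering means the filtering can be tried on saved JSON without hitting the network.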