sethmlarson / pypi-data

Data about packages and maintainers on PyPI
Apache License 2.0

Package URLs table: add URL names #16

hugovk closed 2 years ago

hugovk commented 2 years ago

Fixes #13. Fixes #14.

First, refactor the startup code into an if __name__ == "__main__": block so we can import main.py and easily unit test it instead of testing against the live API.
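
A minimal sketch of that pattern, for illustration only. The `run` function is a hypothetical stand-in for the real startup code in main.py, which is not shown in this thread:

```python
import sys


def run(argv):
    """Hypothetical stand-in for the startup work (opening the DB, calling the API, ...)."""
    return f"would process {len(argv)} argument(s)"


if __name__ == "__main__":
    # Executed for `python main.py`, but skipped on `import main`,
    # so a test module can import the functions without side effects.
    print(run(sys.argv[1:]))
```

With the guard in place, a test can `import main` and call individual functions directly without touching the network.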

Should we run the tests in GitHub Actions?

Move the fetching of project_urls into a new function: get_project_urls(resp["info"])

Similar to before, we initialise the list with:

    names_urls = [
        ("bugtrack_url", info.get("bugtrack_url")),
        ("docs_url", info.get("docs_url")),
        ("Downloads", info.get("download_url")),
        ("Homepage", info.get("home_page")),
        ("project_url", info.get("project_url")),
    ]

It's now a list of (name, URL) tuples instead of just URLs, because we're also going to save the name into the database.
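
For illustration, here is roughly how such a function could look. Only the initial list comes from this PR; merging in the `project_urls` mapping, skipping empty URLs, and dropping repeated pairs are assumptions about the rest of the function, not a copy of the actual code:

```python
def get_project_urls(info):
    """Sketch: return a list of (name, URL) tuples from a PyPI JSON "info" dict."""
    names_urls = [
        ("bugtrack_url", info.get("bugtrack_url")),
        ("docs_url", info.get("docs_url")),
        ("Downloads", info.get("download_url")),
        ("Homepage", info.get("home_page")),
        ("project_url", info.get("project_url")),
    ]
    # Assumption: "project_urls" maps names to URLs and may be missing or None.
    names_urls.extend((info.get("project_urls") or {}).items())

    # Assumption: drop empty URLs and repeated (name, URL) pairs, keeping order.
    seen = set()
    result = []
    for name, url in names_urls:
        if url and (name, url) not in seen:
            seen.add((name, url))
            result.append((name, url))
    return result


# Unit-test style usage with a hand-written dict instead of the live API.
example_info = {
    "home_page": "https://example.com",
    "download_url": "",
    "project_urls": {
        "Homepage": "https://example.com",
        "Source": "https://example.com/src",
    },
}
urls = get_project_urls(example_info)
```

Because the function takes a plain dict, it can be tested with fixtures like `example_info` rather than a real API response.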

Demo

$ sqlite3 'pypi.db' 'SELECT package_name, name, url FROM package_urls LIMIT 50;'
0-orchestrator|Homepage|https://github.com/g8os/grid
0-core-client|Homepage|https://github.com/zero-os/0-core
0-618|Homepage|https://github.com/qu6zhi/0.618
000|Homepage|https://github.com/username/wu.git
021|Homepage|http://www.headfirstlabs.com
0-0-1|Homepage|https://github.com/scotthuang1989/image2tfrecords
0805nexter|Homepage|http://www.hp.com
0121|Homepage|http://test.me
00000a|Homepage|http://test.me
0101|Homepage|https://github.com/username/wu.git
0html|Homepage|https://github.com/0oio/0html
0lever-so|Homepage|https://github.com/0lever/so
0rest|Homepage|https://github.com/0oio/0rest
0imap|Homepage|https://github.com/0oio/0imap
0proto|Homepage|https://github.com/example/0proto
0x-middlewares|Homepage|https://github.com/0xproject/0x-monorepo/tree/development/python-packages/middlewares
0x-contract-addresses|Homepage|https://github.com/0xproject/0x-monorepo/tree/development/python-packages/contract_addresses
0lever-utils|Homepage|https://github.com/0lever/utils
0x-json-schemas|Homepage|https://github.com/0xProject/0x-monorepo/tree/development/python-packages/json_schemas
0rss|Homepage|https://github.com/mindey/0rss
0x-order-utils|Homepage|https://github.com/0xProject/0x-monorepo/tree/development/python-packages/order_utils
0x-contract-artifacts|Homepage|https://github.com/0xproject/0x-monorepo/tree/development/python-packages/contract_artifacts
0x-contract-wrappers|Homepage|https://github.com/0xproject/0x-monorepo/tree/development/python-packages/contract_wrappers
0x01-autocert-dns-aliyun|Homepage|https://github.com/Smart-Hypercube/autocert
0x-python|Homepage|https://github.com/skeetzo/0x-python
0x-web3|Homepage|https://github.com/ethereum/web3.py
0x-sra-client|Homepage|https://github.com/0xproject/0x-monorepo/tree/development/python-packages/sra_client
0x01-cubic-sdk|Homepage|https://github.com/Smart-Hypercube/cubic-sdk
0xmpp|Homepage|https://github.com/0oio/0xmpp
0x01-letsencrypt|Homepage|https://github.com/Smart-Hypercube/autocert
0x10c-asm|Homepage|https://github.com/severb/0x10c-asm
1000pip-climber-system-free-download|Homepage|https://bfca1bsjtxsz0qao2rakno2q6w.hop.clickbank.net/?tid=pypi
1000pip-climber-system-free-download|Bug Tracker|https://github.com/issues
1000pip-climber-system-download|Homepage|https://156544mlxov28levev7grc9v9g.hop.clickbank.net/?tid=py
1000pip-climber-system-download|Bug Tracker|https://github.com/issues
1000pip-builder|Homepage|https://bf031bojwy-4fr5xcr6mowbkfg.hop.clickbank.net/?tid=p
1000pip-builder|Bug Tracker|https://github.com/issues
1000pip-builder-forex-signals|Homepage|https://1cc5e9nl-wm84kdoa-ckkk3w4q.hop.clickbank.net/?tid=py
1000pip-builder-forex-signals|Bug Tracker|https://github.com/issues
101703048-topsis|Downloads|https://github.com/AkritiSehgal/101703048_topsis/archive/v_2.0.1.tar.gz
101703048-topsis|Homepage|https://github.com/AkritiSehgal/101703048_topsis
101703087-outlier|Downloads|https://github.com/anukritigarg13/101703087_outliers/archive/v_1.0.0.tar.gz
101703087-outlier|Homepage|https://github.com/anukritigarg13/101703087_outliers/tree/v_1.0.0/101703087_outliers
100bot|Homepage|https://github.com/PeppyHare/100bot
100bot|Bug Reports|https://github.com/PeppyHare/100bot/issues
100bot|Source|https://github.com/PeppyHare/100bot/
101703088-outlier|Downloads|https://github.com/Anurag-Aggarwal/Outliers/archive/V-1.0.0.tar.gz
101703088-outlier|Homepage|https://github.com/Anurag-Aggarwal/Outliers
1000pip-climber-system-review|Homepage|https://bfca1bsjtxsz0qao2rakno2q6w.hop.clickbank.net/?tid=pypi
1000pip-climber-system-review|Bug Tracker|https://github.com/issues
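
The same table can also be queried from Python. This sketch uses an in-memory database so that it runs without pypi.db; the column names come from the query above, while the column types and the three sample rows (copied from the output above) are assumptions about the schema:

```python
import sqlite3

# In-memory stand-in for pypi.db; the TEXT column types are an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE package_urls (package_name TEXT, name TEXT, url TEXT)")
conn.executemany(
    "INSERT INTO package_urls VALUES (?, ?, ?)",
    [
        ("100bot", "Homepage", "https://github.com/PeppyHare/100bot"),
        ("100bot", "Bug Reports", "https://github.com/PeppyHare/100bot/issues"),
        ("100bot", "Source", "https://github.com/PeppyHare/100bot/"),
    ],
)

# Look up every named URL for one package.
rows = conn.execute(
    "SELECT name, url FROM package_urls WHERE package_name = ? ORDER BY name",
    ("100bot",),
).fetchall()

for name, url in rows:
    print(f"{name}: {url}")
```

Storing the name next to each URL is what makes a lookup like this possible, for example to find only the bug tracker links.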

hugovk commented 2 years ago

You're welcome, thanks for pypi-data!

There were some questions too, which I can deal with in follow-ups if you like:

Should we run the tests in GitHub Actions?

And:

sethmlarson commented 2 years ago

I'm ++ on everything you've suggested. Let's run tests in GHA and remove the URLs that are on the way out or not useful. Docs_url can be kept with a name of "docs_url" maybe?

hugovk commented 2 years ago

Yep, sounds good.