Closed garethr closed 4 years ago
Splitting assets out into a separate table totally makes sense to me. They can still be fetched as part of the releases command.
None of my own releases use assets (they are all pushed to PyPI instead) but I spotted that your project here uses assets, so I'll test against that: https://github.com/instrumenta/conftest/releases/tag/v0.18.0
github-to-sqlite releases releases.db instrumenta/conftest
Each asset looks like this:
{
    "url": "https://api.github.com/repos/instrumenta/conftest/releases/assets/11811946",
    "id": 11811946,
    "node_id": "MDEyOlJlbGVhc2VBc3NldDExODExOTQ2",
    "name": "checksums.txt",
    "label": "",
    "uploader": {
        "login": "garethr",
        "id": 2029,
        "node_id": "MDQ6VXNlcjIwMjk=",
        "avatar_url": "https://avatars2.githubusercontent.com/u/2029?v=4",
        "gravatar_id": "",
        "url": "https://api.github.com/users/garethr",
        "html_url": "https://github.com/garethr",
        "followers_url": "https://api.github.com/users/garethr/followers",
        "following_url": "https://api.github.com/users/garethr/following{/other_user}",
        "gists_url": "https://api.github.com/users/garethr/gists{/gist_id}",
        "starred_url": "https://api.github.com/users/garethr/starred{/owner}{/repo}",
        "subscriptions_url": "https://api.github.com/users/garethr/subscriptions",
        "organizations_url": "https://api.github.com/users/garethr/orgs",
        "repos_url": "https://api.github.com/users/garethr/repos",
        "events_url": "https://api.github.com/users/garethr/events{/privacy}",
        "received_events_url": "https://api.github.com/users/garethr/received_events",
        "type": "User",
        "site_admin": false
    },
    "content_type": "text/plain; charset=utf-8",
    "state": "uploaded",
    "size": 600,
    "download_count": 2,
    "created_at": "2019-03-30T16:56:44Z",
    "updated_at": "2019-03-30T16:56:44Z",
    "browser_download_url": "https://github.com/instrumenta/conftest/releases/download/v0.1.0/checksums.txt"
}
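For illustration only, here is a minimal Python sketch of how one asset object like the one above might be flattened into a single table row. The function name asset_to_row, the release column, and the choice to keep only the uploader's id are assumptions made for this example, not how github-to-sqlite necessarily does it.

```python
def asset_to_row(asset: dict, release_id: int) -> dict:
    """Flatten one asset dict from the releases API into a flat row.

    Hypothetical helper: the nested "uploader" object is replaced by the
    uploader's numeric id, and a "release" column points back at the
    release the asset belongs to.
    """
    # Copy every top-level field except the nested uploader object.
    row = {key: value for key, value in asset.items() if key != "uploader"}
    uploader = asset.get("uploader") or {}
    row["uploader"] = uploader.get("id")
    row["release"] = release_id
    return row


# Trimmed-down copy of the asset shown above.
sample_asset = {
    "id": 11811946,
    "name": "checksums.txt",
    "uploader": {"login": "garethr", "id": 2029},
    "size": 600,
    "download_count": 2,
}

# The release id 1 is a made-up value for the example.
row = asset_to_row(sample_asset, release_id=1)
```

Keeping only the uploader id assumes the full user record is stored elsewhere (for example in a users table); if it is not, the nested object could be kept as JSON instead.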
That looks great, thanks!
The releases command extracts the releases table, but data about the individual assets is locked up in the JSON document in the assets field. My main interest is in individual and aggregate download counts. I was wondering whether creating a new table with a record per asset would be useful? If so, I'm happy to send a PR when I get a moment. Do you have opinions about that simply being part of the releases command, or would you prefer a separate command as well?
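As a rough sketch of what a record-per-asset table would enable, the Python snippet below builds an in-memory SQLite table and computes individual and aggregate download counts. The table layout, column names, and all row values are assumptions invented for the example; they are not taken from github-to-sqlite's actual schema or from real release data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per asset, with a "release" column
# pointing at the release the asset belongs to.
conn.execute(
    "CREATE TABLE assets ("
    "id INTEGER PRIMARY KEY, release INTEGER, name TEXT, download_count INTEGER)"
)

# Made-up rows purely for illustration.
rows = [
    (1, 100, "checksums.txt", 2),
    (2, 100, "tool_Linux_x86_64.tar.gz", 40),
    (3, 101, "checksums.txt", 5),
]
conn.executemany("INSERT INTO assets VALUES (?, ?, ?, ?)", rows)

# Aggregate download counts per release.
per_release = dict(
    conn.execute(
        "SELECT release, SUM(download_count) FROM assets GROUP BY release"
    )
)

# Aggregate download count across every asset.
total = conn.execute("SELECT SUM(download_count) FROM assets").fetchone()[0]
```

With the counts in their own column, both queries are plain SQL rather than JSON extraction against the assets field.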