Closed simonw closed 4 years ago
I could fix this by putting REFRESH_DB
in a commit message:
But... doing so would lose the data I've collected in https://github-to-sqlite.dogsheep.net/github/dependents?_sort_desc=first_seen_utc concerning the first time each dependent repo was spotted.
I think the best fix is to download the github.db
database, manually fix it and then manually deploy it to Cloud Run from my laptop.
Manually fixing the database:
$ wget 'https://github-to-sqlite.dogsheep.net/github.db'
--2020-07-18 15:52:33-- https://github-to-sqlite.dogsheep.net/github.db
Resolving github-to-sqlite.dogsheep.net (github-to-sqlite.dogsheep.net)... 172.217.5.115
Connecting to github-to-sqlite.dogsheep.net (github-to-sqlite.dogsheep.net)|172.217.5.115|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14626816 (14M) [application/octet-stream]
Saving to: ‘github.db’
github.db 100%[============================================================================>] 13.95M 1.22MB/s in 18s
2020-07-18 15:52:53 (773 KB/s) - ‘github.db’ saved [14626816/14626816]
$ sqlite3 github.db
SQLite version 3.28.0 2019-04-15 14:49:49
Enter ".help" for usage hints.
sqlite> drop table tags;
sqlite> ^D
$ github-to-sqlite tags github.db simonw/datasette simonw/sqlite-utils dogsheep/healthkit-to-sqlite dogsheep/swarm-to-sqlite dogsheep/twitter-to-sqlite dogsheep/inaturalist-to-sqlite dogsheep/google-takeout-to-sqlite dogsheep/github-to-sqlite dogsheep/genome-to-sqlite dogsheep/pocket-to-sqlite dogsheep/hacker-news-to-sqlite dogsheep/dogsheep-photos
$ sqlite-utils tables github.db --counts
[{"table": "users", "count": 4048},
{"table": "repos", "count": 210},
...
{"table": "stars", "count": 4140},
{"table": "tags", "count": 188}]
$ sqlite-utils rows github.db tags
[{"repo": 107914493, "name": "0.45", "sha": "f1f581b7ffcd5d8f3ae6c1c654d813a6641410eb"},
{"repo": 107914493, "name": "0.45a5", "sha": "676bb64c877d73f8ff496cef4632f5a8a5a9283c"},
...
Deploying the fixed version like this:
$ gcloud config set run/region us-central1
$ gcloud config set project datasette-222320
$ datasette publish cloudrun /tmp/github.db \
-m demo-metadata.json \
--service github-to-sqlite \
--install=py-gfm \
--install='datasette-search-all>=0.3' \
--install='datasette-render-markdown>=1.1.2' \
--install=datasette-pretty-json \
--install=datasette-json-html \
--install=datasette-vega
https://github-to-sqlite.dogsheep.net/github/tags now shows a repo
column instead of a repo_id
column.
https://github.com/dogsheep/github-to-sqlite/runs/885773677
That's because I changed the name in #44. I thought this would be safe since no-one else could possibly be using this yet (it hadn't shipped in a release) but turns out I broke my demo!