Closed by cldellow 1 year ago
Maybe the Northwind DB?
It's under the MS-PL: https://web.archive.org/web/20170623074454/https://northwinddatabase.codeplex.com/license
Someone has a SQLite port of it: https://github.com/jpwhite3/northwind-SQLite3/issues
Also Chinook, a weird mix of ERP and ... music? https://www.sqlitetutorial.net/sqlite-sample-database/
https://openparliament.ca/data-download/ is a Postgres dump. We could maybe convert it, but it's 5 GB uncompressed, so probably not a great choice for a demo.
OTOH, it'd have lots of "real" data...
Almost certainly a bad idea: could we pull extracts of Wikipedia, e.g. using https://github.com/spencermountain/wtf_wikipedia ?
Could we instead write some SPARQL queries against Wikidata? SPARQL looks like it has a vertical learning curve...
CIA Factbook: https://github.com/factbook/factbook.sql
Cars: https://github.com/abhionlyone/us-car-models-data/blob/master/1992.csv
Very thin data, but many rows, plus a bonus JSON tag array.
maybe check r/datasets ?
USDA food: https://github.com/alyssaq/usda-sqlite
Early dumps of stack exchange sites: https://tejp.de/files/so/dbdump/
I think the early dumps of SE sites are an interesting way to go. They have:
My chief complaint is maybe that it's too small to be a really good test. I think I'd prefer like 50K rows in its biggest table. Ah well, let's see what happens.
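A quick way to check the "50K rows in its biggest table" bar against any candidate is to count rows per table. This is a minimal sketch using Python's built-in `sqlite3` module; the `table_row_counts` helper and the toy `posts`/`users` tables are made up for illustration and are not the schema of any dump mentioned here.

```python
import sqlite3


def table_row_counts(conn):
    """Return (table_name, row_count) pairs, largest table first."""
    names = [
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        # Skip SQLite's internal bookkeeping tables.
        if not row[0].startswith("sqlite_")
    ]
    counts = []
    for name in names:
        # Quote the identifier so unusual table names do not break the query.
        quoted = '"' + name.replace('"', '""') + '"'
        (n,) = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()
        counts.append((name, n))
    return sorted(counts, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy in-memory database standing in for a real dump (hypothetical schema).
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
    demo.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    demo.executemany("INSERT INTO posts (body) VALUES (?)", [("x",)] * 3)
    demo.executemany("INSERT INTO users (name) VALUES (?)", [("a",)])
    for name, n in table_row_counts(demo):
        print(f"{name}: {n} rows")
```

Pointing `sqlite3.connect` at a converted dump file instead of `:memory:` would give the real numbers.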
Yes, let's go with Stack Exchange. See https://github.com/cldellow/stackexchange-to-sqlite
https://github.com/cldellow/dux-demo, I haven't tried to automate things yet. Dunno if I'll stick with fly, we'll see.
This does make me want another, bigger database. The cooking DB is pretty zippy, something 10x as big would be a nice proof that things scale
superuser is ~14x the size... that's maybe a bit too much? Still need room for an FTS index, e.g.
OTOH, "annoying" is probably the right qualitative test for a big database.
Plus, if I eventually script this into a GH action, I'll stop noticing.
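On the "room for an FTS index" worry, one way to put a number on it is to measure the database size before and after building the index. A rough sketch, assuming a Python `sqlite3` build with FTS5 compiled in and a hypothetical `posts(id, body)` table (the real schema may differ):

```python
import sqlite3


def db_size_bytes(conn):
    """Size of the database as page_count * page_size."""
    (page_count,) = conn.execute("PRAGMA page_count").fetchone()
    (page_size,) = conn.execute("PRAGMA page_size").fetchone()
    return page_count * page_size


def add_fts_index(conn):
    """Build an external-content FTS5 index over posts.body (hypothetical schema)."""
    # content='posts' makes the index refer back to the posts table
    # rather than storing a second copy of the text.
    conn.execute(
        "CREATE VIRTUAL TABLE posts_fts "
        "USING fts5(body, content='posts', content_rowid='id')"
    )
    # The special 'rebuild' command populates the index from the content table.
    conn.execute("INSERT INTO posts_fts(posts_fts) VALUES ('rebuild')")
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO posts (body) VALUES (?)",
        [("how long should sourdough proof",), ("best pan for searing steak",)],
    )
    conn.commit()
    before = db_size_bytes(conn)
    add_fts_index(conn)
    after = db_size_bytes(conn)
    print(f"index overhead: {after - before} bytes")
```

Run against the cooking database, the before/after difference would give a rough multiplier to apply to superuser before committing to it.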
Maybe deploy to fly?
Resources:
A manual deploy is fine to start, although if we're doing something with DBs, it would be good to script that bit.
Is there an interesting, open source database we can use to demo? Ideally it'd have good faceting and searching... maybe pictures?