Build a directory full of files into a SQLite database
⚠️ Early alpha preview. Everything about this tool is likely to change.
Install this tool using pip
or pipx
:
pipx install datasette-build
This will provide the datasette-build
CLI application.
You can also install it as a Datasette plugin. First install Datasette, then run:
datasette install datasette-build
This will provide a datasette build ...
command that works the same as the datasette-build
CLI application.
Or you can install it as a plugin for sqlite-utils. With that installed, run this:
sqlite-utils install datasette-build
Now you can access the tool as sqlite-utils build ...
The datasette-build
(or datasette build
or sqlite-utils build
) command takes two arguments: a path to a SQLite database file and a path to a directory containing files to be loaded into that database:
datasette-build mydatabase.db myfiles/
The myfiles/
folder can contain a mixture of CSV, TSV and JSON files. Each file will be loaded into a table in the mydatabase.db
SQLite database.
The database file will be created if it does not already exist.
Consider a myfiles/cities.csv
file like this:
id,name,latitude,longitude,country
nyc,New York City,40.7128,-74.006,US
lon,London,51.5074,-0.1278,GB
tok,Tokyo,35.6895,139.6917,JP
par,Paris,48.8566,2.3522,FR
ber,Berlin,52.52,13.405,DE
syd,Sydney,-33.8688,151.2093,AU
cai,Cairo,30.0444,31.2357,EG
rio,Rio de Janeiro,-22.9068,-43.1729,BR
mos,Moscow,55.7558,37.6173,RU
mum,Mumbai,19.076,72.8777,IN
Since this has a id
column the primary key for the table will be set to id
. Without an id
column the primary key will not be defined.
A myfiles/counties.tsv
file could look like this:
id name population
US United States 331002651
GB United Kingdom 67886011
JP Japan 126476461
FR France 65273511
DE Germany 83783942
AU Australia 25499884
EG Egypt 102334404
BR Brazil 212559417
RU Russia 145934462
IN India 1380004385
And a myfiles/museums.json
file like this:
[
{
"id": 1,
"name": "Metropolitan Museum of Art",
"city_id": "nyc"
},
{
"id": 2,
"name": "British Museum",
"city_id": "lon"
}
]
Running datasette-build mydatabase.db myfiles/
will create a SQLite database file containing three tables: cities
, counties
and museums
. The schema will look like this:
CREATE TABLE [museums] (
[id] INTEGER PRIMARY KEY,
[name] TEXT,
[city_id] TEXT
);
CREATE TABLE "cities" (
[id] TEXT PRIMARY KEY,
[name] TEXT,
[latitude] FLOAT,
[longitude] FLOAT,
[country] TEXT
);
CREATE TABLE "countries" (
[id] TEXT PRIMARY KEY,
[name] TEXT,
[population] INTEGER
);
To set up this plugin locally, first checkout the code. Then create a new virtual environment:
cd datasette-build
python3 -m venv venv
source venv/bin/activate
Now install the dependencies and test dependencies:
pip install -e '.[test]'
To run the tests:
pytest