simonw / scrape-open-data

Scrape various open data directories to create an index of what's available out there
https://open-data.datasette.io

scrape-open-data

Scrapes every available dataset from Socrata and stores them as newline-delimited JSON in this repository, to track changes over time through Git scraping.
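Git scraping works best when the saved files are deterministic, so that a commit diff shows only real changes. The following is a minimal sketch of writing and reading newline-delimited JSON in that spirit. It is not taken from scrape_socrata.py: the helper names and the "id" field used for ordering are assumptions made for illustration.

```python
import json
from pathlib import Path


def write_jsonl(records, path):
    """Write records to path as newline-delimited JSON, one object per line."""
    # Assumption: every record has an "id" field to sort on (hypothetical).
    # Sorting the records and the keys inside each object keeps the output
    # stable between runs, so Git diffs reflect genuine changes only.
    ordered = sorted(records, key=lambda r: str(r["id"]))
    lines = [json.dumps(r, sort_keys=True) for r in ordered]
    Path(path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")


def read_jsonl(path):
    """Read a newline-delimited JSON file back into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Because each record occupies exactly one line, a changed dataset shows up as a single changed line in the Git history.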

The resulting database is deployed to https://open-data.datasette.io/

scrape_socrata.py

Run this command to scrape the data from Socrata and save it in the socrata/ directory:

python scrape_socrata.py socrata/

Add --stats to include page view and download statistics in separate files.

Add --verbose for verbose output.
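For readers unfamiliar with how such a command line is usually wired up, here is a hedged sketch of an argparse parser that accepts the same arguments. The positional name "directory", the help strings, and the build_parser helper are assumptions; the actual script may be structured differently.

```python
import argparse


def build_parser():
    """Build a parser for: <directory> [--stats] [--verbose]."""
    parser = argparse.ArgumentParser(description="Scrape Socrata (illustrative sketch)")
    # Hypothetical positional name; the README only shows a directory path here.
    parser.add_argument("directory", help="directory to save scraped files in")
    parser.add_argument("--stats", action="store_true",
                        help="also save page view and download statistics")
    parser.add_argument("--verbose", action="store_true", help="verbose output")
    return parser


# Parse an example argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["socrata/", "--stats"])
print(args.directory, args.stats, args.verbose)  # prints: socrata/ True False
```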

build_socrata_db.py

Run this command to build a SQLite database from the .jsonl files in socrata/:

python build_socrata_db.py socrata.db socrata
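To illustrate the general idea of turning .jsonl files into a SQLite database, here is a minimal sketch that uses only the Python standard library. It does not reproduce build_socrata_db.py: the build_db function, the "records" table, and its two-column schema are assumptions chosen to keep the example short.

```python
import json
import sqlite3
from pathlib import Path


def build_db(db_path, jsonl_dir):
    """Load every .jsonl file in jsonl_dir into a single SQLite table."""
    conn = sqlite3.connect(db_path)
    # Hypothetical schema: the source file name plus the raw JSON of each line.
    conn.execute("CREATE TABLE IF NOT EXISTS records (source TEXT, data TEXT)")
    for path in sorted(Path(jsonl_dir).glob("*.jsonl")):
        with open(path, encoding="utf-8") as f:
            rows = [(path.name, line.strip()) for line in f if line.strip()]
        # Fail early if any line is not well-formed JSON.
        for _, raw in rows:
            json.loads(raw)
        conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
    conn.commit()
    return conn
```

Under these assumptions, calling build_db("socrata.db", "socrata") would mirror the command above; a real build script would more likely map JSON fields to proper columns.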