Generic SocrataScraper that can be configured for any dataset through a JSON object passed to its constructor.
Driver script that iterates through the JSON array in socrata-config.json and runs every scraper defined there.
The config file lets a single scraper define multiple endpoints and date/metric field names.
Right now this is configured to scrape Baltimore, Chicago, and New York City (both endpoints), but I think (hope) it can handle any of our current Socrata-backed cities.
I left a TODO in SocrataScraper.py where I think the metric conversion would fit well. Hopefully we can put most of our effort there now.
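For reviewers who want a picture of the config-driven flow, here is a rough sketch of how a driver could walk a JSON array and build one scraper per entry. The key names (`name`, `endpoints`, `url`, `date_field`, `metric_field`), the placeholder field values, and the `StubScraper` class are all assumptions made for illustration; they are not copied from socrata-config.json or SocrataScraper.py.

```python
import json

# Hypothetical config shape; the real keys in socrata-config.json may differ.
# The date_field / metric_field values are placeholders, not real column names.
SAMPLE_CONFIG = """
[
  {
    "name": "bal-crime",
    "endpoints": [
      {"url": "https://data.baltimorecity.gov/resource/4ih5-d5d5.json",
       "date_field": "DATE_COLUMN", "metric_field": "METRIC_COLUMN"}
    ]
  },
  {
    "name": "nyc-crime",
    "endpoints": [
      {"url": "https://data.cityofnewyork.us/resource/9s4h-37hy.json",
       "date_field": "DATE_COLUMN", "metric_field": "METRIC_COLUMN"},
      {"url": "https://data.cityofnewyork.us/resource/7x9x-zpz6.json",
       "date_field": "DATE_COLUMN", "metric_field": "METRIC_COLUMN"}
    ]
  }
]
"""


class StubScraper:
    """Stand-in for the real scraper class; it only records its config."""

    def __init__(self, config):
        self.name = config["name"]
        self.endpoints = config["endpoints"]

    def run(self):
        # The real scraper fetches data and writes CSVs; this stub makes
        # no network calls and just returns the URLs it would hit.
        return [endpoint["url"] for endpoint in self.endpoints]


def run_all(config_text):
    """Build and run one scraper per entry in the JSON array."""
    results = {}
    for entry in json.loads(config_text):
        scraper = StubScraper(entry)
        results[scraper.name] = scraper.run()
    return results


if __name__ == "__main__":
    for name, urls in run_all(SAMPLE_CONFIG).items():
        print(f"{name}: {len(urls)} endpoint(s)")
```

Keeping endpoints as a list under each scraper is one way a dataset split across two URLs, like New York City's, could be handled without a second scraper entry.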
Sample run:
```
Running scraper...
Hitting: https://data.baltimorecity.gov/resource/4ih5-d5d5.json
Total entries pulled: 102432
Wrote 4384 entries to data/bal-crime-2015.csv
Wrote 4483 entries to data/bal-crime-2016.csv
Wrote 520 entries to data/bal-crime-2017.csv
Constructed SocrataScraper for chi-crime
Running scraper...
Hitting: https://data.cityofchicago.org/resource/6zsd-86xi.json
Total entries pulled: 553699
Wrote 7786 entries to data/chi-crime-2015.csv
Wrote 7782 entries to data/chi-crime-2016.csv
Wrote 762 entries to data/chi-crime-2017.csv
Constructed SocrataScraper for nyc-crime
Running scraper...
Hitting: https://data.cityofnewyork.us/resource/9s4h-37hy.json
Total entries pulled: 468576
Hitting: https://data.cityofnewyork.us/resource/7x9x-zpz6.json
Total entries pulled: 945321
Wrote 12566 entries to data/nyc-crime-2015.csv
Wrote 12496 entries to data/nyc-crime-2016.csv
```