abrie closed this 2 years ago
Could you check to see if the fields shown in issue https://github.com/codeforatlanta/georgia-courtbot/issues/5 are captured?
I noticed we're grabbing CASE_ID but not CASE_NUMBER. The latter is important because that's what someone will enter in the SMS / web form sign-up.
I also thought JUDICIAL_OFFICER would be useful, since we're using that for (pagination?), in case we need to re-run a chunk of requests at some point.
Ran a test locally outputting to CSV, and it worked great:
python3 dekalb_scraper.py --output csv > dekalb_scrape_202201170827.csv
^ uploaded that file to Google Sheets...will share the link in Slack
Going to handle the extra-fields request above in a separate PR.
This PR adds CSV output as an option to the scraper. CSV is well suited for importing into a database.
When running the scraper, specify the output format using the '--output' argument:
python dekalb_scraper.py --output {json,csv}
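For context, the flag could be wired up roughly like this. This is a minimal sketch, not the scraper's actual internals: the `write_cases` helper, the hard-coded sample case, and the field names are all illustrative assumptions.

```python
import argparse
import csv
import json
import sys

def write_cases(cases, fmt, stream):
    """Write a list of case dicts to `stream` as CSV (header + rows) or JSON."""
    if fmt == "csv":
        writer = csv.DictWriter(stream, fieldnames=list(cases[0].keys()))
        writer.writeheader()
        writer.writerows(cases)
    else:
        json.dump(cases, stream, indent=2)

def main():
    parser = argparse.ArgumentParser(description="Dekalb scraper (sketch)")
    parser.add_argument("--output", choices=["json", "csv"], default="json")
    args = parser.parse_args()
    # Stand-in for the real scraping logic; fields are hypothetical.
    cases = [{"CASE_ID": "12345", "JUDICIAL_OFFICER": "Smith"}]
    write_cases(cases, args.output, sys.stdout)

if __name__ == "__main__":
    main()
```

Writing to stdout (rather than a file) is what makes the shell pipeline below possible.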
Here is an example that populates the 'cases' table in a sqlite3 database:
python dekalb_scraper.py --output csv | sqlite3 database.sqlite3 ".import --csv /dev/stdin cases"
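The same import can also be done from Python with the standard-library sqlite3 and csv modules, which avoids depending on the sqlite3 CLI. A hedged sketch; the `import_csv` helper and the assumption that the CSV header row supplies the column names are mine, not part of this PR:

```python
import csv
import io
import sqlite3

def import_csv(conn, csv_text, table="cases"):
    """Create `table` from the CSV header row and bulk-insert the data rows,
    loosely mirroring the sqlite3 CLI's `.import --csv` behavior."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{c}"' for c in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({cols})")
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", data)
    conn.commit()
```

One difference from the CLI pipeline: `.import --csv` into an existing table treats the first line as data, while this sketch always consumes it as the header.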