Closed mikemahoney218 closed 4 years ago
The year is hardcoded to 2018 in this. A separate script in the same repo was used to establish the range of case numbers so that a random sample could be drawn from it: https://github.com/EnMedina/Clean-Slate/blob/master/datascraper/RandomCaseGetter.py
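For getting additional years, a rough sketch of how the hardcoded year and the range-based random sample could be parameterized is below. This is not the code from the notebook or from RandomCaseGetter.py; the function name `sample_case_numbers`, the `max_number_by_year` mapping, and the assumption that case numbers run sequentially from 1 within each year are all hypothetical and would need to be checked against the real numbering scheme.

```python
import random


def sample_case_numbers(years, max_number_by_year, sample_size, seed=None):
    """Draw a random sample of case sequence numbers for each year.

    Assumes (hypothetically) that case numbers within a year run from 1 up to
    a known maximum, which would have to be established separately per year.
    """
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    sampled = {}
    for year in years:
        upper = max_number_by_year[year]
        # Never ask for more numbers than exist in the range.
        k = min(sample_size, upper)
        # Sample without replacement so no case number is requested twice.
        sampled[year] = sorted(rng.sample(range(1, upper + 1), k))
    return sampled


if __name__ == "__main__":
    # Placeholder upper bounds for illustration only, not real case counts.
    placeholder_bounds = {2017: 5000, 2018: 6000}
    print(sample_case_numbers([2017, 2018], placeholder_bounds, 5, seed=0))
```

Passing the years in as an argument means the same code can be rerun for each additional year once its range of numbers is known.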
No longer need a proxy state now that MA data is available.
Our PA dataset was scraped from public-facing sources using code stored in a Jupyter notebook (which @sheldonchan has access to). We're hoping to find that notebook, in order to get PA data for additional years.
We should also quantify how many records we are hoping to scrape, and from what timespan.