codeforboston / clean-slate-data

MIT License

Investigate alternatives to PA as proxy for analysis #58

Closed by mikemahoney218 4 years ago

mikemahoney218 commented 4 years ago

From work done on #31, it appears that six states (WV, VT, CA, NJ, KY, NH) are consistently better matches than PA, judging from FBI data alone. We should look into obtaining data from these states (particularly VT and NH) in case one of them can serve as a better proxy than PA.

mikemahoney218 commented 4 years ago

I've dibs'd NJ, happy for other collaborators to work on other states

mikemahoney218 commented 4 years ago

Update on Kentucky: it's possible to search for cases via https://kcoj.kycourts.net/CourtNet/Search/Index , but it looks like results are returned via async PDF generation (and cost money), so I doubt they'd publicly expose the API used to generate the PDFs

mikemahoney218 commented 4 years ago

Update on New Jersey: it's possible to search dockets, but the search is hidden behind two layers of CAPTCHAs, and the site logs you out automatically after six minutes. Could spend more time looking for the API endpoint though, in case they don't protect it once you've auth'd in

mikemahoney218 commented 4 years ago

Update on New Hampshire: Criminal cases post-2005 are on PACER, which charges a fee of $0.10 per page (and you download PDFs, not JSON or any sensible format)
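
For a rough sense of what that per-page fee adds up to, here's a minimal sketch of a cost estimate. The function name and the page counts are hypothetical, not real docket sizes. The optional `cap_cents` parameter is an assumption for modelling a per-document fee cap and should be checked against the current PACER fee schedule before relying on it.

```python
def estimate_pacer_cost_cents(page_counts, per_page_cents=10, cap_cents=None):
    """Return the total fee in cents for documents with the given page counts.

    per_page_cents: fee per page in cents (10 = the $0.10 quoted above).
    cap_cents: optional per-document cap in cents (assumption, verify first).
    """
    total = 0
    for pages in page_counts:
        if pages < 0:
            raise ValueError("page counts must be non-negative")
        fee = pages * per_page_cents
        if cap_cents is not None:
            # Apply the assumed per-document cap.
            fee = min(fee, cap_cents)
        total += fee
    return total


# Hypothetical example: three documents of 4, 12, and 50 pages.
cost = estimate_pacer_cost_cents([4, 12, 50])
print(f"${cost / 100:.2f}")
```

Working in integer cents avoids floating-point rounding when summing over many documents.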

mikemahoney218 commented 4 years ago

Pausing on state-by-state investigations in order to check out https://github.com/freelawproject/juriscraper further

mikemahoney218 commented 4 years ago

See sub-issues #100, #101, #102, #103, #104, #105

jeremylang commented 4 years ago

No longer need a proxy state now that MA data is available.