CivicTechAtlanta / georgia-courtbot

Helping people remember to attend court to help break the cycle of fines and jail time

Update scraper to work around 200ish record limit #24

Closed by bbrewington 2 years ago

bbrewington commented 2 years ago

Scraper: https://github.com/codeforatlanta/georgia-courtbot/blob/main/data/dekalb_scraper.py

Example search on 2022-02-02 that hit the limit: [screenshot: Screen Shot 2022-02-02 at 7 00 35 PM]

bbrewington commented 2 years ago

Here's what we need to look for: there's a boolean attribute, "MaxResultsHit"

[screenshot: Screen Shot 2022-02-03 at 10 24 28 PM]

Maybe we could build in some kind of loop where it dials down the days parameter until "MaxResultsHit" is false:

https://github.com/codeforatlanta/georgia-courtbot/blob/35a3212c99770e7240565fdd1883e5343d29b319/data/dekalb_scraper.py#L111
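A rough sketch of what that loop could look like, just to make the idea concrete. Everything here except "MaxResultsHit" is an assumption for illustration: `fetch_all`, the `search(start, days)` callable, and the "Results" key are hypothetical names, not what dekalb_scraper.py actually uses.

```python
from datetime import date, timedelta
from typing import Callable, Dict, List


def fetch_all(
    search: Callable[[date, int], Dict],  # hypothetical: runs one search for `days` days from `start`
    start: date,
    total_days: int,
    initial_window: int = 7,
) -> List[dict]:
    """Collect results for [start, start + total_days), narrowing the
    window whenever the response reports MaxResultsHit."""
    results: List[dict] = []
    current = start
    end = start + timedelta(days=total_days)
    window = initial_window

    while current < end:
        # Never ask for more days than remain in the overall range.
        days = min(window, (end - current).days)
        response = search(current, days)

        if response["MaxResultsHit"] and days > 1:
            # Too many records for this window: halve it and retry the same start date.
            window = max(1, days // 2)
            continue

        # Either the window fit under the cap, or it is already a single day
        # and cannot be narrowed further (that day may still be truncated).
        results.extend(response["Results"])  # "Results" key is an assumption
        current += timedelta(days=days)

    return results
```

One caveat with this approach: if a single day still hits the limit, the days parameter alone can't fix it, so we'd need to split on something else (for example per judge) or at least log it.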

bbrewington commented 2 years ago

@abrie nice, thanks! Looks great: I ran it locally and there are now 383 cases for Judge Gregory Adams (previously there were 199). Also, the example cases provided by Rachel are now showing up.