lahoffm closed this issue 6 years ago
Please also add a README.md explaining how to install/run it and basic output to expect.
I got this email from Dena Bearden DBearden@columbusga.org: "Our docket shows the day of arrest. If you needed more info you would need to do an open request."
I'll work on this one
If you click "open in new tab" on the next page button you can go to the individual pages of the docket as a URL without having to simulate clicks. https://ccgapps1.columbusga.org/appl/MCSOJailInmateInformation.nsf/Web14DayIntake?OpenView&Start=1&Count=10 https://ccgapps1.columbusga.org/appl/MCSOJailInmateInformation.nsf/Web14DayRelease?OpenView&Start=1&Count=10
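A minimal sketch of how those page URLs could be generated without simulating clicks. It assumes `Start` is the 1-based index of the first row on a page and advances by `Count`, which matches the two URLs above but has not been confirmed against the site. The helper name `docket_page_urls` and the `total_rows` parameter are hypothetical.

```python
# Base of the docket views, taken from the URLs quoted above.
BASE = "https://ccgapps1.columbusga.org/appl/MCSOJailInmateInformation.nsf"


def docket_page_urls(view, total_rows, count=10):
    """Return one URL per docket page for the given view.

    Assumption: Start is the 1-based row index of the first entry on the
    page, so consecutive pages are Start=1, 11, 21, ... for Count=10.
    """
    return [
        f"{BASE}/{view}?OpenView&Start={start}&Count={count}"
        for start in range(1, total_rows + 1, count)
    ]


# Example: 25 intake rows at 10 rows per page gives three page URLs.
for url in docket_page_urls("Web14DayIntake", total_rows=25):
    print(url)
```

The same function should work for the release docket by passing `"Web14DayRelease"` as the view name.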
The Muscogee County scraper is going pretty well. I should have a functional version in the next few days when I have some free time.
I just finished creating a functional version. It might need a few tweaks.
What kind of tweaks? What stuff is not yet in CONTRIBUTING.md format? It would help if you gave us a list of things that remain to be done.
I or @rimjieun will review your code in detail at some point.
I tweaked it a little after submitting that comment and fixed most of what I had in mind. There are still a couple of things that might become issues in the future:
1. The page it scrapes changed URL on me once before, so it probably needs to navigate from the menu page instead of using the current hardcoded URLs.
2. The notes field is not implemented yet. My functions typically return empty strings when they encounter something unexpected. Whenever that happens, the scraper could put a message in the notes field.
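One possible shape for the notes idea, as a sketch only: a small wrapper that still returns an empty string on unexpected input but also records why. The names `parse_field`, `parser`, and `notes` are hypothetical and are not taken from the actual Muscogee scraper.

```python
def parse_field(raw, parser, field_name, notes):
    """Apply parser to raw; on failure return "" and record a note.

    notes is a plain list that the caller later joins into the row's
    notes column (hypothetical layout, not the project's real schema).
    """
    try:
        return parser(raw)
    except (ValueError, TypeError, AttributeError) as err:
        notes.append(f"{field_name}: could not parse {raw!r} ({err})")
        return ""


# Example: one good value and one unexpected value for the same field.
notes = []
good = parse_field("42", lambda s: str(int(s)), "age", notes)
bad = parse_field("n/a", lambda s: str(int(s)), "age", notes)
notes_column = "; ".join(notes)  # what would be written to the notes field
```

Collecting notes in a list keeps the existing "return empty string" behaviour intact, so callers that ignore the notes are unaffected.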
Finally got to reviewing Muscogee. Everything looks good for the most part. Just a few minor things I noticed:
data folder.
Also, @jttew, can you provide an example in the README.md for writing the chromedriver path? I had a bit of trouble running the scraper at first because I tried passing the path, although I ended up not using any path (maybe because I already had it set up in my environment variables).
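A hedged sketch of what such a README example might cover: use an explicit chromedriver path if one is given, otherwise fall back to whatever is on `PATH`, which would explain why the scraper also ran with no path at all. The helper name `resolve_chromedriver` is hypothetical, and how the scraper actually passes the path to Selenium is not shown in this thread.

```python
import os
import shutil


def resolve_chromedriver(explicit_path=None):
    """Return a chromedriver path, or None if none can be found.

    An explicit path wins; otherwise look for "chromedriver" on PATH,
    which is where Selenium looks when no path is supplied.
    """
    if explicit_path:
        return os.path.abspath(os.path.expanduser(explicit_path))
    return shutil.which("chromedriver")


driver_path = resolve_chromedriver()
# With Selenium 3 the path would typically be passed as
#   webdriver.Chrome(executable_path=driver_path)
# and with Selenium 4 as
#   webdriver.Chrome(service=Service(driver_path))
# Which one applies depends on the installed Selenium version.
```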
If there is no unique URL, I just put the same URL for everything: the main county URL. If severity is never provided for any charges, it is OK to leave it blank.
I changed the CSV output to go to the data folder, fixed the URL typo, and decided to add the chromedriver to the project files in the Muscogee folder.
Accidentally used my other account to comment earlier...
Submitted PR for current time stamp fix.
And everything looks good!
Starting data collection, multiple times per day.
As for the URL changing, I'll monitor the output daily and change the URL in the code myself. If it changes frequently enough to be annoying, I might ask if you can add the extra navigation code.
Not closing issue till it passes #18 but I consider it done barring any bugs.
Thank you for your help @jttew!
Make webscraper that spits out a CSV.
Please make a subfolder under src/webscraper that is the county name and contains all your code (Python 3.6) so no merge conflicts.
https://www.columbusga.org/sheriff/InmateSearch.htm
They have a 14 day intake docket and 14 day release docket. For current roster they just have an inmate name search.