unitedstates / inspectors-general

Collecting reports from Inspectors General across the US federal government.
https://sunlightfoundation.com/blog/2014/11/07/opengov-voices-opening-up-government-reports-through-teamwork-and-open-data/
Creative Commons Zero v1.0 Universal

[lsc] Rewrite scraper for new website #235

Closed by divergentdave 8 years ago

divergentdave commented 8 years ago

Fixes #233. Some older reports, especially those related to the "mapping project," have broken links. I plan on dealing with those as part of #204, by extracting the relevant files from https://www.oig.lsc.gov/images/mapping/mapping.zip.

konklone commented 8 years ago

Tested, and it works great. Do you have a sense of whether the unique IDs have changed between versions of the site/scraper?

divergentdave commented 8 years ago

I didn't pay too much attention to that, but I think most IDs stayed the same, while some changed. In most cases, we still use the report numbers, or failing that, the filenames. Both numbers and filenames were preserved, as far as I could tell. I did have to handle a new edge case between audit report 00-001 and special report 00-001, so there were some changes.
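
For illustration, the fallback described above (report number first, then filename, with extra handling when two report types share a number such as 00-001) could be sketched roughly as below. This is not the scraper's actual code: `derive_report_id`, its parameters, the `seen_ids` set, and the type-prefix scheme are all hypothetical names and choices made up for this example.

```python
import os
from urllib.parse import urlparse


def derive_report_id(report_number, report_url, report_type, seen_ids):
    """Hypothetical helper: choose a unique ID for one report.

    Prefers the report number, falls back to the filename (without its
    extension), and prefixes the report type when the ID is already taken.
    """
    if report_number:
        candidate = report_number.strip()
    else:
        # No report number available: use the filename from the URL path.
        filename = os.path.basename(urlparse(report_url).path)
        candidate = os.path.splitext(filename)[0]

    if candidate in seen_ids:
        # Assumed disambiguation scheme: prefix the report type so that,
        # e.g., an audit report and a special report can share a number.
        candidate = "{}-{}".format(report_type, candidate)

    seen_ids.add(candidate)
    return candidate


seen = set()
audit_id = derive_report_id("00-001", "https://example.com/a.pdf", "audit", seen)
special_id = derive_report_id("00-001", "https://example.com/b.pdf", "special", seen)
fallback_id = derive_report_id(None, "https://example.com/files/annual_plan.pdf", "other", seen)
```

One caveat with this sketch: which report gets the prefix depends on the order the reports are seen, so a real scraper would probably want a deterministic rule to keep IDs stable between runs.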