unitedstates / inspectors-general

Collecting reports from Inspectors General across the US federal government.
https://sunlightfoundation.com/blog/2014/11/07/opengov-voices-opening-up-government-reports-through-teamwork-and-open-data/
Creative Commons Zero v1.0 Universal

Added osc.py, Office of Special Counsel. 1,137 reports since 2009 #275

Closed by lukerosiak 8 years ago

lukerosiak commented 8 years ago

I have created a scraper for Office of Special Counsel, an important entity that investigates whistleblower retaliation and other prohibited practices across all federal agencies. The scraper parses 1,137 reports going back to 2009.

Next, if you want it, I plan to contribute a module that will be the most effective way of solving your "incorporating manually FOIA'd reports" problem. It will be a scraper for GovernmentAttic.org, which has over 2,000 FOIA'd reports, mostly IG reports, and is regularly updated with new ones.

konklone commented 8 years ago

> I have created a scraper for Office of Special Counsel, an important entity that investigates whistleblower retaliation and other prohibited practices across all federal agencies. The scraper parses 1,137 reports going back to 2009.

This is outstanding. Thank you so much for this contribution! Yes, we will definitely accept this.

> Next, if you want it, I plan to contribute a module that will be the most effective way of solving your "incorporating manually FOIA'd reports" problem. It will be a scraper for GovernmentAttic.org, which has over 2,000 FOIA'd reports, mostly IG reports, and is regularly updated with new ones.

I definitely welcome that contribution too, though it will be a bit more complicated: in small part because it's an unofficial source, but in large part because the quality of the documents I've seen there tends to be really poor, so they will need a lot of OCRing. But it's also a huge trove of super relevant documents (including the names of a ton of unreleased IG reports), so it's definitely worth including here if you're going to write it.

konklone commented 8 years ago

@lukerosiak Mind tweaking your PR to address the build breakage? We use pyflakes, and it's flagged the `os` and `urljoin` imports as unused:

https://travis-ci.org/unitedstates/inspectors-general/builds/106132793
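For anyone unfamiliar with this kind of failure, here is a rough sketch of an unused-import check built on the standard library's `ast` module. It is only an approximation for illustration, not how pyflakes itself is implemented, and the `sample` module text (including the `report_ids` function and the `urllib.parse` import path) is hypothetical rather than the actual contents of osc.py.

```python
import ast


def unused_imports(source):
    """Return a sorted list of imported names never referenced in `source`."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import os.path` binds the top-level name `os`.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            # Covers bare names and the base of attribute access (os.path -> os).
            used.add(node.id)
    return sorted(imported - used)


# Hypothetical module text, NOT the real osc.py.
sample = '''
import os
import re
from urllib.parse import urljoin

def report_ids(html):
    return re.findall("report-[0-9]+", html)
'''

print(unused_imports(sample))
```

In this sample only `re` is referenced, so `os` and `urljoin` are reported. A real linter also handles scoping, `__all__`, star imports, and redefinitions, which this sketch ignores; the fix in either case is simply to delete the import lines that are not needed.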

lukerosiak commented 8 years ago

Of course, done. Thank you for building such an important resource. I will open an issue about GovernmentAttic.