spulec closed this 10 years ago
Well now you've really made things interesting! The GAO's not an IG, and in fact it's not even in the same branch of government -- the GAO is a legislative agency that reports to Congress. But it also does very much the same kind of work, and I freely mingled IG and GAO reports when I integrated them into Scout (in a past life).
Doesn't seem worth starting up a whole separate project just because it's not technically what the project name says. So yeah, let's integrate GAO.
In that past life, I also once wrote a GAO scraper in Ruby, and in so doing I discovered that GAO has an undocumented content API:
The ID can be found by scraping the result list, and then you can use the API to get precise details for each report. I think it's probably the less brittle way to get the bulk of the data. What do you think?
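To make the two-step idea concrete, here's a rough sketch in Python, not a working scraper. It pulls report IDs out of a result-list page, then builds one detail URL per ID. Two things are assumptions on my part: that IDs look like `GAO-15-123` and show up in link hrefs, and the `url_template` argument, which is a hypothetical placeholder since the real endpoint isn't reproduced here.

```python
import re
from html.parser import HTMLParser

# Assumption: report IDs look like "GAO-15-123" (optionally with a letter
# suffix) and appear somewhere in the href of each result-list link.
REPORT_ID = re.compile(r"GAO-\d{2}-\d+[A-Z]*")


class ResultListParser(HTMLParser):
    """Collects unique report IDs from the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.report_ids = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = REPORT_ID.search(href)
        if match and match.group(0) not in self.report_ids:
            self.report_ids.append(match.group(0))


def extract_report_ids(html):
    """Step 1: scrape the result list for report IDs, in page order."""
    parser = ResultListParser()
    parser.feed(html)
    return parser.report_ids


def detail_urls(report_ids, url_template):
    """Step 2: build one detail URL per ID.

    url_template is a hypothetical placeholder containing "{id}";
    substitute the real API endpoint when using this.
    """
    return [url_template.format(id=report_id) for report_id in report_ids]
```

Fetching each detail URL is left out on purpose, so the sketch runs without network access and doesn't guess at the response format.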
Also, can I just say, it is so frustrating that GAO has HTTPS configured, but deliberately stops and redirects you down to HTTP.
...and I just got that this is the GAO's IG, not GAO itself. Right. OK. Will test.
Looks terrific, @spulec, thank you!
I guess I shouldn't be surprised, but the GAO has their ducks in a row.