Closed spulec closed 10 years ago
Checklist from the README:
It looks like the IG landing URLs also have an HTML "text" link. Is there some way we can capture this link too? Maybe we should introduce an 'alt_url' field?
@audiodude, really nice feedback. As for capturing HTML links, some anecdotal checking on my part shows reports using one or the other, and this is capturing whichever is used as its `url` field. Can you point to one with a URL that's getting overlooked?
Just randomly found one that has a "View Text" link:
http://oig.ssa.gov/audits-and-investigations/audit-reports/A-01-07-27109
It just seems like the project has always erred on the side of getting more data and it's a shame to leave these links on the `<table>`.
Never mind, I see them all here.
It looks like the "Text" and "Summary Text" links are identical (probably a bug)?
I am more concerned about getting the descriptive text on investigation landing pages like this one -- they are press releases, but also the only material released about the investigation. It'd be appropriate as a `summary` or `description` field (have we used one of them before?).
Right now, this investigation has this data:
{
"agency": "ssa",
"agency_name": "Social Security Administration",
"file_type": "html",
"inspector": "ssa",
"inspector_url": "http://oig.ssa.gov",
"landing_url": "http://oig.ssa.gov/audits-and-investigations/investigations/2-south-carolina-women-sentenced-their-roles-separate",
"published_on": "2013-10-17",
"report_id": "2-south-carolina-women-sentenced-their-roles-separate",
"title": "Two South Carolina Women Sentenced for Roles in Separate Social Security Fraud Cases",
"type": "report",
"url": "http://oig.ssa.gov/audits-and-investigations/investigations/2-south-carolina-women-sentenced-their-roles-separate",
"year": 2013
}
Ideally, `landing_url` and `url` are never identical. I think the descriptive text should be captured, but the report marked as `unreleased` -- as it stands, this investigation is a FOIA lead for anyone interested in the full report.
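A rough sketch of what that could look like, not the scraper's actual code. The helper name is hypothetical, and dropping `url` and `file_type` for unreleased reports is an assumption on my part; the `unreleased` flag and `summary` field are the ones suggested above.

```python
def mark_unreleased_if_no_document(report, summary_text=None):
    """Return a copy of `report`, flagged as unreleased when `url`
    merely repeats `landing_url` (i.e. no separate document exists).

    Hypothetical helper: the field handling here is an assumption,
    not an established convention of the project.
    """
    result = dict(report)  # leave the caller's dict untouched

    if result.get("url") == result.get("landing_url"):
        result["unreleased"] = True
        # Assumption: with no document to point at, these two fields
        # would only be misleading, so drop them.
        result.pop("url", None)
        result.pop("file_type", None)

    if summary_text:
        # Keep the press-release text as the report's summary.
        result["summary"] = summary_text.strip()

    return result


# Usage with a trimmed-down version of the record above.
landing = (
    "http://oig.ssa.gov/audits-and-investigations/investigations/"
    "2-south-carolina-women-sentenced-their-roles-separate"
)
report = {
    "inspector": "ssa",
    "file_type": "html",
    "landing_url": landing,
    "url": landing,
    "year": 2013,
}
cleaned = mark_unreleased_if_no_document(report, "  Descriptive text here.  ")
```

With this, the investigation keeps its `landing_url` and gains a `summary`, while downstream code can tell from `unreleased` that there is no report file to fetch.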
Other than adding the 'extra' report urls, I believe I have addressed the rest of the comments in my last two commits.
It would be nice to add the other linked documents as extra fields even if nothing downstream is currently using them. I think this is the second or third time something like this has come up so it's probably worth establishing a standard.
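One possible shape for such a standard, purely as a sketch: collect every linked document other than the primary one into a single dictionary keyed by link label. The function name, the idea of an `extra_urls`-style field, and the example URLs are all hypothetical; nothing downstream reads this today.

```python
from urllib.parse import urljoin


def collect_extra_urls(links, base_url, primary_url):
    """Build a {label: absolute_url} dict of secondary documents.

    `links` is an iterable of (label, href) pairs scraped from the
    landing page. The primary report URL is skipped, and so is any
    link that repeats a URL already seen (e.g. two labels pointing
    at the same file).
    """
    extras = {}
    seen = {primary_url}
    for label, href in links:
        absolute = urljoin(base_url, href)
        if absolute in seen:
            continue
        seen.add(absolute)
        # Normalize "Summary Text" -> "summary_text" for use as a key.
        key = label.strip().lower().replace(" ", "_")
        extras[key] = absolute
    return extras


# Usage with placeholder URLs (not real report locations).
links = [
    ("Full Report", "/files/report.pdf"),
    ("Text", "/files/report.txt"),
    ("Summary Text", "/files/report.txt"),  # duplicate of "Text"
]
extra_urls = collect_extra_urls(
    links,
    base_url="https://example.gov/reports/some-report",
    primary_url="https://example.gov/files/report.pdf",
)
```

Keeping the first label seen for a duplicated URL is an arbitrary choice; it would cover the case noted above where the "Text" and "Summary Text" links are identical.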
:+1: Thanks, @spulec!
Adds the OIG for the Social Security Administration.