unitedstates / inspectors-general

Collecting reports from Inspectors General across the US federal government.
https://sunlightfoundation.com/blog/2014/11/07/opengov-voices-opening-up-government-reports-through-teamwork-and-open-data/
Creative Commons Zero v1.0 Universal

Missing published dates documentation #123

Closed by spulec 10 years ago

spulec commented 10 years ago

As we get down to some of the last agencies, more and more seem to have incomplete information around published dates. It would be nice to come up with an "order of operations" for how to deal with these websites.

Some possible ideas in no particular order:

Am I missing any? Thoughts on an order?

If we can come to some agreement, I'm happy to write up some additional docs.

konklone commented 10 years ago

Barring a refactor to support downloading PDFs before validating metadata, the first 3 definitely come before the last 2. But the order of those first 3 seems completely determined by the quality of the IG's data. For some IGs, Last-Modified is just too often wrong to be acceptable, and some IGs have too many reports that'd need hard-coding.

So I think it's two classes of rules, with instructions to analyze each IG to decide which rule in the first class should be tried first.
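A rough sketch of what "pick the order per IG" could look like in Python. It is not the project's actual code: `from_hardcoded`, `from_last_modified`, `published_on`, and the `HARDCODED_DATES` table are hypothetical names, and the report ID is made up. It only covers the two sources named above (hard-coding and the `Last-Modified` header).

```python
from datetime import date
from email.utils import parsedate_to_datetime

# Hypothetical table of hand-entered dates, keyed by report ID.
HARDCODED_DATES = {"report-001": date(2014, 3, 1)}


def from_hardcoded(report_id, headers):
    """Return a hand-entered date for this report, or None."""
    return HARDCODED_DATES.get(report_id)


def from_last_modified(report_id, headers):
    """Return the date in the Last-Modified header, or None if absent or unparseable."""
    value = headers.get("Last-Modified")
    if not value:
        return None
    try:
        return parsedate_to_datetime(value).date()
    except (TypeError, ValueError):
        # Older Pythons raise TypeError on bad input, newer ones ValueError.
        return None


def published_on(report_id, headers, strategies):
    """Try each strategy in the order given and return the first date found."""
    for strategy in strategies:
        found = strategy(report_id, headers)
        if found is not None:
            return found
    return None


# An IG whose Last-Modified headers are trustworthy might list that strategy
# first; one with unreliable headers might leave it out entirely.
order = [from_hardcoded, from_last_modified]
headers = {"Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT"}
print(published_on("report-002", headers, order))
```

Each scraper would pass its own `strategies` list, so the ordering stays a per-IG decision rather than a global one.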

spulec commented 10 years ago

Okay, so generally:

I think that actually covers all of the agencies I've seen so far, so let's just ignore the last two for the time being. I'll make a small modification to the README and then close this for now.

spulec commented 10 years ago

Added with 0f2f40b95d0737f8240eddb8bdc1ee6d1a9e8a17

konklone commented 10 years ago

:+1: