unitedstates / inspectors-general

Collecting reports from Inspectors General across the US federal government.
https://sunlightfoundation.com/blog/2014/11/07/opengov-voices-opening-up-government-reports-through-teamwork-and-open-data/
Creative Commons Zero v1.0 Universal

Empty strings for PDF metadata #93

Closed (konklone closed this issue 10 years ago)

konklone commented 10 years ago

It's not clear to me how it could end up with empty strings, but this came in:

{
  "agency": "education",
  "agency_name": "Department of Education",
  "file_type": "pdf",
  "inspector": "education",
  "inspector_url": "https://www2.ed.gov/about/offices/list/oig/",
  "pdf": {
    "creation_date": "",
    "modification_date": "",
    "page_count": 10,
    "title": "Audit B19O0003 -  OIG's Independent Report on the Department's De.."
  },
  "published_on": "2014-01-03",
  "report_id": "B1900003",
  "title": "OIG's Independent Report on the Department's Detailed Accounting of...",
  "type": "report",
  "url": "https://www2.ed.gov/about/offices/list/oig/auditreports/fy2014/b19o00...",
  "year": 2014
}

It would be better for these fields to be null or omitted entirely. The empty strings crashed my Elasticsearch importer because of schema issues.

konklone commented 10 years ago

Fixed by https://github.com/unitedstates/inspectors-general/commit/31da59e461b0a6ce2dd700d84a69a13a50fd0358.
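
For readers hitting the same problem, here is a minimal sketch of one way to drop empty-string fields before a report is written out. It is not the code from the commit above; the function name `clean_pdf_metadata` is hypothetical, and the sample dict only mirrors the `pdf` block from the report in this issue.

```python
def clean_pdf_metadata(metadata):
    """Return a copy of a PDF metadata dict without empty-string values.

    Hypothetical helper for illustration; not taken from the scraper's code.
    """
    cleaned = {}
    for key, value in metadata.items():
        # Treat empty or whitespace-only strings as missing and omit the key.
        if isinstance(value, str) and value.strip() == "":
            continue
        cleaned[key] = value
    return cleaned


# Sample input mirroring the "pdf" block reported in this issue.
pdf = {
    "creation_date": "",
    "modification_date": "",
    "page_count": 10,
    "title": "Audit B19O0003 -  OIG's Independent Report on the Department's De..",
}

print(clean_pdf_metadata(pdf))
```

Omitting the keys, rather than setting them to null, keeps the output smaller and avoids sending a value that a date-typed field cannot parse. Either approach satisfies the request in this issue.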