okfn / ckanext-lacounts

CKAN extension for the LA Counts project
GNU Affero General Public License v3.0

modified date metadata field reflects harvest date (not date last modified by publisher/portal) #206

Closed emillipede closed 5 years ago

emillipede commented 5 years ago

For example, in this dataset https://www.lacounts.org/dataset/tsunami-inundation-zones the last modified date is Tuesday 26 March. On the portal site, there's no record of an update: http://geohub.lacity.org/datasets/ffaf33ba67264818a729dc97a384c064_6

emillipede commented 5 years ago

please clarify what "modified date" reflects and if there was a modification to this dataset on/around 26 March

thank you!

amercader commented 5 years ago

@emillipede The "Modified" date in the Metadata section reflects the modification date advertised by the publisher in the metadata we harvest automatically.

For the Tsunami Inundation Zones dataset, this is what it looks like in the JSON-LD dump that we harvest:

    {
      "@type": "dcat:Dataset",
      "identifier": "http://geohub.lacity.org/datasets/ffaf33ba67264818a729dc97a384c064_6",
      "title": "Tsunami Inundation Zones",
      "description": "Area modelled to be inundated by a tsunami",
      "keyword": [
        "Los Angeles",
        "LA",
        "County of LA",
        "hazards",
        "safety",
        "earthquake hazards",
        "tsunami hazards",
        "boundaries",
        "a safe city"
      ],
      "issued": "2015-11-17T01:48:09.000Z",
      "modified": "2019-03-26T16:14:59.928Z",
      "publisher": {
        "name": "City of Los Angeles Hub"
      },
      "contactPoint": {
        "@type": "vcard:Contact",
        "fn": "Kirk Bishop",
        "hasEmail": "mailto:mayor.opendata@lacity.org"
      },
      "accessLevel": "public",

      [...]

      "landingPage": "http://geohub.lacity.org/datasets/ffaf33ba67264818a729dc97a384c064_6",
      "webService": "http://public.gis.lacounty.gov/public/rest/services/LACounty_Dynamic/Hazards/MapServer/6",
      "license": "https://hub.arcgis.com/api/v2/datasets/ffaf33ba67264818a729dc97a384c064_6/license",
      "spatial": "-118.944802415,33.7001774114,-118.088643462,34.0499505044",
      "theme": [
        "geospatial"
      ]
    }

Note the "modified" key. I know that this is not consistent with the date shown on the GeoHub page, and that it probably reflects some edit made to the GeoHub metadata, but we have no way of knowing this when we store it on our side.

So to sum up, "Modified" stores the last update date that the publisher provides to us in the metadata.
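To make the mapping concrete, here is a minimal sketch of reading that value from a harvested record. It is not the extension's actual harvesting code: the function name `publisher_modified` and the trimmed-down record are assumptions for illustration, and the timestamp format is assumed to match the one in the dump above.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the harvested record shown above (only the date fields kept).
RECORD_JSON = """
{
  "@type": "dcat:Dataset",
  "title": "Tsunami Inundation Zones",
  "issued": "2015-11-17T01:48:09.000Z",
  "modified": "2019-03-26T16:14:59.928Z"
}
"""


def publisher_modified(record):
    """Return the publisher-supplied "modified" timestamp as an aware UTC datetime.

    Hypothetical helper: it only parses what the publisher advertises and
    cannot tell whether the data or just the metadata was edited.
    """
    value = record.get("modified")
    if value is None:
        return None
    # Assumes the "YYYY-MM-DDTHH:MM:SS.fffZ" form used in the dump above.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


record = json.loads(RECORD_JSON)
modified = publisher_modified(record)
print(modified.date().isoformat())
```

For this record the parsed date is 26 March 2019, which is the date displayed on the LA Counts dataset page.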

emillipede commented 5 years ago

Thank you for investigating, @amercader, good to know. I'll follow up with @carlacasilli to see if she'd like to address this possibility in the site FAQ.

carlacasilli commented 5 years ago

Thanks for this info, @amercader and @emillipede. Yes, Emily, please add an explanation to the Datasets portion of the FAQs. Thanks!