Esri / geoportal-server

Geoportal Server is a standards-based, open source product that enables discovery and use of geospatial resources including data and services.
https://gptogc.esri.com/geoportal
Apache License 2.0

Indeterminate date produces odd output in DCAT #254

Closed torrin47 closed 7 years ago

torrin47 commented 7 years ago

We implemented 1.2.7 in part to address issue #224 and are now seeing great results with determinate dates in our dcat output: https://edg.epa.gov/metadata/rest/find/document?f=dcat&searchText=timeperiod.meta%3Aisdeterminate

and unknown and invalid dates are appropriately being suppressed. But we have one example of an indeterminate date (2004-present) which is not being handled appropriately: "2004-01-01/292269055-12-02" https://edg.epa.gov/metadata/rest/find/document?f=dcat&searchText=timeperiod.meta%3Aisindeterminate

Thoughts on how to address this?

pandzel-zz commented 7 years ago

The "292269055-12-02" value is the result of translating "present" into a value that can be stored in the Lucene index. That trick may be fine for searching metadata.

The real issue is how to express "present" in DCAT terms or, more accurately, how to express "present" as an ISO 8601 date (292269055-12-02 clearly is not a valid ISO date). It turns out there is no way to do that, so I decided to discard any temporal that has "present" as one of its interval values. The code is already on GitHub.
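As a rough illustration of the discard behaviour described above (not the actual Geoportal Server code, which is Java), the sketch below drops a temporal interval whenever either end point is not a parseable calendar date. The function names `is_iso_date` and `dcat_temporal` are hypothetical, and Python's `date.fromisoformat` stands in for whatever validation the real fix uses.

```python
from datetime import date
from typing import Optional


def is_iso_date(value: str) -> bool:
    """Return True if Python's date.fromisoformat accepts the value."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def dcat_temporal(start: str, end: str) -> Optional[str]:
    """Build a 'start/end' interval, or return None to suppress the field.

    Hypothetical helper: any end point that is not a valid date (for
    example "present" or the index sentinel) causes the whole temporal
    to be discarded.
    """
    if is_iso_date(start) and is_iso_date(end):
        return f"{start}/{end}"
    return None
```

With this sketch, `dcat_temporal("2004-01-01", "present")` and `dcat_temporal("2004-01-01", "292269055-12-02")` both return `None`, so the field would simply be left out of the DCAT record.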

torrin47 commented 7 years ago

Definitely an amusing translation, and thanks for the quick fix. Discarding it should be fine, but how's this for consideration? If the value is legitimately "Present" (and we assume the data actually is being updated in close to real time), couldn't we translate "Present" into a datestamp of "Now", equivalent to the instant the DCAT file is generated?

mhogeweg commented 7 years ago

I think that would not be the right thing to do. Remember that Data.gov (or other sites) harvest the metadata, and they would then see an arbitrary date in the metadata. Leaving this blank appears better.

pandzel-zz commented 7 years ago

To the best of my understanding, this all comes down to the standard lacking a precise definition. Instead of leaving this blank as Marten suggested, I would rather talk to the DCAT people about the issue; perhaps they will come up with a better notation next time. Besides, a similar or related issue has already been discussed here: https://github.com/project-open-data/project-open-data.github.io/issues/415

torrin47 commented 7 years ago

I agree their proposed solution is awkward - repeating interval as time period? I prefer the first solution they mentioned in passing...

What you can do with 8601 is specify an ambiguous end date (eg 2005-05-30/2016 where you just know the end date is in 2016) or you can use repeating intervals.

@mhogeweg, does that offend your sensibilities less than the specific harvest date?

mhogeweg commented 7 years ago

Torrin will submit a pull request with the EPA modification. We will then include that as a configurable choice for how to deal with 'present' date values.
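The thread does not say what the EPA modification looks like, so the following is only a sketch of how a configurable choice between the three options discussed above might be expressed. The function `resolve_present` and the policy names `"omit"`, `"now"`, and `"year"` are hypothetical, not actual Geoportal Server settings.

```python
from datetime import date
from typing import Optional


def resolve_present(start: str, policy: str, generated_on: date) -> Optional[str]:
    """Render an interval whose end is 'present' under a chosen policy.

    Hypothetical policies, one per option raised in the discussion:
      "omit": suppress the temporal entirely
      "now":  end the interval on the date the DCAT file is generated
      "year": use a reduced-precision end date (year only)
    """
    if policy == "omit":
        return None
    if policy == "now":
        return f"{start}/{generated_on.isoformat()}"
    if policy == "year":
        return f"{start}/{generated_on.year:04d}"
    raise ValueError(f"unknown policy: {policy!r}")
```

For a file generated on 15 March 2017, `"now"` would yield `2004-01-01/2017-03-15` and `"year"` would yield `2004-01-01/2017`, while `"omit"` matches the behaviour of the original fix.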