clarin-eric / VLO

Virtual Language Observatory
GNU General Public License v3.0

Improved content of facet TemporalCoverage #115

Open teckart opened 6 years ago

teckart commented 6 years ago

Relevant topics include inspecting the content of current instances, supporting popular notations (e.g. "1940-1945"), and mapping values consistently to internal formats (probably ODRF + W3C DateTime).
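To make the mapping idea concrete, here is a minimal sketch that normalizes a year or a year range such as "1940-1945" into a pair of ISO 8601 dates, which are valid W3C DateTime values. The function name `parse_year_range` and the choice to expand a year to 1 January through 31 December are assumptions for illustration, not something VLO implements.

```python
import re
from datetime import date

# Accepts "1940-1945" or "1940/1945" (range) and "1888" (single year).
_YEAR_RANGE = re.compile(r"^\s*(\d{4})\s*[-/]\s*(\d{4})\s*$")
_SINGLE_YEAR = re.compile(r"^\s*(\d{4})\s*$")


def parse_year_range(value):
    """Return (start, end) as ISO 8601 date strings, or None if unparseable."""
    match = _YEAR_RANGE.match(value)
    if match:
        first, last = int(match.group(1)), int(match.group(2))
    else:
        match = _SINGLE_YEAR.match(value)
        if not match:
            return None
        first = last = int(match.group(1))
    # Reject reversed ranges and year 0, which datetime.date cannot represent.
    if first < 1 or first > last:
        return None
    return date(first, 1, 1).isoformat(), date(last, 12, 31).isoformat()


print(parse_year_range("1940-1945"))
```

Values that do not match either pattern return `None`, so they can be logged and inspected rather than silently mapped.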

twagoo commented 6 years ago

Note: Europeana provides time span information in its EDM metadata (which we convert to CMDI) in the form of semium.org/time identifiers (from the seemingly defunct AnnoCultor project) for the edm:TimeSpan type. The scheme appears to be systematic as well as flexible, but its documentation is no longer online (it can still be found via archive.org). We could check with Europeana to find out more about its current status.

An example (EDM converted to CMDI):

<dc-date>
 <edm-TimeSpan rdf-about="http://semium.org/time/1888">
  <skos-prefLabel>1888</skos-prefLabel>
  <dcterms-isPartOf rdf-resource="http://semium.org/time/18xx_4_quarter"/>
  <edm-begin>Sun Jan 01 01:00:00 CET 1888</edm-begin>
  <edm-end>Mon Dec 31 01:00:00 CET 1888</edm-end>
 </edm-TimeSpan>
</dc-date>
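
As a rough sketch of how the begin and end values in such a record could be read, the snippet below parses the sample with Python's standard library. The helper names are hypothetical. It assumes the dates always follow the seven-token layout shown above, and it deliberately ignores the time-of-day and time zone tokens, keeping only the calendar date as written.

```python
import xml.etree.ElementTree as ET
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

SAMPLE = """
<dc-date>
 <edm-TimeSpan rdf-about="http://semium.org/time/1888">
  <skos-prefLabel>1888</skos-prefLabel>
  <dcterms-isPartOf rdf-resource="http://semium.org/time/18xx_4_quarter"/>
  <edm-begin>Sun Jan 01 01:00:00 CET 1888</edm-begin>
  <edm-end>Mon Dec 31 01:00:00 CET 1888</edm-end>
 </edm-TimeSpan>
</dc-date>
"""


def parse_converted_date(text):
    """Parse e.g. 'Sun Jan 01 01:00:00 CET 1888' into a date (zone ignored)."""
    _weekday, month, day, _time, _zone, year = text.split()
    return date(int(year), MONTHS[month], int(day))


def time_span_bounds(xml_text):
    """Return a list of (begin, end) dates, one per edm-TimeSpan element."""
    root = ET.fromstring(xml_text)
    spans = []
    for span in root.iter("edm-TimeSpan"):
        begin = span.findtext("edm-begin")
        end = span.findtext("edm-end")
        if begin and end:
            spans.append((parse_converted_date(begin), parse_converted_date(end)))
    return spans


print(time_span_bounds(SAMPLE))
```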

twagoo commented 6 years ago

Zipped RDF/SKOS for this ontology: time-1.0.1.zip

This dataset is based on DBPedia data, retrieved from http://wiki.dbpedia.org/ on 01 August 2011.
The original data is distributed under the Creative Commons Attribution-ShareAlike 3.0 license, 
http://creativecommons.org/licenses/by-sa/3.0/

This package is derivative work, a result of data selection and processing.
It is distributed here under the same Creative Commons License, 
http://creativecommons.org/licenses/by-sa/3.0/

Also see https://github.com/europeana/tools/tree/master/annocultor_solr4/converters/vocabularies/time

teckart commented 4 years ago

As there is no real standard in the CMDI world for encoding time spans yet, a pragmatic approach might be to extract all temporal information and store only the extrema (earliest start and latest end) for the whole instance. This would be robust regarding alignment problems (for complex components or in case of multiple time spans per record), but it would also merge multiple discontinuous time spans into one larger time span that includes the gaps between them.
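
A minimal sketch of the extrema idea, assuming the temporal values of a record have already been parsed into (start, end) date pairs. The function name `coverage_extrema` is hypothetical.

```python
from datetime import date


def coverage_extrema(spans):
    """Collapse (start, end) pairs into one (earliest start, latest end) pair."""
    if not spans:
        return None
    earliest = min(start for start, _end in spans)
    latest = max(end for _start, end in spans)
    return earliest, latest


# Two discontinuous spans in one record: 1900-1910 and 1950-1960.
record_spans = [
    (date(1950, 1, 1), date(1960, 12, 31)),
    (date(1900, 1, 1), date(1910, 12, 31)),
]
print(coverage_extrema(record_spans))
```

In this example the result covers 1900 through 1960, so the uncovered years 1911 to 1949 end up inside the stored range. That is the trade-off described above.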

twagoo commented 4 years ago

For the record, I have created a branch issue115-twan that 1) is rebased onto the current development branch and 2) adds two fields for the start/end values of the temporal coverage date ranges. The idea is that, in the end, these can be used to get 'stats' and to use the min/max values to set the bounds of the date range selector. I don't have time to implement the actual logic to populate the fields, but maybe the branch can help as a stub for front end development.
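
For front end experiments, the 'stats' lookup could look roughly like the sketch below. It assumes the statistics come from Solr's stats component (`stats=true` with `stats.field`), and the field names `temporalCoverageStart` and `temporalCoverageEnd` are placeholders, not the names used in the branch. The response is a hand-written sample in the shape that component returns, so no server is needed to run it.

```python
from urllib.parse import urlencode

# Placeholder field names; the actual names in the branch may differ.
START_FIELD = "temporalCoverageStart"
END_FIELD = "temporalCoverageEnd"


def stats_query():
    """Build the query string for a stats-only Solr request (no documents)."""
    return urlencode([
        ("q", "*:*"),
        ("rows", "0"),
        ("stats", "true"),
        ("stats.field", START_FIELD),
        ("stats.field", END_FIELD),
    ])


def selector_bounds(response):
    """Take the overall lower/upper bound from a stats response dictionary."""
    fields = response["stats"]["stats_fields"]
    return fields[START_FIELD]["min"], fields[END_FIELD]["max"]


# Hand-written sample response, only for illustration.
sample_response = {
    "stats": {
        "stats_fields": {
            START_FIELD: {"min": "1888-01-01T00:00:00Z", "max": "1990-01-01T00:00:00Z"},
            END_FIELD: {"min": "1888-12-31T00:00:00Z", "max": "2015-12-31T00:00:00Z"},
        }
    }
}

print(stats_query())
print(selector_bounds(sample_response))
```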