linked-art / linked.art

Development of a specification for linked data in museums, using existing ontologies and frameworks to build usable, understandable APIs
https://linked.art/

EDTF dates #218

Closed ajs6f closed 2 months ago

ajs6f commented 5 years ago

@marilenadaquino mentioned that EDTF is undergoing standardization. We should see to it that we are apprised of that process and when it completes we should examine the spec to see if it helps us express temporal uncertainty.

azaroth42 commented 5 years ago

Defer until there's a standard with broader support both in RDF and in implementations.

ajs6f commented 5 years ago

After as long as RDF has been around, do you think we're likely to get something that is a no-contest adoption choice? I'm wondering myself…

workergnome commented 5 years ago

I think there's likely a productive mapping between OWL:Time and EDTF, but I have not been particularly happy with the EDTF group--they're very tired of working on it, and they're also very secretive about what they're actually doing.

ajs6f commented 5 years ago

Perhaps we should close this specific ticket in favor of a more general ticket to discuss temporal description? To my mind, we have to work on that problem for any real interoperability.

azaroth42 commented 5 years ago

CIDOC-CRM specifies that the timestamps are xsd:dateTime values, and then there are the begin/end_of_the_begin/end properties for giving a timespan with a fuzzy beginning and end.

There would need to be a very convincing reason to start to contradict the spec. We can discuss the use cases for timestamps in a different ticket for sure!
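
As an illustration of the four bounds mentioned above, here is a minimal Python sketch that checks they are in chronological order. The property names follow the JSON keys used elsewhere in this thread; the function name `bounds_in_order` is hypothetical, and the check assumes positive-year UTC timestamps written with a trailing `Z`.

```python
from datetime import datetime

# Expected chronological order of the four fuzzy-boundary properties.
ORDER = [
    "begin_of_the_begin",
    "end_of_the_begin",
    "begin_of_the_end",
    "end_of_the_end",
]


def bounds_in_order(timespan: dict) -> bool:
    """Return True if the bounds present in `timespan` are non-decreasing."""
    stamps = [
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        datetime.fromisoformat(timespan[key].replace("Z", "+00:00"))
        for key in ORDER
        if key in timespan
    ]
    return all(a <= b for a, b in zip(stamps, stamps[1:]))


example = {
    "begin_of_the_begin": "1650-01-01T00:00:00.000Z",
    "end_of_the_begin": "2001-01-31T23:59:59.999Z",
    "begin_of_the_end": "2006-10-15T00:00:00.000Z",
    "end_of_the_end": "2006-10-15T23:59:59.999Z",
}
print(bounds_in_order(example))
```

This only checks ordering among the bounds that are present; it says nothing about whether they agree with any other date representation on the same timespan.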

beaudet commented 5 years ago

Some interesting points about BCE with XSD:datetimes here: https://stackoverflow.com/questions/47562736/should-xsddate-convert-for-datetimes-before-common-era-in-sparql

azaroth42 commented 5 years ago

Yep. Not good for archaeology, as the effective range is Jan 1st, 9999 BCE through Dec 31st, 9999 CE. However, given our scope is artwork, we would need some human-created art object from 10,000 BCE before we couldn't represent the date.

ajs6f commented 5 years ago

I'll be more worried about matching dates, but we would need actual data to really learn anything about that, so there's no need to worry about it yet.

ewg118 commented 5 years ago

It should be noted that while the XSD 1.1 datatype scheme may have sought to align BC dates more toward ISO 8601, in reality, software applications adhere to the 1.0 datatypes. Inserting an object with an ISO 8601/XSD 1.1 value for 1 BC (0000) results in an error in Fuseki. It's hard to say what other triplestores do. Adherence to 1.1 vs. 1.0 datatypes probably varies wildly across specific implementations. I don't even know if there's a difference between XSLT 2.0 and 3.0 in this regard. But I can tell you that JavaScript itself is based on ISO 8601, and there are very fundamental problems in JavaScript in doing date-based math for BC dates (e.g., plotting timelines in d3js or similar libraries).

-9999 being out of range is less of a problem for us than the very real scenario in which we model BC dates for objects (which are numerous in our museums) according to ISO 8601, but our software applications don't actually handle them appropriately, leading to confusion about the actual start and end dates.
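
To make the numbering mismatch concrete: ISO 8601 and XSD 1.1 use astronomical year numbering, where 0000 is 1 BC, while XSD 1.0 has no year zero and writes 1 BC as -0001. A small Python sketch covering only the year component; both function names are hypothetical.

```python
def bce_to_iso8601_year(bce_year: int) -> str:
    """Year string under ISO 8601 / XSD 1.1 numbering (1 BC is 0000)."""
    if bce_year < 1:
        raise ValueError("bce_year must be 1 or greater")
    offset = bce_year - 1  # 1 BC -> 0, 2 BC -> 1, ...
    return "0000" if offset == 0 else f"-{offset:04d}"


def bce_to_xsd10_year(bce_year: int) -> str:
    """Year string under XSD 1.0 numbering (no year zero, 1 BC is -0001)."""
    if bce_year < 1:
        raise ValueError("bce_year must be 1 or greater")
    return f"-{bce_year:04d}"


for year in (1, 2, 100):
    print(year, "BC:", bce_to_iso8601_year(year), "vs", bce_to_xsd10_year(year))
```

The same lexical value therefore names different years depending on which version a tool implements, which is the confusion about start and end dates described above.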

ewg118 commented 5 years ago

EDTF would need to be taken up by a W3C group and integrated into XSD, in my opinion. It can't be accommodated in our data because our software systems aren't designed for it.

azaroth42 commented 5 years ago

Propose close, no possible change that we can effect in practice.

azaroth42 commented 5 years ago

Defer for 2 weeks, pending proposal from @workergnome.

ewg118 commented 5 years ago

I think we should encourage users to map EDTF ranges into actionable RDF datatypes as best as possible so that date-based math and sorting can be performed on data rather than inserting amorphous literals. This means you may want to reserve the CRM properties for dates (begin of the begin, end of the end, etc.) for machine-readable dates and implement a different property entirely for EDTF dates. dcterms:date? DCTerms says the following about date: "Recommended best practice is to use an encoding scheme, such as the W3CDTF profile of ISO 8601 [W3CDTF]." EDTF is a standard encoding scheme, just not necessarily a machine-actionable one within a LOD environment.
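
A rough Python sketch of that kind of mapping, assuming only the simplest EDTF-style shapes (`YYYY`, `YYYY-MM`, `YYYY-MM-DD`, with `XX` for an unspecified month or day) and positive four-digit years. The helper name `edtf_bounds` is hypothetical and this is not a full EDTF parser.

```python
import calendar
from typing import Tuple


def edtf_bounds(value: str) -> Tuple[str, str]:
    """Expand a simple EDTF-style date into earliest and latest timestamps."""
    parts = value.split("-")
    year = int(parts[0])
    month = parts[1] if len(parts) > 1 else "XX"
    day = parts[2] if len(parts) > 2 else "XX"

    if month == "XX":
        # Unknown month: span the whole year and ignore any day value.
        first_month, last_month = 1, 12
        day = "XX"
    else:
        first_month = last_month = int(month)

    if day == "XX":
        first_day = 1
        # monthrange returns (weekday of the 1st, number of days in the month).
        last_day = calendar.monthrange(year, last_month)[1]
    else:
        first_day = last_day = int(day)

    begin = f"{year:04d}-{first_month:02d}-{first_day:02d}T00:00:00Z"
    end = f"{year:04d}-{last_month:02d}-{last_day:02d}T23:59:59Z"
    return begin, end


print(edtf_bounds("1650-XX-XX"))
print(edtf_bounds("2001-01-XX"))
```

The resulting pair can be sorted and compared like any other xsd:dateTime values, while the original EDTF string is kept in a separate property.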

ewg118 commented 5 years ago

https://github.com/schemaorg/schemaorg/issues/1365 seems somewhat relevant.

workergnome commented 5 years ago

Looking at EDTF, it looks like it is being subsumed into ISO 8601-2019, which means that this date format is likely to see wider adoption over the next several years. It will not supersede XSD:Date in the short term, so I agree that we should not let this take over the useful begin_of_the_begin et al. properties. But there is still benefit to having properties that allow us more expressivity in dates, particularly in a standardized, machine-readable format.

My proposal is that we use the two existing CIDOC properties, P81 ongoing throughout and P82 at some time within, mapped to throughout and sometime_within, and that these two properties be allowed to take string values. We could then either use RDF typing or use conforms_to to indicate that the dates are in EDTF format, linked to http://www.loc.gov/standards/datetime until such time as there is a formal ISO URL for them, mapped to edtf.

This could look like:

{
  "type": "Activity",
  "timespan": {
    "type": "TimeSpan",
    "_label": "sometime between 1650 and January 2001 until October 15, 2006",
    "sometime_within": {"value": "1650-XX-XX/2006-10-15", "type": "edtf"},
    "throughout": {"value": "2001-01-XX/2006-10-15", "type": "edtf"}
  }
}

or

{
  "type": "Activity",
  "timespan": {
    "type": "TimeSpan",
    "_label": "sometime between 1650 and January 2001 until October 15, 2006",
    "sometime_within": "1650-XX-XX/2006-10-15",
    "throughout": "2001-01-XX/2006-10-15",
    "conforms_to": "edtf"
  }
}

Using intervals for both of these dates means that we can capture all four dates, while reserving the other properties for XSD.

I loosely prefer the first syntax, since it means that we could use both of our date formats together. It also means that we could use other date formats here without prescribing anything explicit.

{
  "type": "Activity",
  "timespan": {
    "type": "TimeSpan",
    "_label": "sometime between 1650 and January 2001 until October 15, 2006",
    "sometime_within": {"value": "1650-XX-XX/2006-10-15", "type": "edtf"},
    "throughout": {"value": "2001-01-XX/2006-10-15", "type": "edtf"},
    "begin_of_the_begin": "1650-01-01T00:00:00.000Z",
    "end_of_the_begin": "2001-01-31T23:59:59.999Z",
    "begin_of_the_end": "2006-10-15T00:00:00.000Z",
    "end_of_the_end": "2006-10-15T23:59:59.999Z"
  }
}

workergnome commented 5 years ago

It looks like the interval syntax in schema, brought up by Ethan, is the same syntax as the new EDTF. You can see that in the "differences" section at http://www.loc.gov/standards/datetime/edtf.html.

azaroth42 commented 5 years ago

Could you further describe the value of this? It seems like duplication of the information in the existing properties, which would then simply introduce inconsistencies. Is the point that the XXs make it more obvious that the values are unknown, rather than the false precision of the ranges from the a/b properties?

workergnome commented 5 years ago

Much of the benefit is that EDTF is more expressive than xsd:dateTime. It lets you specify things such as uncertainty, approximation, and varying precisions. That's less clear in the case above, but it lets you say things like:

"timespan": {
      "type": "TimeSpan",
      "_label": "circa the 16th century until sometime between the 1950s and January 1980?",
      "sometime_within": {"value": "15~/1980-01?", "type": "edtf"},
      "throughout": {"value": "15~/195", "type": "edtf"},
      "begin_of_the_begin": "1500-01-01T00:00:00.000Z",
      "end_of_the_begin": "1599-12-31T23:59:59.999Z",
      "begin_of_the_end": "1950-01-01T00:00:00.000Z",
      "end_of_the_end": "1980-01-31T23:59:59.999Z",
      "id": "#timespan"
    }

The xsd:dateTime is still useful, particularly for computation, but it loses both the circa and the uncertainty of the end date, as well as the level of precision that each section of that date has.
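
To show what gets dropped, here is a minimal Python sketch that separates a trailing EDTF qualifier from the date it applies to. EDTF uses `?` for uncertain, `~` for approximate, and `%` for both; the function name `split_qualifier` is hypothetical, and only a single trailing qualifier on one endpoint is handled.

```python
# Trailing qualifier -> (uncertain, approximate)
QUALIFIERS = {
    "?": (True, False),
    "~": (False, True),
    "%": (True, True),
}


def split_qualifier(value: str) -> dict:
    """Split one EDTF endpoint into its bare date and qualifier flags."""
    if value and value[-1] in QUALIFIERS:
        uncertain, approximate = QUALIFIERS[value[-1]]
        return {
            "date": value[:-1],
            "uncertain": uncertain,
            "approximate": approximate,
        }
    return {"date": value, "uncertain": False, "approximate": False}


# The two endpoints of the "sometime_within" interval above.
for endpoint in "15~/1980-01?".split("/"):
    print(split_qualifier(endpoint))
```

The two flags are exactly the information that has no place in the four xsd:dateTime bounds.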

azaroth42 commented 4 years ago

Propose close:

aisaac commented 4 years ago

@ewg118 notes that some publishers do have EDTF data.

aisaac commented 4 years ago

WG discussion 28-1-2020: we agree it is better to wait and see

azaroth42 commented 2 months ago

Waited ... no one has spoken of it again. Closing.