Open afs opened 1 year ago
The WG discussed this issue during the telecon of 2023-12-14.
Coming back to this issue, another idea came to my mind: if I read Section 3.2.7.4 Order relation on dateTime of XML Schema Part 2 correctly, two dateTimes without timezone are compared as if they were in the same timezone (they share the same timeline).
Following the reasoning above, this is not really appropriate for RDF... Two independently produced dateTime values with no timezone may actually have been produced in two different locations. Wouldn't it be better to consider that timezone-less dateTimes are never equal (one might still be less than the other if the difference is >14h)?
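To make the XML Schema 1.0 rule concrete, here is a rough Python sketch of my reading of the order relation: values of the same kind (both with a timezone, or both without) are compared on one timeline, and a mixed pair is only ordered when the result holds for every offset from -14:00 to +14:00. The names `utc_bounds` and `xsd10_compare` are made up for illustration, and Python's naive/aware `datetime` stands in for a dateTime without/with a timezone.

```python
from datetime import datetime, timedelta, timezone

MAX_OFFSET = timedelta(hours=14)  # XSD bounds timezone offsets at +/-14:00


def utc_bounds(dt):
    """Return the (earliest, latest) UTC instants that dt could denote."""
    if dt.tzinfo is not None:
        instant = dt.astimezone(timezone.utc)
        return instant, instant
    as_utc = dt.replace(tzinfo=timezone.utc)
    # Read at +14:00 the local time is the earliest instant, at -14:00 the latest.
    return as_utc - MAX_OFFSET, as_utc + MAX_OFFSET


def xsd10_compare(p, q):
    """Return -1, 0 or 1, or None when the order is indeterminate."""
    if (p.tzinfo is None) == (q.tzinfo is None):
        # Both timezoned or both timezone-less: compared on one timeline.
        return (p > q) - (p < q)
    p_lo, p_hi = utc_bounds(p)
    q_lo, q_hi = utc_bounds(q)
    if p_hi < q_lo:
        return -1
    if p_lo > q_hi:
        return 1
    return None
```

With this sketch, two identical timezone-less values compare as equal, which is exactly the behaviour questioned above, while a timezone-less value against a timezoned one at the same clock time is indeterminate.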
That's a reference to XML Schema 1.0, which is the link in F&O 3.1 in some places. XML Schema 1.1 is different (see 2.2.3 Order in XML Schema 1.1) and is also referenced by F&O.
What is the status of XSD 1.0 and XSD 1.1? SPARQL 1.2 migrated to referencing 1.1 but looking at Functions and Operators 3.1 there are links to 1.0 as well.
Thoughts about the idea that timezone-less dateTimes are never equal:
xsd:dateTimes. One can argue it shouldn't but. We would be changing existing behaviour. A FILTER (?x < "2024-09-18T06:00:00"^^xsd:dateTime) is always false (error is EBV false).
xsd:date, which very rarely has a timezone.
< and sorting are connected but different because of non-comparability. There is a wish to have defined sorting. Having the sorting comparison and < differ may be confusing (even if technically a correct extension).
SPARQL uses Functions & Operators for comparison and ordering, not XSD.
op:date-less-than is a comparison of the starting instants on the timeline. I read that as meaning it is not the XSD-defined comparison directly (1.0 effectively ignores timezone on xsd:date). It is defined as a comparison of xsd:dateTime. "The starting instant of an xs:date is the xs:dateTime at time 00:00:00 on that date." That would be timezone sensitive.
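To illustrate why a starting-instant comparison is timezone sensitive, here is a small Python sketch. It is only my reading of the quoted sentence, not the F&O definition itself; `starting_instant` and `date_less_than` are hypothetical helpers, and the offset is passed explicitly where F&O would take it from the value or from the implicit timezone.

```python
from datetime import date, datetime, time, timedelta, timezone


def starting_instant(d, offset_hours):
    """The dateTime at 00:00:00 on date d, at the given UTC offset in hours."""
    tz = timezone(timedelta(hours=offset_hours))
    return datetime.combine(d, time(0, 0), tzinfo=tz)


def date_less_than(d1, offset1, d2, offset2):
    """Compare two dates by their starting instants on the timeline."""
    return starting_instant(d1, offset1) < starting_instant(d2, offset2)


# 2024-09-18+14:00 starts at 2024-09-17T10:00:00Z,
# 2024-09-17-12:00 starts at 2024-09-17T12:00:00Z.
later_calendar_date_is_less = date_less_than(
    date(2024, 9, 18), 14, date(2024, 9, 17), -12
)
```

Under these assumptions the calendar-later date sorts first, so the answer depends entirely on which timezone is attached to, or assumed for, each xsd:date.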
This network of specs is complicated. My reading may be wrong.
Conclusion: There is no perfect answer. The second-best option is a defined answer.
Somewhere between perfect and second-best would be a followup to the other standards and their respective organizations, toward better definitions and/or handling guidance that would benefit all specs that now depend on these incomplete standards.
If I understand it properly, XPath Functions & Operators states a well-defined behavior, which is:
Hence, I am not sure there is much room for improvements in XPath F&O.
If we want to stay close to XPath F&O this leaves us with three options imho (I might miss some others):
ORDER BY ordering but not for comparison operations
"Somewhere between perfect and second-best would be a followup to the other standards and their respective organizations, toward better definitions and/or handling guidance that would benefit all specs that now depend on these incomplete standards."
The organisation is W3C.
The specs are not vague - they are (IMO) complicated but they do define something. There is history there.
My interpretation of this:
If there is no timezone set, the literal becomes a time period with the extreme timezones as the bounds of that period. Then, logically:
My proposal would be to allow in SPARQL to select the bounds of something that will be interpreted as a period using special functions that could select the lower or upper bound from the period. This way the SPARQL query writer has full control of how they want to specifically interpret the time literals that will be interpreted as a period in the SPARQL engine.
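A rough Python sketch of what such bound-selecting functions could look like, assuming the extreme timezones are +14:00 and -14:00. The names `lower_bound`, `upper_bound` and `definitely_before` are hypothetical; nothing like them exists in SPARQL today.

```python
from datetime import datetime, timedelta, timezone

MAX_OFFSET = timedelta(hours=14)  # assumed extreme timezone offsets: +/-14:00


def lower_bound(dt):
    """Earliest instant the literal could denote (local time read at +14:00)."""
    if dt.tzinfo is not None:
        return dt.astimezone(timezone.utc)
    return dt.replace(tzinfo=timezone.utc) - MAX_OFFSET


def upper_bound(dt):
    """Latest instant the literal could denote (local time read at -14:00)."""
    if dt.tzinfo is not None:
        return dt.astimezone(timezone.utc)
    return dt.replace(tzinfo=timezone.utc) + MAX_OFFSET


def definitely_before(a, b):
    """True only if every reading of a precedes every reading of b."""
    return upper_bound(a) < lower_bound(b)
```

One consequence of this reading: two timezone-less values are only definitely ordered when they are more than 28 hours apart, because each of them carries its own 14 hours of uncertainty in either direction. A query writer who prefers a different interpretation would pick the bounds explicitly instead of calling something like `definitely_before`.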
Issue #86 updates SPARQL to reference F&O version 3.0.
One issue that results from this is the use of implicit timezones in comparisons and sorting, noted in https://github.com/w3c/sparql-query/issues/86#issuecomment-1566143387 and reproduced here:
In the RDF context, where data is on the web and can be drawn from multiple sources, there isn't a natural timezone, nor will it be the same as the request origin. Even a single data source, data collected over time, is affected because of DST.
For SPARQL, it is useful for sorting because it gives a total order.
For comparisons, the implicit timezone is less useful. RDF Concepts refers to XML Schema 1.1. The indeterminate comparison order at least does not give false information. I don't see much use of xsd:dateTimeStamp.
We could choose to say that there is no implicit timezone by default for comparison (i.e. XML Schema rules) and suggest its use for ordering. Maybe also say implementations MAY (RFC 2119) provide an implicit timezone with certain consequences.
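As a sketch of the "ordering only" half of that idea, in Python: an implicit timezone is applied to timezone-less values purely to build a sort key, so sorting is total while comparison could stay indeterminate. The choice of UTC as `IMPLICIT_TZ` and the helper name `sort_key` are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

IMPLICIT_TZ = timezone.utc  # assumption: the implicit timezone used for ordering


def sort_key(dt):
    """Place a timezone-less value on the timeline using the implicit timezone."""
    return dt if dt.tzinfo is not None else dt.replace(tzinfo=IMPLICIT_TZ)


values = [
    datetime(2024, 9, 18, 6, 0),  # no timezone
    datetime(2024, 9, 18, 6, 0, tzinfo=timezone(timedelta(hours=2))),  # 04:00Z
    datetime(2024, 9, 18, 5, 0, tzinfo=timezone.utc),  # 05:00Z
]

# Every value now has a position on the timeline, so the sort is total.
ordered = sorted(values, key=sort_key)
```

The key is used only for ordering; the original values, with or without a timezone, are what come back in the result.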
We'd need text about this, and it is a shame not to be able to just refer to F&O, but overall I think it's worth it.
If instead we choose to have a timezone, there ought to be only one: +00:00 (cf. cloud provider server clocks), for the same answers everywhere.