sebferre / sparklis

Sparklis is a query builder in natural language that allows people to explore and query SPARQL endpoints with all the power of SPARQL and without any knowledge of SPARQL.
Apache License 2.0

Support year filters #7

Closed kad-beekw closed 3 years ago

kad-beekw commented 3 years ago

Observed

When a dataset contains xsd:gYear values, the current filtering approach in Sparklis does not work. When we try to filter for buildings between 1900 and 1910, Sparklis uses the following filter:

FILTER ( (  xsd:double(?thing_38) >= 1900
         && xsd:double(?thing_38) <= 1910 ) )

^ Unfortunately, this filter is not allowed in SPARQL 1.1, because date/time (including year) values are not allowed to be cast to doubles.

Expected

Filters over values with datatype IRI xsd:gYear should work in Sparklis.

Possible solutions

Possible solution 1

The following does work in SPARQL 1.1:

FILTER ( (  xsd:double(year(xsd:dateTime(?thing_38))) >= 1900
         && xsd:double(year(xsd:dateTime(?thing_38))) <= 1910 ) )

^ Notice that we need to do the following:

  1. We must cast xsd:gYear values to xsd:dateTime. It is not clear from the SPARQL standard whether this is required, but the endpoints that we use seem to require it. (The SPARQL standard only explicitly mentions the case for xsd:dateTime.)
  2. From an xsd:dateTime we are allowed to extract the year component as an integer, using the year() function in SPARQL 1.1.
  3. Integers are allowed to be cast to doubles.

Possible solution 2

According to the SPARQL 1.1 standard, the following filter should also work. However, this does not seem to be supported by all endpoints. I tested this with Virtuoso, where it did not work, and with Jena, where it did.

filter(?year >= "1900"^^xsd:gYear && ?year <= "1910"^^xsd:gYear)

sebferre commented 3 years ago

I am surprised that conversion from xsd:gYear to xsd:double is not supported given that a year is an integer, which is a double. It is supported on Fuseki endpoints for instance.

The implicit conversion is necessary because it is common to have numeric values represented (wrongly) as simple literals. The simplest solution will probably be to offer the lexical comparison operators (after, before, from ... to ...), which corresponds to your Possible solution 2.
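
As a rough illustration of what a lexical "from ... to ..." comparison could look like, here is a self-contained sketch. It assumes the comparison is done on the string form of the value via str(); that is an assumption for illustration, not necessarily the exact filter Sparklis generates. The three sample years in VALUES are made up.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?thing WHERE {
  # Made-up sample data, inlined so the query does not depend on any dataset.
  VALUES ?thing { "1895"^^xsd:gYear "1905"^^xsd:gYear "1920"^^xsd:gYear }

  # Compare the lexical forms as plain strings (assumed approach).
  # No cast from xsd:gYear to a numeric type is involved.
  FILTER ( str(?thing) >= "1900" && str(?thing) <= "1910" )
}
```

On a standard-compliant endpoint this should keep only the 1905 value. String ordering matches chronological ordering here only because all years have four digits; negative or longer years would need extra care.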

sebferre commented 3 years ago

Finally, I found a simpler and more generic version of your Possible solution 1: xsd:double(str(?thing)) converts ?thing to a number, whatever the datatype. Note that the comparison filters are only available when there are literals that can be parsed as numbers.

Here is a query on your endpoint showing the fix: permalink
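
For readers without access to that endpoint, a minimal self-contained sketch of the same cast follows. It inlines made-up sample years with VALUES instead of querying real data; the variable name ?thing and the 1900 to 1910 range are taken from the filter in the original report.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?thing WHERE {
  # Made-up sample data, inlined so the query does not depend on any dataset.
  VALUES ?thing { "1895"^^xsd:gYear "1905"^^xsd:gYear "1920"^^xsd:gYear }

  # str() yields the lexical form ("1905"), which xsd:double() can parse,
  # so no direct cast from xsd:gYear to xsd:double is needed.
  FILTER ( (  xsd:double(str(?thing)) >= 1900
           && xsd:double(str(?thing)) <= 1910 ) )
}
```

This should return only the 1905 value. A literal whose lexical form cannot be parsed as a number makes the cast raise an error, so the filter simply rejects that row.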