Sharing queries and custom functions

JervenBolleman commented 4 years ago

Why?

SPARQL allows for custom queries. However, they are not easy to port from one implementation to another, or even to ship from one instance to another. At the same time, we would like to reuse shared query fragments to reduce repetition when composing queries.

Previous work

stored query service
SPIN, in declaring variables
SHACL-AF

Proposed solution

Create a shareable schema that can be used to describe SPARQL queries and functions, and how their arguments are named.

ex:KtoDegC a sparql:function ;
  sparql:var [ sparql:name "Kelvin" ;
               sparql:type xsd:decimal ;
               sparql:varOrder 1 ] ;
  sparql:sparql """SELECT ?Celsius WHERE { BIND(?Kelvin - 273.15 AS ?Celsius) }""" ;
  sparql:exec [ sparql:execLang "java" ;
                sparql:execCode """(BigDecimal kelvin) -> kelvin.subtract(new BigDecimal("273.15"))"""
  ] ;
  sparql:exec [ sparql:execLang "ruby" ;
                sparql:execCode """|kelvin| kelvin - 273.15"""
  ] ;
  sparql:result [ sparql:quantity sparql:single ;
                  sparql:var [ sparql:type xsd:decimal ;
                               sparql:name "Celsius" ;
                               sparql:varOrder 1 ] ] .

Not every function will be expressible in SPARQL, Java, or Ruby, but this illustrates the idea.

The sparql:result block is important because it qualifies what comes out of a function: whether one input yields a single output or multiple outputs. This matters, for example, when a single value is converted into two or more variables (e.g. splitting a string into two values). It also tells the SPARQL endpoint how many bindings to expect, which matters because we want to use the same general idea for sharing queries.
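To make the role of the result block concrete, here is a minimal Python sketch of how an engine might check what comes back against the declared quantity and output variables. The names Quantity, ResultSpec, and check_result are hypothetical, as is the string-splitting function; only the single/multiple distinction and the variable names come from the proposal.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Quantity(Enum):
    SINGLE = "single"      # stands in for sparql:single
    MULTIPLE = "multiple"  # stands in for sparql:multiple


@dataclass
class ResultSpec:
    quantity: Quantity
    var_names: List[str]   # output variables, listed in varOrder


def check_result(spec: ResultSpec, rows: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Reject results that do not match the declared result block."""
    if spec.quantity is Quantity.SINGLE and len(rows) != 1:
        raise ValueError(f"expected exactly one solution, got {len(rows)}")
    for row in rows:
        if set(row) != set(spec.var_names):
            raise ValueError(f"unexpected variables: {sorted(row)}")
    return rows


# ex:KtoDegC: one input, one solution with one variable.
k_to_deg_c = ResultSpec(Quantity.SINGLE, ["Celsius"])
single = check_result(k_to_deg_c, [{"Celsius": 26.85}])

# A hypothetical string splitter: one solution, but two output variables.
split_name = ResultSpec(Quantity.SINGLE, ["first", "second"])
split = check_result(split_name, [{"first": "Jerven", "second": "Bolleman"}])

# ex:getUsersByName: any number of solutions, each binding userIri.
get_users = ResultSpec(Quantity.MULTIPLE, ["userIri"])
many = check_result(get_users, [{"userIri": "http://example.org/u1"},
                                {"userIri": "http://example.org/u2"}])
```

The sketch treats "quantity" as the number of solutions and the variable list as the shape of each solution, which is one possible reading of the paragraph above.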

ex:getUsersByName a sparql:Query ;
    sparql:var [ sparql:name "personalName" ;
                 sparql:type xsd:string ;
                 sparql:varOrder 1 ] ,
               [ sparql:name "familyName" ;
                 sparql:type xsd:string ;
                 sparql:varOrder 2 ] ;
    sparql:select """PREFIX ex:<...> SELECT ?user WHERE {?user a ex:User ; rdfs:label ?userName }""" ;
    sparql:result [ sparql:quantity sparql:multiple ;
                    sparql:var [ sparql:type xsd:anyURI ;
                                 sparql:name "userIri" ;
                                 sparql:varOrder 1 ] ] .

This can then be used in queries, using the following strawman.

SELECT ?degC 
WHERE {
   ?x a ex:tempMeasurement ; ex:tempInKelvin ?k .
   VALUES ?degC USING ex:KtoDegC(?k) . #binding using varOrder
}

or

SELECT ?user 
WHERE {
   VALUES ?user USING ex:getUsersByName("Jerven", "Bolleman") .  #binding using varOrder
   ?user a ex:employee .
}

etc.
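The positional binding in the strawman can be pictured as sorting the declared variables by sparql:varOrder and pairing them with the call arguments. A minimal Python sketch, assuming the declarations of ex:getUsersByName have already been read into (name, varOrder) pairs; the helper names bind_by_var_order and to_values_clause are hypothetical, and rendering the result as a SPARQL 1.1 VALUES block is only one way an engine might apply the bindings.

```python
from typing import Dict, List, Sequence, Tuple

# (name, varOrder) pairs as declared on ex:getUsersByName, deliberately unsorted.
GET_USERS_BY_NAME_VARS: List[Tuple[str, int]] = [
    ("familyName", 2),
    ("personalName", 1),
]


def bind_by_var_order(declared: Sequence[Tuple[str, int]], args: Sequence[str]) -> Dict[str, str]:
    """Map positional call arguments onto the declared variable names."""
    ordered = sorted(declared, key=lambda pair: pair[1])
    if len(ordered) != len(args):
        raise ValueError(f"expected {len(ordered)} arguments, got {len(args)}")
    return {name: arg for (name, _), arg in zip(ordered, args)}


def to_values_clause(bindings: Dict[str, str]) -> str:
    """Render the bindings as a VALUES block of plain string literals."""
    names = " ".join(f"?{name}" for name in bindings)
    # Only backslashes and double quotes are escaped; enough for this sketch.
    values = " ".join(
        '"' + v.replace("\\", "\\\\").replace('"', '\\"') + '"'
        for v in bindings.values()
    )
    return f"VALUES ({names}) {{ ({values}) }}"


bindings = bind_by_var_order(GET_USERS_BY_NAME_VARS, ["Jerven", "Bolleman"])
clause = to_values_clause(bindings)
print(clause)  # VALUES (?personalName ?familyName) { ("Jerven" "Bolleman") }
```

An engine could prepend such a block to the stored query text before evaluating it, so the stored query itself stays plain SPARQL 1.1.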

Considerations for backward compatibility

Query fragments are not understood by SPARQL 1.1.

namedgraph commented 4 years ago

Would SPIN Functions be related work?

maximelefrancois86 commented 4 years ago

LDScript defines SPARQL FUNCTIONs, which are basically SPARQL expressions with a name and a signature: http://ns.inria.fr/sparql-extension/#function1 . A SPARQL FUNCTION can be called like any SPARQL Binding Function.

In SPARQL-Generate (https://ci.mines-stetienne.fr/sparql-generate/) we implemented this proposal and generalized the approach:

SPARQL GENERATE queries can be given a name and a signature. A SPARQL GENERATE query can be called from within a FROM or FROM NAMED clause.
SPARQL SELECT queries can be given a name and a signature. A SPARQL SELECT query can be called from within the SPARQL-Generate ITERATOR clauses. This is also related to #6.

JervenBolleman commented 4 years ago

@maximelefrancois86 I was aware of the proposed extension (issue #52). However, I thought that it is a new language and does not itself encode its own shared files in RDF. Please correct me if that impression is wrong.

I would see the use of sparql-function in this way.

ex:KtoDegC a sparql:function ;
    sparql:exec [ sparql:execLang "sparql-function" ;
        sparql:execCode """function ex:KtoDegC(?kelvin) {?kelvin - 273.15}"""
    ] .

maximelefrancois86 commented 4 years ago

Alright.

In your snippets, the sparql:execLang property seems to be used to type the literal. I consider that it would be more concise and more appropriate to use datatypes for this sort of practice.

ex:KtoDegC a sparql:function ;
    sparql:var [ sparql:name "kelvin" ;
                 sparql:type xsd:decimal ;
                 sparql:varOrder 1 ] ;
    sparql:exec """(BigDecimal kelvin) -> kelvin.subtract(new BigDecimal("273.15"))"""^^exec:java ,
        """|kelvin| kelvin - 273.15"""^^exec:ruby ,
        """function ex:KtoDegC(?kelvin) {?kelvin - 273.15}"""^^exec:sparql-function .

Then the SPARQL engine could use whichever typed literal has a datatype it recognizes.
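As a rough illustration of that selection step, the following Python sketch picks the first sparql:exec literal whose datatype the engine supports. The namespace behind the exec: prefix is not given in this thread, so the IRI below is made up, the source strings are placeholders, and pick_implementation is a hypothetical helper.

```python
from typing import List, Optional, Set, Tuple

# Made-up expansion of the exec: prefix, for illustration only.
EXEC = "http://example.org/exec#"

# (lexical form, datatype IRI) pairs standing in for the sparql:exec literals.
IMPLEMENTATIONS: List[Tuple[str, str]] = [
    ("<java source>", EXEC + "java"),
    ("<ruby source>", EXEC + "ruby"),
    ("<sparql-function source>", EXEC + "sparql-function"),
]


def pick_implementation(
    impls: List[Tuple[str, str]], supported: Set[str]
) -> Optional[Tuple[str, str]]:
    """Return the first literal whose datatype the engine recognizes, else None."""
    for code, datatype in impls:
        if datatype in supported:
            return code, datatype
    return None


# An engine that can only run Ruby and LDScript-style functions.
chosen = pick_implementation(IMPLEMENTATIONS, {EXEC + "ruby", EXEC + "sparql-function"})
# An engine that recognizes none of the datatypes has nothing to run.
nothing = pick_implementation(IMPLEMENTATIONS, {EXEC + "python"})
```

Taking the first match makes the order of the literals act as a preference order, which RDF itself does not guarantee; a real engine would more likely rank by its own preferences.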

To me, it would also be equivalent to identify functions with a URL, and rely on dereferencing with content-negotiation. From:

ex:KtoDegC a sparql:function ;
    sparql:var [ sparql:name "Kelvin" ;
                 sparql:type xsd:decimal ;
                 sparql:varOrder 1 ] .

The SPARQL engine could dereference ex:KtoDegC and use content negotiation mechanisms to attempt to retrieve some executable code it supports. For example, application/javascript, application/vnd+sparql-generate...
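A sketch of what the dereferencing request could look like, using only Python's standard library. The IRI http://example.org/KtoDegC is a hypothetical expansion of ex:KtoDegC, build_function_request is a made-up helper, and falling back to text/turtle for the RDF description is an assumption of this sketch, not something stated in the thread.

```python
from typing import Sequence
from urllib.request import Request


def build_function_request(function_iri: str, media_types: Sequence[str]) -> Request:
    """Prepare a GET request whose Accept header ranks media types by preference."""
    parts = []
    for i, media_type in enumerate(media_types):
        # The first type gets the default q=1; later ones get decreasing weights.
        q = max(1.0 - 0.1 * i, 0.1)
        parts.append(media_type if i == 0 else f"{media_type};q={q:.1f}")
    return Request(function_iri, headers={"Accept": ", ".join(parts)})


request = build_function_request(
    "http://example.org/KtoDegC",  # hypothetical IRI for ex:KtoDegC
    ["application/javascript", "text/turtle"],
)
accept = request.get_header("Accept")
print(accept)  # application/javascript, text/turtle;q=0.9
# Passing `request` to urllib.request.urlopen would perform the actual dereference.
```

The engine would then inspect the Content-Type of the response to decide whether it received code it can execute or only the RDF description.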

VladimirAlexiev commented 3 years ago

Prior art: FNO cc @bjdmeest, @andimou

Links

homepage: https://fno.io/
function catalog: https://fno.io/hub
spec: https://w3id.org/function/spec
ontologies v0.6: https://doi.org/10.5281/zenodo.3383636, DOI 10.5281/zenodo.3383636
source: https://github.com/IDLabResearch/function-ontology, https://github.com/fnoio

Publications:

An Ontology to Semantically Declare and Describe Functions (ESWC 2016 poster)
Detailed Provenance Capture of Data Processing (SemSci ISWC 2017)
Discovering and Using Functions via Content Negotiation (ISWC 2016 posters)
Implementation-independent function reuse (FGCS 2019)

The FNO ontology describes:

Problems (e.g. "the string concatenation problem")
Functions that solve a problem and have input/output signatures (parameters)
Implementations that implement a function (e.g. as JS or NPM package, XQuery, Java, etc.)
Mappings that tie up functions, implementations, and their parameters
Executions that describe the application of a particular implementation on particular inputs (e.g. source fields during RML data mapping)

Namespaces:

fno: https://w3id.org/function/ontology#: defines the main classes
fnoi: https://w3id.org/function/vocabulary/implementation#: provides extra implementation details, e.g. JavaClass, JsonApi
fnom: https://w3id.org/function/vocabulary/mapping#: includes extra mapping details, e.g. positional vs named parameters, default vs exceptional output (return), etc.
fns: http://users.ugent.be/~bjdmeest/function/functions.ttl# defines instances to describe standard SPARQL 1.1 and XPath functions (but the "hub" catalog includes more functions, in particular from OpenRefine GREL)

FNO conceptual model:

FNO example: a sumFunction that takes inputs startValue and sumValue and produces output sumResult
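The split between function, implementation, mapping, and execution can be mirrored in a few Python dataclasses. This is only a loose sketch of the concepts listed above, applied to the sumFunction example; the class and field names are hypothetical and are not the actual FNO vocabulary terms.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Function:
    """Abstract signature: named inputs and a named output."""
    name: str
    inputs: List[str]
    output: str


@dataclass
class Implementation:
    """Concrete code that can realize a function."""
    language: str
    code: Callable[..., object]


@dataclass
class Mapping:
    """Ties a function's named parameters to an implementation's positional ones."""
    function: Function
    implementation: Implementation
    positions: Dict[str, int]  # parameter name -> argument position


def execute(mapping: Mapping, arguments: Dict[str, object]) -> Dict[str, object]:
    """An execution: apply the mapped implementation to particular inputs."""
    missing = [n for n in mapping.function.inputs if n not in arguments]
    if missing:
        raise ValueError(f"missing inputs: {missing}")
    ordered = sorted(mapping.function.inputs, key=lambda n: mapping.positions[n])
    result = mapping.implementation.code(*(arguments[n] for n in ordered))
    return {mapping.function.output: result}


sum_function = Function("sumFunction", ["startValue", "sumValue"], "sumResult")
python_sum = Implementation("python", lambda a, b: a + b)
mapping = Mapping(sum_function, python_sum, {"startValue": 0, "sumValue": 1})
execution = execute(mapping, {"startValue": 2, "sumValue": 3})
print(execution)  # {'sumResult': 5}
```

Keeping the mapping separate from both the signature and the code is what lets one function description point at several implementations with different parameter conventions.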

VladimirAlexiev commented 3 years ago

@JervenBolleman This is an important topic, but too large to discuss in one issue. I vote to split it into two:

Functions: describing (what problem they solve; interface), implementing (SPARQL; as a fragment in some language, vs as class & method name)
Stored queries, with parameterization (#57)

w3c / sparql-dev