SANSA-Stack / Archived-SANSA-Query

SANSA Query Layer
Apache License 2.0

Add possibility to add Variable Mapping to the result set of Sparqlify approach #13

Closed: GezimSejdiu closed this issue 4 years ago

GezimSejdiu commented 6 years ago

Since the SANSA query API also provides the possibility to write queries directly, without exposing an endpoint:

// read the RDF file into an RDD of triples (lang is the serialization, e.g. Lang.NTRIPLES)
val triples = spark.rdf(lang)(path)
val query = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"
// execute the query with the Sparqlify-based engine
val result = triples.sparql(query)

Here we get the result as a data frame of bindings; it would be great to provide a wrapper that maps the query's variables onto the result set.
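
A minimal sketch of what such a wrapper could look like (the object, the helper name, and the assumption that the raw bindings arrive as a DataFrame whose columns can simply be renamed are all hypothetical, not the actual Sparqlify output format):

import org.apache.spark.sql.DataFrame

object ResultSetOps {
  // Hypothetical helper: rename the raw binding columns to the
  // variable names projected by the query, e.g. "s", "p", "o".
  def withVariableNames(df: DataFrame, vars: Seq[String]): DataFrame =
    df.toDF(vars: _*)
}

// usage: ResultSetOps.withVariableNames(result, Seq("s", "p", "o")).show()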

Best

Aklakan commented 6 years ago

So the clean way to get the result as a table would be for the SPARQL query execution to essentially yield an (RDB2RDF) mapping, i.e.:

The API could look like this:

val datasetMapping = triples.sparql(query)

val naturalDataset: Dataset[Row] = datasetMapping.dataset
val mapping: Map[Var, Expr] = datasetMapping.mapping
val bindingDataset: Dataset[Binding] = datasetMapping.asBindings
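
A sketch of the container such an API could return; the case class itself is hypothetical, while Var, Expr and Binding are Jena ARQ types:

import org.apache.jena.sparql.core.Var
import org.apache.jena.sparql.engine.binding.Binding
import org.apache.jena.sparql.expr.Expr
import org.apache.spark.sql.{Dataset, Row}

// Hypothetical container pairing the tabular result with the
// RDB2RDF-style mapping that records how each variable is derived.
case class DatasetMapping(dataset: Dataset[Row], mapping: Map[Var, Expr]) {
  // Would evaluate each variable's defining expression per row to
  // reconstruct SPARQL bindings; left unimplemented in this sketch.
  def asBindings: Dataset[Binding] = ???
}
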
Aklakan commented 5 years ago

Maybe we can solve this issue with a simple variable substitution: the final mapping comprises (a) the SQL query (or its algebra expression) and (b) the (multi-)mapping of each SPARQL variable to a set of defining expressions:

?s = {uri(?foo), plainLiteral(?bar), ...}
?o = {typedLiteral(?baz, my:datatype) }

So by analyzing this mapping, we can decide which columns are needed, and apply a substitution on the SQL expression to tidy up variable names. Ontop seems to use nice variable names from the ground up; maybe the substitution is not needed there.
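
A hedged sketch of that analysis step, assuming the multi-mapping is available as a Map from each variable to its set of defining expressions (the function name is made up; ExprVars is Jena's utility for collecting the variables an expression mentions):

import org.apache.jena.sparql.core.Var
import org.apache.jena.sparql.expr.{Expr, ExprVars}

import scala.collection.JavaConverters._

// For each SPARQL variable, collect the column variables that its
// defining expressions mention, e.g. ?s -> {?foo} for ?s = {uri(?foo)}.
// Columns mentioned nowhere can be pruned; the rest can be renamed
// after the variables they define.
def columnsPerVariable(mapping: Map[Var, Set[Expr]]): Map[Var, Set[Var]] =
  mapping.map { case (v, defs) =>
    v -> defs.flatMap(e => ExprVars.getVarsMentioned(e).asScala)
  }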

Aklakan commented 4 years ago

Please continue discussion at #47