Open floresbakker opened 3 weeks ago
I've been curious about this behavior, too. I think it's consistent with the SPARQL specification to behave differently when a quads graph is used vs. a triples graph.
The SPARQL 1.1 grammar has the term QuadsNotTriples (item 51), which indicates a syntax difference in the WHERE clause.
Your query ...
someQuery = someGraph.query('''
select ?s
where {
?s ?p ?o
}
''')
... would need to become:
someQuery = someGraph.query('''
select ?s
where {
+ GRAPH ?g {
?s ?p ?o
+ }
}
''')
I only think this because of trying to figure out some nuances with the JSON-LD @graph keyword. My queries written for triples graphs started not returning results when I gave the @graph JSON key a sibling @id key. I got results again when throwing in that GRAPH ?g { ... } wrapper.
I'm not sure offhand where in the SPARQL specification this gets spelled out, though. The word "graph" appears a few hundred times in the document. So I'm curious for how this thread goes.
You are referring to rules dealing explicitly with UPDATE or DELETE statements in SPARQL. Those production rules are part of the abstract syntax tree of SPARQL, so they should be read as leaves and branches of a tree, not as nodes that stand on their own (51 > 50 > 48/49 > 38/39/40). See also note #8 in paragraph 19.8.
My issue deals with a SELECT statement. It would be destructive to the SPARQL specification if a graph could no longer be queried without a GRAPH clause. Fortunately that is not the case.
I have tested this issue with four engines: RDFLib, Speedy, Virtuoso and Jena. Only RDFLib breaks; the other engines give me the expected bindings.
It doesn't work for Dataset either. I would have expected Graph() to conceal data in its store that belongs to graphs with a different identifier, but I wouldn't expect that behaviour from Dataset. So I would expect the query on the dataset to work:
anotherGraphSameData = Dataset(store=someGraph.store)
someQuery = anotherGraphSameData.query('''
select ?s
where {
?s ?p ?o
}
''')
for row in someQuery:
    print(str(row.s))
# still returns nothing
As a side note, the data is parsed correctly; only the query on the dataset doesn't work. So this returns GraphString:
print(anotherGraphSameData.serialize(format="trig"))
Also, the given query does find the data in the graph itself if the graph's identifier matches. So using:
someGraph = Graph(identifier=URIRef("http://example.com/person/graph-1"))
then you will get
http://example.com/person/drewp
http://example.com/person/drewp
But there doesn't seem to be any option for the SPARQL processor to ignore graph identifiers. See https://github.com/RDFLib/rdflib/blob/b0d7a7dc272bd6c87bbf807d017932b37c1257f7/rdflib/plugins/sparql/processor.py#L117-L124
I first noticed the behavior when I wanted to run PyShacl on TriG data in RDFLib. I could never get that to work, even though PyShacl is able to handle TriG files. I suspected the issue might be in RDFLib, so I decided to create the above-mentioned example. My workaround is to transform a TriG file into a Turtle file and then offer that to RDFLib/PyShacl instead. But this is not handy, as each time I want to process some data, I first have to transform the source.
The SPARQL query works if you use Dataset with the option default_union. @ashleysommer gave some more background info at #2959
from rdflib import Dataset
GraphString = '''
PREFIX eg: <http://example.com/person/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
eg:graph-1 {
eg:drewp a foaf:Person .
eg:drewp eg:says "Hello World" .
}
eg:graph-2 {
eg:nick a foaf:Person .
eg:nick eg:says "Hi World" .
}
eg:ash a foaf:Person .
eg:ash eg:says "Default" .
'''
ds = Dataset(default_union=True)
ds.parse(data=GraphString, format="trig")
someQuery = ds.query('''
select ?s
where {
?s ?p ?o
}
''')
for row in someQuery:
    print(str(row.s))
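As a rough mental model of why default_union changes the outcome, here is a dependency-free sketch. It is not rdflib's implementation: the Quad tuple and both helper functions are hypothetical names made up for illustration, and the model only covers a single `?s ?p ?o` pattern.

```python
from typing import List, NamedTuple, Optional


class Quad(NamedTuple):
    s: str
    p: str
    o: str
    g: Optional[str]  # None marks the default graph (illustrative convention)


QUADS = [
    Quad("eg:drewp", "rdf:type", "foaf:Person", "eg:graph-1"),
    Quad("eg:nick", "rdf:type", "foaf:Person", "eg:graph-2"),
    Quad("eg:ash", "rdf:type", "foaf:Person", None),
]


def subjects_outside_graph_clause(quads: List[Quad], default_union: bool) -> List[str]:
    """Subjects matched by `?s ?p ?o` written outside any GRAPH clause.

    Without default_union only default-graph triples are visible;
    with it, the default graph is treated as the union of all graphs.
    """
    return sorted({q.s for q in quads if default_union or q.g is None})


def subjects_inside_graph_clause(quads: List[Quad]) -> List[str]:
    """Subjects matched by `GRAPH ?g { ?s ?p ?o }`: named graphs only."""
    return sorted({q.s for q in quads if q.g is not None})


print(subjects_outside_graph_clause(QUADS, default_union=False))  # ['eg:ash']
print(subjects_outside_graph_clause(QUADS, default_union=True))   # ['eg:ash', 'eg:drewp', 'eg:nick']
print(subjects_inside_graph_clause(QUADS))                        # ['eg:drewp', 'eg:nick']
```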
Data in Trig format cannot be processed by RDFLib.
Let us assume the following data including graphs (example copied from RDFlib documentation)
Next, let's parse this data into a Graph object:
Let us query the graph:
If we then go through the result set, there is unexpectedly nothing:
This does not lead to any result, whereas I would expect the following bindings for the variable ?s.
If I prepare the data differently by removing the explicit graphs, I do get the expected results:
Result of the query:
Perhaps I am mistaken and should work differently with rdflib graph objects in Python that contain TriG data, but it does seem like incorrect behavior from a purely triples-and-SPARQL point of view.