I might be missing something, but it seems that the order of the variables in the ResultRows of a SELECT * query is randomized and changes from run to run. This is unexpected because other SPARQL engines I work with generally seem to apply the order of variables as they occur in the WHERE block.
Sample code:
query="SELECT * { ?a ?b ?c } LIMIT 10"
qres=g.query(query)
print(qres.vars)
Expected output:
[rdflib.term.Variable('a'), rdflib.term.Variable('b'), rdflib.term.Variable('c')]
Real output:
[rdflib.term.Variable('c'), rdflib.term.Variable('a'), rdflib.term.Variable('b')] (first run)
[rdflib.term.Variable('a'), rdflib.term.Variable('b'), rdflib.term.Variable('c')] (second run)
[rdflib.term.Variable('b'), rdflib.term.Variable('a'), rdflib.term.Variable('c')] (third run)
[rdflib.term.Variable('a'), rdflib.term.Variable('c'), rdflib.term.Variable('b')] (fourth run)
[you get the idea]
(Tested on the HDT edition of DBpedia 2016, with the graph created via g = rdflib.Graph(store=rdflib_hdt.HDTStore(rdf_file)), but that shouldn't matter.)
The application is that we run SPARQL queries whose number of variables isn't known in advance; we return a binding for every variable, and the WHERE block (and only the WHERE block) is provided by the client. I could enforce a constant order by sorting the keys (variables) lexicographically, but that order might also be unexpected to users, since it depends on their naming preferences.
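One possible workaround, shown here only as a minimal sketch: since the client supplies just the WHERE block, the variables can be collected in order of first occurrence and placed in an explicit projection instead of SELECT *. The helper names variables_in_order and build_select are hypothetical (they are not part of rdflib), and the regular expression is a naive approximation of SPARQL variable syntax: it would also match ?name sequences inside string literals, IRIs, or comments, and it does not understand sub-selects.

```python
import re
from typing import List, Optional

# Naive approximation of a SPARQL variable: '?' or '$' followed by a name.
# Assumption: the WHERE block contains no '?name' text inside literals or IRIs.
_VAR_PATTERN = re.compile(r"[?$](\w+)")


def variables_in_order(where_block: str) -> List[str]:
    """Return variable names in order of first occurrence, without duplicates."""
    seen = {}  # dicts preserve insertion order (Python 3.7+)
    for match in _VAR_PATTERN.finditer(where_block):
        seen.setdefault(match.group(1), None)
    return list(seen)


def build_select(where_block: str, limit: Optional[int] = None) -> str:
    """Build a SELECT query with an explicit, deterministic projection."""
    names = variables_in_order(where_block)
    # Fall back to '*' if the block contains no variables at all.
    projection = " ".join("?" + name for name in names) or "*"
    query = "SELECT " + projection + " WHERE " + where_block
    if limit is not None:
        query += " LIMIT " + str(limit)
    return query


print(build_select("{ ?a ?b ?c }", limit=10))
```

The resulting string could then be passed to g.query(...) in place of the SELECT * query. With an explicit projection, qres.vars should follow the listed order, although this is untested against the HDT store.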