RDFLib / rdflib

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
https://rdflib.readthedocs.org
BSD 3-Clause "New" or "Revised" License

Blank node context mixed in SPARQL update #892

Open white-gecko opened 5 years ago

white-gecko commented 5 years ago

Using the same blank node ID in two queries merges the blank nodes in the graph. This is incorrect, since the scope of a blank node in a SPARQL query is limited to that query. The SPARQL 1.1 Update spec says:

That is, the INSERT DATA statement only allows to insert ground triples. Blank nodes in QuadDatas are assumed to be disjoint from the blank nodes in the Graph Store, i.e., will be inserted with "fresh" blank nodes. (https://www.w3.org/TR/2013/REC-sparql11-update-20130321/#insertData)

See the following example:

import rdflib
g = rdflib.Graph()
g.serialize(format="turtle").decode("utf-8")
# empty

g.update('INSERT DATA { _:a <urn:label> "A bnode" }')
g.serialize(format="turtle").decode("utf-8")
# @prefix ns1: <urn:> .
# [] ns1:label "A bnode" .

g.update('INSERT DATA { _:a <urn:label> "Bnode 2" }')
g.serialize(format="turtle").decode("utf-8")
# @prefix ns1: <urn:> .
# [] ns1:label "A bnode",
#        "Bnode 2" .

I would expect:

…
g.update('INSERT DATA { _:a <urn:label> "Bnode 2" }')
g.serialize(format="turtle").decode("utf-8")
# @prefix ns1: <urn:> .
# [] ns1:label "A bnode" .
# [] ns1:label "Bnode 2" .

gauri-b commented 4 years ago

Hi @white-gecko, creating a new BNode identifier for each triple parsed seems to solve this problem. Is that what we are aiming for, or did I miss something?

white-gecko commented 4 years ago

Just creating a new BNode identifier for each triple would not solve the issue, because this would lose the information about which triples should refer to the same blank node.
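
To illustrate the distinction, here is a minimal sketch of per-request blank node scoping. It is plain Python and not rdflib's implementation: triples are modeled as tuples of strings, and `fresh_id` and `scope_bnodes` are hypothetical helper names made up for this example. The idea is that a label maps to one fresh node within a single update request, while a later request gets a new mapping.

```python
import itertools

# Hypothetical stand-in for a blank node generator; not an rdflib API.
_counter = itertools.count()


def fresh_id():
    """Return an identifier that has not been handed out before."""
    return f"b{next(_counter)}"


def scope_bnodes(triples):
    """Rename blank node labels (strings starting with '_:') for one request.

    Within a single call, the same label always maps to the same fresh
    identifier. Each new call starts with an empty mapping, so labels
    reused across requests end up as different nodes.
    """
    mapping = {}

    def rename(term):
        if isinstance(term, str) and term.startswith("_:"):
            if term not in mapping:
                mapping[term] = fresh_id()
            return "_:" + mapping[term]
        return term

    return [tuple(rename(term) for term in triple) for triple in triples]


store = set()

# First request: both triples use _:a, so they must keep sharing one node.
store.update(scope_bnodes([
    ("_:a", "urn:label", "A bnode"),
    ("_:a", "urn:type", "urn:Thing"),
]))

# Second request: _:a is reused as a label but must become a different node.
store.update(scope_bnodes([("_:a", "urn:label", "Bnode 2")]))

subjects = {s for s, _, _ in store}
```

With one fresh identifier per triple, the two triples of the first request would end up with different subjects. With one mapping per request, they stay connected, and only the second request's `_:a` becomes a separate node.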