Closed svenvanderburg closed 3 years ago
Hi @svenvanderburg PyShacl was designed specifically not to mutate the data graph when validating, especially when using the inference options. There are very careful checks in place to ensure what you've described does not happen. (There are actually old feature requests from users asking for an inplace expansion on their graph, which I have rejected). If you have come across a case where using inferencing does mutate your original data graph, then this is a major bug, and should be treated as such.
I have run some basic tests locally and cannot reproduce the behaviour you described, so I will have to do some deeper digging to get to the bottom of it. Are you able to share a simple snippet of code that reproduces the behaviour you're seeing?
Hi @ashleysommer OK, I did not expect that this was a bug, sorry for my miserable report in that case :) And glad we agree on that it shouldn't mutate, haha!
Here is a minimal example:
def test_pyschacl():
g = rdflib.Graph()
g.add((rdflib.URIRef('http://www.example.org/garfield'),
rdflib.RDF.type,
rdflib.URIRef('http://www.example.org/cat')))
len_before = len(g)
pyshacl.validate(g, shacl_graph=None, inference='rdfs')
assert len(g) == len_before
# Which outputs:
AssertionError: graph was mutated
7 != 1
Versions of relevant libraries used:
pyshacl==0.14.0
rdflib==5.0.0
Hi @svenvanderburg I've created a new release with a fix for this problem. Released as v0.14.1
I've also added an "inplace" mode, as you suggested, which defaults to False
.
Amazing! 👍 Thanks for the quick action!
Setting the inference option (I tested with 'rdfs') results in the
data_graph
being mutated, it is expanded using inference.Technically this is documented in the
Readme.md
: _inference is a Python string value to indicate whether or not to perform OWL inferencing expansion of the datagraph before validation. But still I would not expect avalidate
function to have side-effects. Maybe there are benefits of this behavior that I am unaware of, but it would make more sense for me if under the hood thedata_graph
object is copied (usingcopy.deepcopy
), and all inference expansions operate on the copy. You could toggle this behavior with aninplace
argument that is by defaultFalse
.