buda-base / editor-templates

SHACL templates for the BUDA editor
MIT License

PersonShapes graph parsing error #16

Closed MarcAgate closed 4 years ago

MarcAgate commented 4 years ago

We have a strange parsing exception when trying to instantiate a Shapes object from the PersonShapes graph:

org.apache.jena.shacl.parser.ShaclParseException: No sh:path on a property shape: <http://purl.bdrc.io/ontology/shapes/core/ContentLocationShape>
    at org.apache.jena.shacl.parser.ShapesParser.findPropertyShapes(ShapesParser.java:285)
    at org.apache.jena.shacl.parser.ShapesParser.parseShape$(ShapesParser.java:214)
    at org.apache.jena.shacl.parser.ShapesParser.parseShapeStep(ShapesParser.java:196)
    at org.apache.jena.shacl.parser.ShapesParser.parseRootShape(ShapesParser.java:140)
    at org.apache.jena.shacl.parser.ShapesParser.parseShapes(ShapesParser.java:84)
    at org.apache.jena.shacl.Shapes.parse(Shapes.java:55)

The code to reproduce this error is:

@Test
public void parseShape() {
    Model m = ModelFactory.createDefaultModel();
    m.read("http://purl.bdrc.io/graph/PersonShapes.ttl", null, "TTL");
    Shapes shapes = Shapes.parse(m.getGraph());
}
MarcAgate commented 4 years ago

More info:

person.shapes.ttl is parsed w/o any issues while parsing fails with person.local.shapes.ttl

MarcAgate commented 4 years ago

Running the following code:

@Test
public void parseShape() throws MalformedURLException, FileNotFoundException {
    Model m = ModelFactory.createDefaultModel();
    m.read("https://raw.githubusercontent.com/buda-base/editor-templates/master/templates/core/person.local.shapes.ttl", null, "TTL");
    Shapes shapes = Shapes.parse(m.getGraph());
}

leads to this parsing exception:

org.apache.jena.shacl.parser.ShaclParseException: SPARQL parse error: Line 5, column 39: Unresolved prefixed name: rdf:type

        select distinct $this
        where {
            filter not exists { $this rdf:type bdo:Gender . } .
        }     

    at org.apache.jena.shacl.engine.SparqlConstraints.parseSparqlConstraint(SparqlConstraints.java:90)
    at org.apache.jena.shacl.parser.Constraints.lambda$static$23(Constraints.java:248)
    at org.apache.jena.shacl.parser.Constraints.parseConstraint(Constraints.java:125)
    at org.apache.jena.shacl.parser.Constraints.parseConstraints(Constraints.java:114)
    at org.apache.jena.shacl.parser.ShapesParser.parseShape$(ShapesParser.java:206)
    at org.apache.jena.shacl.parser.ShapesParser.parseShapeStep(ShapesParser.java:196)
    at org.apache.jena.shacl.parser.ShapesParser.parseRootShape(ShapesParser.java:140)
    at org.apache.jena.shacl.parser.ShapesParser.parseShapes(ShapesParser.java:90)
    at org.apache.jena.shacl.Shapes.parse(Shapes.java:55)
xristy commented 4 years ago

That error is not surprising and should occur.

The definition:

bds:RDF 
    a sh:PrefixDeclaration ;
    sh:declare [ 
        sh:prefix "rdf" ;
        sh:namespace "http://www.w3.org/1999/02/22-rdf-syntax-ns#" ;
    ] ;
.

is in base.shapes.ttl, which is imported via root.local.shapes.ttl, which is imported by event.local.shapes.ttl, which is in turn imported by person.local.shapes.ttl. Hence the error when trying to parse person.local.shapes.ttl on its own, without processing the imports.
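
As a rough illustration of that import chain, here is a minimal, Jena-free sketch that computes which files have to be loaded before the `rdf` prefix declaration becomes visible. The class name `ImportClosure` and the `IMPORTS` map are hypothetical: the map encodes only the chain named in this comment, not the real files' full import lists. It assumes Java 9 or later for `Map.of` and `List.of`.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the import chain described above; not Jena API.
class ImportClosure {

    // File -> files it imports directly. Only the chain from this comment
    // is listed; any other imports the real files have are omitted.
    static final Map<String, List<String>> IMPORTS = Map.of(
        "person.local.shapes.ttl", List.of("event.local.shapes.ttl"),
        "event.local.shapes.ttl", List.of("root.local.shapes.ttl"),
        "root.local.shapes.ttl", List.of("base.shapes.ttl"),
        "base.shapes.ttl", List.of());

    // Returns the start file plus everything reachable through imports,
    // in discovery order. The 'seen' set also guards against import cycles.
    static Set<String> closure(String start, Map<String, List<String>> imports) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(start);
        while (!todo.isEmpty()) {
            String file = todo.pop();
            if (seen.add(file)) {
                for (String imported : imports.getOrDefault(file, List.of())) {
                    todo.push(imported);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Lists every file that must be read along with person.local.shapes.ttl.
        System.out.println(closure("person.local.shapes.ttl", IMPORTS));
    }
}
```

Reading person.local.shapes.ttl alone corresponds to stopping at the first element of this set, which is why the declaration held in base.shapes.ttl is never seen.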

MarcAgate commented 4 years ago

Ok, Good.

xristy commented 4 years ago

I've updated shapes-testing with ShaclParse_JS01 (pretty much the same kernel of code that you've used).

The output shows what appears to be the same parse error that you're seeing, on both PersonLocalShapes and PersonShapes. That at least seems reasonable: PersonShapes includes all of PersonLocalShapes, so whatever the error in PersonLocalShapes is, it is not being corrected or covered up by PersonShapes:

===> Parsing PersonLocalShapes
===> FAILED to Parse PersonLocalShapes
org.apache.jena.shacl.parser.ShaclParseException: No sh:path on a property shape: <http://purl.bdrc.io/ontology/shapes/core/ContentLocationShape>
    at org.apache.jena.shacl.parser.ShapesParser.findPropertyShapes(ShapesParser.java:285)
    at org.apache.jena.shacl.parser.ShapesParser.parseShape$(ShapesParser.java:214)
    at org.apache.jena.shacl.parser.ShapesParser.parseShapeStep(ShapesParser.java:196)
    at org.apache.jena.shacl.parser.ShapesParser.parseRootShape(ShapesParser.java:140)
    at org.apache.jena.shacl.parser.ShapesParser.parseShapes(ShapesParser.java:84)
    at org.apache.jena.shacl.Shapes.parse(Shapes.java:55)
    at ShaclParse_JS01.parseGraph(ShaclParse_JS01.java:28)
    at ShaclParse_JS01.main(ShaclParse_JS01.java:45)
===> Parsing PersonShapes
===> FAILED to Parse PersonShapes
org.apache.jena.shacl.parser.ShaclParseException: No sh:path on a property shape: <http://purl.bdrc.io/ontology/shapes/core/ContentLocationShape>
    at org.apache.jena.shacl.parser.ShapesParser.findPropertyShapes(ShapesParser.java:285)
    at org.apache.jena.shacl.parser.ShapesParser.parseShape$(ShapesParser.java:214)
    at org.apache.jena.shacl.parser.ShapesParser.parseShapeStep(ShapesParser.java:196)
    at org.apache.jena.shacl.parser.ShapesParser.parseRootShape(ShapesParser.java:140)
    at org.apache.jena.shacl.parser.ShapesParser.parseShapes(ShapesParser.java:84)
    at org.apache.jena.shacl.Shapes.parse(Shapes.java:55)
    at ShaclParse_JS01.parseGraph(ShaclParse_JS01.java:28)
    at ShaclParse_JS01.main(ShaclParse_JS01.java:46)
MarcAgate commented 4 years ago

Regarding bdg:PersonShapes graph validation, I have noticed the following:

When using the TQ ValidationUtil.validateModel method, as in TopQ_ValidationTest_MA02, with http://purl.bdrc.io/graph/PersonShapes.ttl as the source graph for building the shapes graph, I get the following exception:

Exception in thread "main" org.topbraid.shacl.validation.SHACLException: Invalid SPARQL constraint (Line 6, column 48: Unresolved prefixed name: rdfs:subClassOf):
PREFIX bdo: <http://purl.bdrc.io/ontology/core/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

        select distinct $this
        where {
            filter not exists { $this rdf:type/rdfs:subClassOf* bdo:Note . } .
        }     

    at org.topbraid.shacl.validation.sparql.AbstractSPARQLExecutor.<init>(AbstractSPARQLExecutor.java:75)
    at org.topbraid.shacl.validation.sparql.SPARQLConstraintExecutor.<init>(SPARQLConstraintExecutor.java:42)
    at org.topbraid.shacl.validation.ConstraintExecutors.lambda$new$3(ConstraintExecutors.java:57)
    at org.topbraid.shacl.validation.ConstraintExecutors.getExecutor(ConstraintExecutors.java:81)
    at org.topbraid.shacl.engine.Constraint.getExecutor(Constraint.java:98)
    at org.topbraid.shacl.validation.ValidationEngine.validateNodesAgainstConstraint(ValidationEngine.java:488)
    at org.topbraid.shacl.validation.ValidationEngine.validateAll(ValidationEngine.java:368)
    at org.topbraid.shacl.validation.ValidationUtil.validateModel(ValidationUtil.java:121)
    at org.topbraid.shacl.validation.ValidationUtil.validateModel(ValidationUtil.java:102)
    at TopQ_ValidationTest_MA02.main(TopQ_ValidationTest_MA02.java:80)

Note: This shows that the parsing of the shapes graph is triggered only after the ValidationUtil.validateModel method has been called.

When using TQ ValidationEngine.validateNode() as in TopQ_ValidateNode_MA01, I have no exception and the validation process goes through normally, producing empty or full validation reports, depending upon various validation cases.

This shows that TQ validates PersonShapes in certain circumstances and throws a parsing error in others. So, either the parsing methods are different (in ValidationUtil.validateModel and ValidationEngine.validateNode()) OR the parsing "level" or "requirements" are different when validating a Node or a full Resource Model.

xristy commented 4 years ago

The problem reported in TQ

Unresolved prefixed name: rdfs:subClassOf

is fixed as of 5e8ca3b

This does not alter the spurious parse error in JS

xristy commented 4 years ago

The original issue was:

org.apache.jena.shacl.parser.ShaclParseException: No sh:path on a property shape: <http://purl.bdrc.io/ontology/shapes/core/ContentLocationShape>
    at org.apache.jena.shacl.parser.ShapesParser.findPropertyShapes(ShapesParser.java:285)
    at org.apache.jena.shacl.parser.ShapesParser.parseShape$(ShapesParser.java:214)
    at org.apache.jena.shacl.parser.ShapesParser.parseShapeStep(ShapesParser.java:196)
    at org.apache.jena.shacl.parser.ShapesParser.parseRootShape(ShapesParser.java:140)
    at org.apache.jena.shacl.parser.ShapesParser.parseShapes(ShapesParser.java:84)
    at org.apache.jena.shacl.Shapes.parse(Shapes.java:55)

which on the face of it made no sense, because an sh:path is not needed on a NodeShape, which is how bds:ContentLocationShape is defined.

The issue was that a couple of the property shape refs, out of some 20 such refs, were not defined:

bds:ContentLocationShape
  a sh:NodeShape ;
  bds:identifierPrefix "CL" ;
  rdfs:label "ContentLocation Shape"@en ;
. . .
  sh:property bds:ContentLocationShape-contentLocationStatement ;
  sh:property bds:ContentLocationShape-contentLocationStatementCBETA ;
. . .
  sh:targetClass bdo:ContentLocation .

since for each of these there was no:

bds:ContentLocationShape-contentLocationStatement sh:path some:propertyPath ;

The shapes graph really is in error, and the Jena Shacl implementation is right to reject it, BUT the parse error message is singularly unhelpful. It should indicate the problem is with

bds:ContentLocationShape-contentLocationStatement

NOT with bds:ContentLocationShape not having an sh:path on a property shape.

I will report this to Jena. This has taken far too much time to resolve.

Further, the TopQuadrant implementation is completely deficient for these sorts of issues (I doubt I'm the only one to have made this sort of error). TQ never complains about such situations.

Firstly, TQ is very lazy: it does not check the definitions of shapes until and unless it has some data that triggers a shape; whereas Jena Shacl has a clearly defined parse step that will pick out ill-defined shapes graphs.

Secondly, even when triggered, TQ is perfectly happy to ignore the issue, since it apparently has a different approach to deciding whether it should attempt to validate something like bds:ContentLocationShape-contentLocationStatement. It just reports that the data conforms to the shapes graph as is.

This is singularly unhelpful. I would expect that when TQ sees an instance of bdo:ContentLocation in the data then it would check that each of the property shape refs conform.

Evidently TQ considers that a putative property shape conforms if there's no triple in the shape graph that is violated by the data and since there are no triples for bds:ContentLocationShape-contentLocationStatement then the shapes graph says nothing that is violated by the data graph.

OTOH, JS seems more strict in that if a ref to a property shape is made then JS actually requires that some minimal set of triples relevant to bds:ContentLocationShape-contentLocationStatement must be present in the shapes graph.

I'll take JS over TQ on this.
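
For anyone who wants to catch this class of mistake before handing a shapes graph to either engine, the check described above can be sketched without any SHACL library. This is only an illustration under stated assumptions: the class `PropertyShapeCheck`, its `Triple` type, and the `bds:ContentLocationShape-exampleDefined` / `bdo:exampleProperty` names are hypothetical, triples are plain strings of prefixed names rather than RDF nodes, and only named (non-blank-node) property shape refs are considered. It is not how Jena or TopQuadrant implement their checks.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, library-free sketch: flags sh:property refs that lack sh:path.
class PropertyShapeCheck {

    // A triple of prefixed names, kept as plain strings for illustration.
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    // Returns "node sh:property ref" for every ref with no sh:path triple of its own.
    static List<String> refsWithoutPath(List<Triple> shapes) {
        Set<String> withPath = new HashSet<>();
        for (Triple t : shapes) {
            if (t.p.equals("sh:path")) {
                withPath.add(t.s);
            }
        }
        List<String> missing = new ArrayList<>();
        for (Triple t : shapes) {
            if (t.p.equals("sh:property") && !withPath.contains(t.o)) {
                missing.add(t.s + " sh:property " + t.o);
            }
        }
        return missing;
    }

    // Example data: two undefined refs from this issue plus one made-up, well-defined ref.
    static List<Triple> example() {
        List<Triple> g = new ArrayList<>();
        g.add(new Triple("bds:ContentLocationShape", "rdf:type", "sh:NodeShape"));
        g.add(new Triple("bds:ContentLocationShape", "sh:property",
                "bds:ContentLocationShape-contentLocationStatement"));
        g.add(new Triple("bds:ContentLocationShape", "sh:property",
                "bds:ContentLocationShape-contentLocationStatementCBETA"));
        // Hypothetical property shape that does have a path, so it is not reported.
        g.add(new Triple("bds:ContentLocationShape", "sh:property",
                "bds:ContentLocationShape-exampleDefined"));
        g.add(new Triple("bds:ContentLocationShape-exampleDefined", "sh:path",
                "bdo:exampleProperty"));
        return g;
    }

    public static void main(String[] args) {
        refsWithoutPath(example()).forEach(System.out::println);
    }
}
```

Reporting the node together with the offending ref, rather than the node alone, is the point: it names the property shape that actually needs fixing.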

eroux commented 4 years ago

Great, thanks! I agree, let's report to Jena and use it

xristy commented 4 years ago

Fixed per:

From: Andy Seaborne andy@apache.org
Subject: Re: misleading parse exception message in Shacl.
Date: July 16, 2020 at 4:20:07 PM CDT
To: users@jena.apache.org
Reply-To: users@jena.apache.org

Fixed in 3.16.0:

"shacl parse" gives:

No sh:path on a property shape: node=http://example/bdsContentLocationShape sh:property http://example/bdsContentLocationShape-contentLocationStatement

when there exists at least one triple with bds:ContentLocationShape-contentLocationStatement as subject

and

Missing property shape: node=http://example/bdsContentLocationShape sh:property http://example/bdsContentLocationShape-contentLocationStatement

if there are none

@MarcAgate I think this can be closed now.

MarcAgate commented 4 years ago

Thanks !