DaniFdezAlvarez / shexer

Apache License 2.0

Incorrect SHACL shapes #148

Closed Remya-Ramachandran closed 11 months ago

Remya-Ramachandran commented 1 year ago

I have tried to generate SHACL shapes for the attached ontology, which is in RDF/XML format. ontology.docx

There are several issues in the generated shapes. Please refer to "shaper_example" for the shapes generated. shaper_example.docx

When we validate the data against these shapes using pyshacl, there are numerous constraint violations due to the incorrect shapes.

Note: The code used to generate the shapes and to validate them (with pyshacl) is given below.

Issues in the shapes:

  1. :person a sh:NodeShape

a. No property shape for the data property restriction "has_id some string".
b. [ a sh:PropertyShape ; sh:nodeKind sh:IRI ; sh:path default1:owns ] sh:minCount, sh:maxCount and sh:class are missing from the above shape.

c. There are two property shapes for the sh:path "rdf:type":

[ a sh:PropertyShape ; sh:in ( owl:NamedIndividual ) ; sh:maxCount 1 ; sh:minCount 1 ; sh:path rdf:type ]

and [ a sh:PropertyShape ; sh:in ( default1:person ) ; sh:maxCount 1 ; sh:minCount 1 ; sh:path rdf:type ]

Since an instance of a person will have both rdf:type "person" and rdf:type "owl:NamedIndividual", the above shapes will result in violations (please refer to the violation highlighted in yellow in the attached document). pyshacl Validation Report.docx

  2. :house a sh:NodeShape ;

sh:minCount is missing for sh:path "has_address".

  3. :student a sh:NodeShape ;

a. No property shape for the data property restriction "has_id some string".
b. Since student is a subclass of person, the student shape should also include the property shapes for the data/object property restrictions of the class "person".

  4. :NamedIndividual a sh:NodeShape ;

As per the shapes generated,

  1. Every individual is required to be of type "person", "house" and "student", which is wrong.

    sh:property [ a sh:PropertyShape ; sh:in ( default1:student ) ; sh:maxCount 1 ; sh:minCount 1 ; sh:path rdf:type ], [ a sh:PropertyShape ; sh:in ( default1:house ) ; sh:maxCount 1 ; sh:minCount 1 ; sh:path rdf:type ], [ a sh:PropertyShape ; sh:in ( default1:person ) ; sh:maxCount 1 ; sh:minCount 1 ; sh:path rdf:type ]

Errors due to the above shapes are highlighted in RED.

  2. Every individual is required to have the paths "has_address" and "owns". However, has_address applies only to instances of house, and owns applies only to instances of person.
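The rdf:type violations described in items 1c and 4 can be illustrated with a small pure-Python model. This is only a sketch of how sh:in, sh:minCount and sh:maxCount interact on a single path; the function `check_type_shape` and the sample individual are hypothetical, and this is not how pyshacl is implemented.

```python
# Simplified model of one SHACL property shape on rdf:type.
# Hypothetical helper for illustration only; not pyshacl's implementation.
def check_type_shape(node_types, allowed, min_count=1, max_count=1):
    """Return the list of violations for one property shape on rdf:type."""
    violations = []
    # sh:minCount / sh:maxCount count ALL values of the path,
    # not only the values that appear in the sh:in list.
    if len(node_types) < min_count:
        violations.append("minCount")
    if len(node_types) > max_count:
        violations.append("maxCount")
    # sh:in applies to every value node, so each type outside
    # `allowed` is reported as a violation.
    for t in sorted(node_types):
        if t not in allowed:
            violations.append(f"in:{t}")
    return violations


# A hypothetical individual typed both as person and owl:NamedIndividual
types = {"default1:person", "owl:NamedIndividual"}

person_shape = check_type_shape(types, allowed={"default1:person"})
individual_shape = check_type_shape(types, allowed={"owl:NamedIndividual"})

print(person_shape)
print(individual_shape)
```

Under this model, both property shapes fail for the same individual: each one sees two rdf:type values (breaking maxCount 1) and one value outside its sh:in list.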

Please take a look at these issues and let us know if there is a solution available for them.

The source code used to generate and validate the shapes is given below. Kindly take a look and let us know if there are any issues.

To generate shapes:

from shexer.shaper import Shaper
from shexer.consts import SHACL_TURTLE, RDF_XML

# Raw string so the backslashes in the Windows path are not treated as escapes
shaper = Shaper(
    all_classes_mode=True,
    graph_file_input=r"C:\project\cwa\cwa_rdfxml.owl",
    input_format=RDF_XML,
    disable_comments=False)

output_file = "shaper_example.ttl"

shexer_shape = shaper.shex_graph(output_file=output_file, output_format=SHACL_TURTLE)

To validate shape generated using pyshacl:

from pyshacl import validate
from rdflib import Graph

data_graph = Graph()
data_graph.parse(r"C:\project\cwa\cwa_rdfxml.owl", format='xml')

# shaper_example.ttl is the shape file generated in step 1
shacl_graph = Graph()
shacl_graph.parse(r"C:\project\cwa\shaper_example.ttl", format='turtle')

validate_graph = validate(data_graph, shacl_graph=shacl_graph)

conforms, results_graph, results_text = validate_graph
print('conforms', conforms)

print('results_text***', results_text)

DaniFdezAlvarez commented 1 year ago

@Remya-Ramachandran Thank you for your feedback and for the code to reproduce those problems. I'll review this case and fix the SHACL generation asap. I'll post in this issue when it is ready.

Remya-Ramachandran commented 11 months ago

Thanks @DaniFdezAlvarez.

DaniFdezAlvarez commented 11 months ago

Hi @Remya-Ramachandran. I've just had the chance to take a closer look at your case. Sorry about this late reply, as the solution is really simple... sheXer is not the right software for your use case.

sheXer aims to extract shapes from A-BOX data examples, not from ontology definitions. When you run sheXer using that input and all_classes_mode=True, the library looks for instances of any class, explores their neighborhoods, and produces a shape for each of those classes based on its instances' neighborhoods. The result could sometimes be close to what it is supposed to be according to the OWL definitions for classes such as House or Person, but it will probably not be accurate.

If you check the generated shapes from that point of view, you'll see why there are missing minCounts (some instance does not use that property, so the minimum cardinality restriction is gone) or maxCounts (some instance has more than 2 uses of that property, so sheXer uses unbounded as max cardinality), why no shape includes has_id (no node with a declared class uses that property), etc.

Here are some works I am aware of that may fit your case better: they parse an ontology specification with OWL axioms and generate SHACL shapes for it:
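To make the instance-based reasoning above concrete, here is a toy sketch that computes the observed minimum and maximum number of uses of each property across the instances of a class. It is an illustration under stated assumptions, not sheXer's actual algorithm: the function `observed_cardinalities`, the `ex:` names and the sample triples are all made up for the example.

```python
from collections import Counter


def observed_cardinalities(triples, class_iri, type_pred="rdf:type"):
    """Toy model: per-property (min, max) usage counts among instances of a class.

    Hypothetical helper for illustration; not part of sheXer.
    """
    # Instances are the subjects explicitly typed with the target class
    instances = sorted(s for s, p, o in triples if p == type_pred and o == class_iri)
    usage = {s: Counter() for s in instances}
    # Count how many times each instance uses each property
    for s, p, o in triples:
        if s in usage:
            usage[s][p] += 1
    props = sorted({p for counts in usage.values() for p in counts})
    # Counter returns 0 for a property an instance never uses
    return {
        p: (min(usage[s][p] for s in instances), max(usage[s][p] for s in instances))
        for p in props
    }


# Made-up A-BOX data: two persons, only one of which owns a house
triples = [
    ("ex:alice", "rdf:type", "ex:person"),
    ("ex:alice", "ex:owns", "ex:house1"),
    ("ex:bob", "rdf:type", "ex:person"),
    ("ex:house1", "rdf:type", "ex:house"),
    ("ex:house1", "ex:has_address", "1 Main St"),
]

cards = observed_cardinalities(triples, "ex:person")
print(cards)
```

In this toy data, `ex:owns` has an observed minimum of 0 because one person never uses it, so a generator working from instances has no basis for a minimum cardinality of 1. A property such as `has_id` that no typed node uses does not appear in the result at all, whatever the ontology declares.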