ucoProject / UCO

This repository is for development of the Unified Cyber Ontology.
Apache License 2.0

UCO and downstream ontologies should test SHACL conformance #504

Closed ajnelson-nist closed 1 year ago

ajnelson-nist commented 1 year ago

Disclaimer

Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.

Background

This is submitted as a fast-track proposal because it only revises the development and testing infrastructure. One potential point of discussion, noted in the Risks section, could necessitate Requirements Review. If you feel this needs further consideration, please say so in a comment.

Section C of the SHACL specification defines a shapes graph to validate SHACL shapes graphs---dubbed "SHACL-SHACL"---as conformant with the SHACL core specification. The SHACL review tool used in UCO's CI, pySHACL, provides a run-time option to use the SHACL-SHACL graph to review the input shapes graph before using the input shapes graph to validate the data (and ontology) graph(s). This proposal suggests enabling that flag during the build-review process.
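As a rough sketch of what enabling the flag could look like, the Python snippet below assembles a pyshacl command line that includes --metashacl. The helper names (build_pyshacl_command, run_review) and the file names are hypothetical and are not taken from UCO's build scripts; --shacl, --ont-graph, and --metashacl are pyshacl command-line options.

```python
import subprocess
from typing import List, Optional


def build_pyshacl_command(
    data_graph: str,
    shapes_graph: str,
    ont_graph: Optional[str] = None,
    meta_shacl: bool = True,
) -> List[str]:
    """Assemble a pyshacl command line (hypothetical helper, not UCO's)."""
    command = ["pyshacl"]
    if meta_shacl:
        # Ask pyshacl to review the shapes graph against SHACL-SHACL
        # before it is used to validate the data graph.
        command.append("--metashacl")
    command.extend(["--shacl", shapes_graph])
    if ont_graph is not None:
        command.extend(["--ont-graph", ont_graph])
    command.append(data_graph)
    return command


def run_review(command: List[str]) -> bool:
    """Run the command; return True when the process exits with status 0."""
    completed = subprocess.run(command, check=False)
    return completed.returncode == 0


if __name__ == "__main__":
    # Placeholder file names, assumed for illustration only.
    print(" ".join(build_pyshacl_command("kb.ttl", "shapes.ttl")))
```

Keeping the flag behind a parameter makes it easy to compare results with and without the SHACL-SHACL review while the change is being evaluated.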

This proposal should be considered scoped to CASE as well as UCO.

Requirements

Requirement 1

UCO, as a SHACL-based ontology, must remain conformant with the SHACL core specification.

Requirement 2

UCO must integrate a SHACL conformance checking system into its Continuous Integration (CI) process.

Risk / Benefit analysis

Benefits

Risks

The submitter believes development risk is lowered with CI integration of a conformance review system, as implementation errors will be caught earlier.

However, the choice of where the SHACL-SHACL tests are applied could be considered a risk factor. The first draft implementation of this proposal applies the SHACL-SHACL review to the transitive closure of UCO, that is, UCO and all imported ontologies, which at this time are the Collections Ontology and its imported Error Ontology. If this is the chosen application strategy, UCO would impose a requirement that the ontologies it imports also conform to SHACL-SHACL.
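The notion of a transitive closure over imports can be sketched in a few lines of Python. The import map below is a hypothetical stand-in that mirrors the chain described above, with short names in place of the real ontology IRIs; it is not how the draft implementation gathers its inputs.

```python
from typing import Dict, List, Set

# Hypothetical import map; short names stand in for ontology IRIs.
IMPORTS: Dict[str, List[str]] = {
    "UCO": ["Collections Ontology"],
    "Collections Ontology": ["Error Ontology"],
    "Error Ontology": [],
}


def import_closure(root: str, imports: Dict[str, List[str]]) -> Set[str]:
    """Return the root ontology plus everything reachable via imports."""
    seen: Set[str] = set()
    stack = [root]
    while stack:
        current = stack.pop()
        if current in seen:
            # Already visited; this also guards against import cycles.
            continue
        seen.add(current)
        stack.extend(imports.get(current, []))
    return seen


if __name__ == "__main__":
    for name in sorted(import_closure("UCO", IMPORTS)):
        print(name)
```

Under this strategy, every ontology in the returned set would be subject to the SHACL-SHACL review, which is the source of the requirement on imported ontologies.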

Competencies demonstrated

Competency 1

(This competency is provided to illustrate an issue found with another published ontology. No competency questions are included.)

A proposal includes a new shape (NodeShape) that almost links to another shape (PropertyShape):

@prefix core: <https://ontology.unifiedcyberontology.org/uco/core/> .
@prefix ex: <urn:example:ontology/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

ex:node-shape-1
    a sh:NodeShape ;
    # First whoops: Undefined IRI.
    sh:property ex:property-shape-2 ;
    # Second whoops: No selector (sh:target...).
    .

ex:property-shape-1
    a sh:PropertyShape ;
    sh:nodeKind sh:Literal ;
    sh:path core:name ;
    .

A sample, pretty minimal knowledge graph is provided as an expected-fail test case:

@prefix core: <https://ontology.unifiedcyberontology.org/uco/core/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[]
    a owl:Thing ;
    core:name [
        rdfs:label "Anonymous thing"@en ;
    ] ;
    .

The ground truth is that this validation does not meet the spirit of the test: instead of failing, it gives an XPASS (expected FAIL, but returned PASS, so an overall test failure). Running pyshacl (version 0.20.0 at the time of this writing) without the --metashacl flag returns this:

Validation Report
Conforms: True
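The XPASS terminology can be made concrete with a small Python sketch. The function names are hypothetical, and the mapping simply encodes the convention described above: an expected-fail case that passes counts as a test failure.

```python
def classify_outcome(expected_conforms: bool, actual_conforms: bool) -> str:
    """Label a validation result relative to the test's expectation."""
    if expected_conforms:
        return "PASS" if actual_conforms else "FAIL"
    # Expected-fail test case: conforming anyway is an unexpected pass.
    return "XPASS" if actual_conforms else "XFAIL"


def counts_as_test_failure(outcome: str) -> bool:
    """FAIL and XPASS both mean the test did not behave as intended."""
    return outcome in {"FAIL", "XPASS"}


if __name__ == "__main__":
    # The expected-fail graph above was reported as conforming.
    outcome = classify_outcome(expected_conforms=False, actual_conforms=True)
    print(outcome, counts_as_test_failure(outcome))
```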

With the --metashacl flag, we are presented with one of the errors (admittedly, it's unclear to the proposer whether this is supposed to print twice):

Shacl File does not validate against the Shacl Shapes Shacl file.
Validation Report
Conforms: False
Results (1):
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
    Severity: sh:Violation
    Source Shape: [ sh:maxCount Literal("1", datatype=xsd:integer) ; sh:minCount Literal("1", datatype=xsd:integer) ; sh:or ( shsh:PathShape [ sh:nodeKind sh:IRI ] ) ; sh:path sh:path ]
    Focus Node: ex:property-shape-2
    Result Path: sh:path
    Message: Less than 1 values on ex:property-shape-2->sh:path

Validator encountered a Runtime Error:
Shacl File does not validate against the Shacl Shapes Shacl file.
Validation Report
Conforms: False
Results (1):
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
    Severity: sh:Violation
    Source Shape: [ sh:maxCount Literal("1", datatype=xsd:integer) ; sh:minCount Literal("1", datatype=xsd:integer) ; sh:or ( shsh:PathShape [ sh:nodeKind sh:IRI ] ) ; sh:path sh:path ]
    Focus Node: ex:property-shape-2
    Result Path: sh:path
    Message: Less than 1 values on ex:property-shape-2->sh:path

If you believe this is a bug in pyshacl, open an Issue on the pyshacl github page.

Fixing the typo still leaves us with an XPASS, because of the missing selector. However, the shape is raised for manual review, which increases the chance of discovering the logic error.
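To illustrate the kind of check behind the violation reported above, the following Python sketch looks for nodes referenced through sh:property that have no sh:path. It works on plain tuples rather than a real RDF graph, the function name is hypothetical, and it is not how pySHACL or SHACL-SHACL is implemented; SHACL-SHACL also caps sh:path at one value, which this sketch does not check.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]

# The shapes graph from this competency, written as prefixed-name triples.
SHAPES: List[Triple] = [
    ("ex:node-shape-1", "rdf:type", "sh:NodeShape"),
    ("ex:node-shape-1", "sh:property", "ex:property-shape-2"),
    ("ex:property-shape-1", "rdf:type", "sh:PropertyShape"),
    ("ex:property-shape-1", "sh:nodeKind", "sh:Literal"),
    ("ex:property-shape-1", "sh:path", "core:name"),
]


def property_shapes_without_path(triples: Iterable[Triple]) -> Set[str]:
    """Return objects of sh:property that are never the subject of sh:path."""
    triple_list = list(triples)
    referenced = {o for (_, p, o) in triple_list if p == "sh:property"}
    with_path = {s for (s, p, _) in triple_list if p == "sh:path"}
    return referenced - with_path


if __name__ == "__main__":
    for node in sorted(property_shapes_without_path(SHAPES)):
        print("Missing sh:path on", node)
```

Once the reference is corrected to ex:property-shape-1, the returned set is empty, which matches the observation that the typo fix alone does not surface the missing selector.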

Competency 2

A proposal includes new shapes:

@prefix core: <https://ontology.unifiedcyberontology.org/uco/core/> .
@prefix ex: <urn:example:ontology/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:shape-1
    a sh:Shape ;
    sh:class core:UcoObject ;
    sh:targetSubjectsOf core:name ;
    .

ex:shape-2
    a sh:PropertyShape ;
    sh:datatype xsd:string ;
    sh:targetObjectsOf core:name ;
    .

Competency Question 2.1

Are these shapes valid? How do we find out with SHACL-SHACL?

Result 2.1

SHACL property shapes are required to have an sh:path property assignment. So, ex:shape-2 is invalid. As much is reported by pyshacl when trying to use this graph, with or without the --metashacl argument:

Validator encountered a Shape Load Error: A shape defined as a PropertyShape must be the subject of a 'sh:path' predicate. For reference, see https://www.w3.org/TR/shacl/#property-shapes

pyshacl catches some classes of SHACL errors. However, as shown in Competency 1, some elude detection, and it's not clear whether missing them is a bug. (After all, the NodeShape in Competency 1 isn't actually configured to apply to anything until, say, it's referenced with sh:node in an external scope.)

However, ex:shape-1 will work (that is, it passes SHACL-SHACL validation and it validates the data graph), even though it is not declared as a sh:NodeShape. The SHACL-SHACL shape shsh:ShapeShape defines behaviors general to sh:NodeShapes and sh:PropertyShapes, and enforces that a sh:Shape conforms to either sh:NodeShape or sh:PropertyShape with this snippet:

shsh:ShapeShape
    # ...

    # Shapes are either node shapes or property shapes
    sh:xone ( shsh:NodeShapeShape shsh:PropertyShapeShape ) ;

    # ...
    .
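The sh:xone constraint requires that exactly one of the listed shapes is satisfied. The Python sketch below shows that "exactly one" logic applied to ex:shape-1. The two predicate functions are deliberately simplified, assumed stand-ins that look only at the number of sh:path values; the real shsh:NodeShapeShape and shsh:PropertyShapeShape impose further constraints.

```python
from typing import Callable, Dict, List, Sequence

# A shape is modeled as a mapping from predicate to its list of values.
Shape = Dict[str, List[str]]


def looks_like_node_shape(shape: Shape) -> bool:
    """Simplified assumption: a node shape has no sh:path."""
    return len(shape.get("sh:path", [])) == 0


def looks_like_property_shape(shape: Shape) -> bool:
    """Simplified assumption: a property shape has exactly one sh:path."""
    return len(shape.get("sh:path", [])) == 1


def xone(shape: Shape, checks: Sequence[Callable[[Shape], bool]]) -> bool:
    """True when exactly one of the checks accepts the shape."""
    return sum(1 for check in checks if check(shape)) == 1


CHECKS = (looks_like_node_shape, looks_like_property_shape)

# ex:shape-1 from this competency: typed sh:Shape, no sh:path.
SHAPE_1: Shape = {
    "rdf:type": ["sh:Shape"],
    "sh:class": ["core:UcoObject"],
    "sh:targetSubjectsOf": ["core:name"],
}

if __name__ == "__main__":
    print(xone(SHAPE_1, CHECKS))
```

Under these simplified checks, ex:shape-1 is accepted on the node-shape side alone, which is consistent with it passing despite being typed only as sh:Shape.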

Solution suggestion

Coordination