ucoProject / UCO

This repository is for development of the Unified Cyber Ontology.
Apache License 2.0

Authoritative IRI prefixes #457

Open ajnelson-nist opened 2 years ago

ajnelson-nist commented 2 years ago

Background

RDF graphs tend to include a mechanism for declaring namespace prefixes. JSON-LD context dictionaries are one way of specifying a set of prefixes to be used within individual graph files.

@kfairbanks, in developing context dictionaries for UCO, has had to make an assumption about how to pick the prefixes for namespaces (e.g. uco-action for https://ontology.unifiedcyberontology.org/uco/action/). This means a decision made within the scope of JSON-LD context construction has become the authoritative set of prefixes for UCO JSON and JSON-LD users. To the submitter, this feels like not quite the right scope; UCO should instead encode the authoritative prefix on each ontology itself.

By luck, each ontology where this is relevant (i.e. each one that supplies a class, property, or datatype) already comes close to having that encoding, in an rdfs:label. However, rdfs:label doesn't quite fit the needs:

To support at least the context dictionaries, UCO should add one or both of these features to each ontology in UCO (and likewise for CASE):

  1. A skos:notation that contains the authoritative prefix, datatyped xsd:string.
  2. A SHACL sh:declare, which is another mechanism for declaring a prefix. However, as a consequence of the SHACL spec, this appears to be "virally" authoritative.

The scope of a SHACL sh:PrefixDeclaration extends beyond the ontology file in which it is declared: it covers the entire owl:imports transitive closure of any importer. SHACL Spec, 5.2.1:

"A SHACL processor collects a set of prefix mappings as the union of all individual prefix mappings that are values of the SPARQL property path sh:prefixes/owl:imports*/sh:declare of the SPARQL-based constraint or validator. If such a collection of prefix declarations contains multiple namespaces for the same value of sh:prefix, then the shapes graph is ill-formed."

sh:PrefixDeclarations are optional mechanisms that assist with SHACL-SPARQL. It is possible to avoid their use, and instead write SPARQL PREFIX statements directly in any SHACL-SPARQL constraints.
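The prefix-collection rule quoted above can be sketched in plain Python to show why a declaration behaves "virally". This is a minimal illustration, not a SHACL processor: it models ontologies as dictionaries instead of RDF graphs, it starts the walk from a single ontology rather than from the value of sh:prefixes, and the function name `collect_prefixes`, the shortened import chain, and the clashing uco-action binding in the knowledge base are all assumptions made for the example.

```python
from collections import deque


def collect_prefixes(start, imports, declares):
    """Union (prefix, namespace) declarations over `start` and its owl:imports closure.

    imports:  dict mapping an ontology IRI to the IRIs it imports
    declares: dict mapping an ontology IRI to its (prefix, namespace) pairs
    Returns (namespaces_by_prefix, conflicts); a non-empty `conflicts` list
    corresponds to the ill-formed case in the quoted rule.
    """
    seen = {start}
    queue = deque([start])
    namespaces_by_prefix = {}
    while queue:
        ontology = queue.popleft()
        for prefix, namespace in declares.get(ontology, []):
            namespaces_by_prefix.setdefault(prefix, set()).add(namespace)
        for imported in imports.get(ontology, []):
            if imported not in seen:  # guards against import cycles
                seen.add(imported)
                queue.append(imported)
    conflicts = sorted(p for p, ns in namespaces_by_prefix.items() if len(ns) > 1)
    return namespaces_by_prefix, conflicts


# Hypothetical data: the import chain is shortened, and the clashing
# "uco-action" binding in the knowledge base is invented for illustration.
KB = "http://example.org/kb"
CASE = "https://ontology.caseontology.org/case/case"
UCO_ACTION = "https://ontology.unifiedcyberontology.org/uco/action"

imports = {KB: [CASE], CASE: [UCO_ACTION]}
declares = {
    KB: [("kb", "http://example.org/kb/"),
         ("uco-action", "http://example.org/kb/action/")],
    UCO_ACTION: [("uco-action", UCO_ACTION + "/")],
}

mapping, conflicts = collect_prefixes(KB, imports, declares)
print(conflicts)
```

In this sketch the knowledge base never imports the UCO action ontology directly, yet its own binding of uco-action is still reported as a conflict, because the declaration two imports away is part of the same union.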

The proposed UCO MIME Taxonomy includes SHACL shapes that enact constraints on skos:notation. Interpretations of SHACL severity are based on informal language from the SKOS Reference document. From Section 6.5.2, it appears there should be only one skos:notation on a concept per datatype of the notation's literal, but the descriptive language falls short of making this a "MUST"-level requirement.
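As a rough illustration of the "one skos:notation per datatype" reading, the check below groups a concept's notations by datatype and reports any datatype that carries more than one distinct value. It is a sketch in plain Python, not the SHACL shape from the MIME Taxonomy; the function name, the second notation "action", and the http://example.org/dt/short datatype are hypothetical.

```python
from collections import Counter

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def datatypes_with_multiple_notations(notations):
    """Return the datatype IRIs that carry more than one distinct notation.

    notations: iterable of (lexical form, datatype IRI) pairs for one concept.
    """
    # RDF graphs are sets, so repeated identical literals count once.
    distinct = set(notations)
    counts = Counter(datatype for _lexical, datatype in distinct)
    return sorted(datatype for datatype, n in counts.items() if n > 1)


# One notation per datatype: nothing to report.
single = [("uco-action", XSD_STRING), ("act", "http://example.org/dt/short")]

# Two xsd:string notations on the same concept: xsd:string is reported.
double = [("uco-action", XSD_STRING), ("action", XSD_STRING)]

print(datatypes_with_multiple_notations(single))
print(datatypes_with_multiple_notations(double))
```

Because the SKOS language stops short of a "MUST", a finding from a check like this would more naturally be surfaced as a warning than as a violation.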

Requirements

Requirement 1

CDO ontologies (that is, each owl:Ontology) must each declare a namespace prefix for its usage.

Requirement 2

CDO ontologies must not create sh:PrefixDeclarations for any namespaces over which they do not have authority (i.e. not having domain *.unifiedcyberontology.org, *.caseontology.org, etc.).
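Requirement 2 amounts to a host-name check on each declared namespace. The sketch below shows one way such a check could look; the domain tuple covers only the two domains named above (the "etc." is left open), treating the bare domain as matching the wildcard is an assumption, and the function names are hypothetical.

```python
from urllib.parse import urlsplit

# Assumed authority list for illustration; extend as the "etc." requires.
AUTHORITATIVE_DOMAINS = ("unifiedcyberontology.org", "caseontology.org")


def has_authority(namespace, domains=AUTHORITATIVE_DOMAINS):
    """True if the namespace's host is one of `domains` or a subdomain of one."""
    host = (urlsplit(namespace).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in domains)


def unauthorized_declarations(declarations, domains=AUTHORITATIVE_DOMAINS):
    """Return the (prefix, namespace) pairs that would break Requirement 2."""
    return [(p, ns) for p, ns in declarations if not has_authority(ns, domains)]


# Declarations as they might appear in a CDO ontology.
declared = [
    ("uco-action", "https://ontology.unifiedcyberontology.org/uco/action/"),
    ("owl", "http://www.w3.org/2002/07/owl#"),
]
print(unauthorized_declarations(declared))
```

Matching on the parsed host name rather than on a substring of the IRI keeps a look-alike domain such as notunifiedcyberontology.org from passing the check.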

Risk / Benefit analysis

Benefits

  1. skos:notation is meant to bear only one value per datatype.
     A. However, this is not a "MUST"-level requirement.
  2. sh:PrefixDeclaration stakes an expansive claim to a prefix among downstream adopters of UCO. This expansiveness helps UCO settle the question of whether its prefixes could eschew the "uco-" pre-prefix: No, they shouldn't, because of conflicts with existing ontologies (such as UCO's time vs. OWL-Time).
  3. sh:PrefixDeclaration mimics the effect of chaining imports of JSON-LD context dictionaries.
  4. sh:PrefixDeclaration and skos:notation embrace mechanisms from existing standards, saving on design. Combining the two leads to a stronger UCO review mechanism (illustrated in Competency Question 1.2 below).

Risks

  1. Usage of skos:notation could entail UCO needing a skos:ConceptScheme in uco.ttl. This ConceptScheme would have each UCO-developed ontology as a member.
     A. While this would give the benefit of a SKOS-based uniqueness test, it would be an additional piece of technical debt passed to downstream ontologies (e.g. CASE) that would need their own, largely duplicative, ConceptScheme due to scope of authority.
  2. Adoption of a single SKOS concept does not necessarily entail that we need to import all of SKOS. Instead, the SHACL shape pertaining to skos:notation, currently housed in the UCO MIME Taxonomy, can be brought into UCO without necessitating an owl:imports of all of SKOS.
  3. Per SHACL Spec section 5.2.1, adoption of sh:declare would require any importer of UCO (via owl:imports) to avoid binding any prefix that UCO declares to a different namespace. This led to Requirement 2.

A risk not necessarily scoped to this proposal is that there would now be another review step for owl:imports statements: the OWL transitive closure needs to be reviewed for sh:declare statements that cause conflicting prefixes. Theoretically, SHACL-SHACL (SHACL used to review a SHACL graph) would handle this review on a monolithic ontology build.

Competencies demonstrated

Competency 1

A knowledge base has imported an ontology that imported CASE (which imports UCO). The knowledge base includes these statements:

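@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.org/kb>
    a owl:Ontology ;
    owl:imports <https://ontology.caseontology.org/case/case> ;
    sh:declare [
        sh:prefix "kb" ;
        sh:namespace "http://example.org/kb/"^^xsd:anyURI ;
    ] ;
    skos:notation "kb" ;
    .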

Competency Question 1.1

A user is interested in knowing what ontology prefixes are in the knowledge base.

PREFIX sh: <http://www.w3.org/ns/shacl#>

SELECT DISTINCT ?lOntologyPrefix ?lOntologyNamespace
WHERE {
  ?nPrefixDeclaration
    sh:prefix ?lOntologyPrefix ;
    sh:namespace ?lOntologyNamespace ;
    .
}
ORDER BY ?lOntologyPrefix ?lOntologyNamespace

Result 1.1

With the current state of UCO's develop branch, these would be the results:

?lOntologyPrefix  ?lOntologyNamespace
kb                http://example.org/kb/
owl               http://www.w3.org/2002/07/owl#
sh                http://www.w3.org/ns/shacl#

As a development aid, sh:prefix was used in the OWL SHACL review mechanism. This proposal would nix those shapes.

Competency Question 1.2

A user is interested in knowing what ontology prefixes are defined authoritatively in the knowledge base, by merit of having a skos:notation matching a prefix.

PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT ?lOntologyPrefix ?nOntology ?lOntologyNamespace
WHERE {
  ?nOntology
    skos:notation ?lOntologyPrefix ;
    .
  ?nPrefixDeclaration
    sh:prefix ?lOntologyPrefix ;
    sh:namespace ?lOntologyNamespace ;
    .
}
ORDER BY ?lOntologyPrefix ?nOntology ?lOntologyNamespace

Result 1.2

With the current state of UCO's develop branch, these would be the results:

?lOntologyPrefix  ?nOntology             ?lOntologyNamespace
kb                http://example.org/kb  http://example.org/kb/

(Note also the difference in trailing slash.)

On adoption of this proposal, these could be the results:

?lOntologyPrefix  ?nOntology                                            ?lOntologyNamespace
kb                http://example.org/kb                                 http://example.org/kb/
uco-action        https://ontology.unifiedcyberontology.org/uco/action  https://ontology.unifiedcyberontology.org/uco/action/
uco-co            https://ontology.unifiedcyberontology.org/co          https://ontology.unifiedcyberontology.org/co/
...               ...                                                   ...
uco-owl           https://ontology.unifiedcyberontology.org/owl         https://ontology.unifiedcyberontology.org/owl/

Note that namespaces that do not provide concepts (classes, properties, or datatypes) currently do not seem like they would need a prefix declared. Hence, uco-master would not be given a sh:PrefixDeclaration. uco-co and uco-owl also do not provide concepts, but instead provide only shapes for existing concepts, so it's debatable whether they should be given a sh:PrefixDeclaration.

Solution suggestion

Coordination