ucoProject / UCO

This repository is for development of the Unified Cyber Ontology.
Apache License 2.0

Versioning with versionIRI #437

Closed ajnelson-nist closed 1 year ago

ajnelson-nist commented 2 years ago

Background

An ontology versioning scheme for CASE and UCO was previously devised, but not yet fully deployed, with discussion anchored around Jira ticket ONT-64. The scheme adapts UCO's current unversioned IRI structure for ontology IRIs, e.g.:

https://ontology.unifiedcyberontology.org/uco/action

by including a version string as a suffix:

https://ontology.unifiedcyberontology.org/uco/action/0.9.0

This IRI form is usable in the owl:versionIRI declaration of an ontology.

The scheme also proposes using that versioned IRI as a prefix for concepts within the sub-ontology, e.g.:

https://ontology.unifiedcyberontology.org/uco/action/0.9.0/Action
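As a minimal sketch of the originally drafted scheme, the following Python shows how the three IRI forms above relate to one another. The function names are hypothetical and only illustrate the string construction; they are not part of any UCO tooling.

```python
# Hypothetical helpers illustrating the drafted IRI scheme; not UCO tooling.

ACTION_ONTOLOGY_IRI = "https://ontology.unifiedcyberontology.org/uco/action"


def versioned_ontology_iri(ontology_iri: str, version: str) -> str:
    """Append the version string to the ontology IRI as a final path segment."""
    return ontology_iri.rstrip("/") + "/" + version


def versioned_concept_iri(ontology_iri: str, version: str, local_name: str) -> str:
    """Use the versioned ontology IRI as the prefix for a concept's local name."""
    return versioned_ontology_iri(ontology_iri, version) + "/" + local_name


if __name__ == "__main__":
    print(versioned_ontology_iri(ACTION_ONTOLOGY_IRI, "0.9.0"))
    print(versioned_concept_iri(ACTION_ONTOLOGY_IRI, "0.9.0", "Action"))
```

The first form is the one usable with owl:versionIRI; the second is the versioned concept prefix that the drafted scheme also proposed.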

It was thought that all that would be needed for deployment would be deciding how to handle the versions of individual ontologies (e.g. the action ontology vs. core ontology), especially if one of the ontologies didn't change between UCO releases. There are arguments for either choice.

Since that scheme was originally drafted, a few points have been realized.

  1. The owl:versionIRI mechanism is decoupled from the ontology IRI and owl:versionInfo. Nothing formally ties the versionIRI to either datum.
  2. The owl:versionIRI and owl:ontologyIRI have no formal shared responsibility over concept ownership, nor does a concept using one of the IRIs as a prefix have anything to do with whether the concept would exist with the other. That is, that IRI above ending .../action/0.9.0/Action would not induce the existence of .../action/Action, nor vice versa.
  3. It is possible to review an ontology's import closure for multiple versions of the same ontology, and/or designations of ontologies being incompatible with one another. The OWL 2 Syntax document, Sections 3.1 and 3.3, carries "Should not"-strength enforcement on these potential points of confusion, rather than "Must not."

Especially from points 1 and 2, it now appears that best practice is to recommend that adopters use the unversioned ontology IRI (that is, the subject of the x a owl:Ontology triple) when devising concept prefixes, and import the versioned ontology IRI with owl:imports. This is consistent with many other ontologies. Take for example SKOS, where concepts use the prefix http://www.w3.org/2004/02/skos/core#, but an owl:versionIRI is made especially for OWL (1) DL consumers to import: http://www.w3.org/TR/skos-reference/skos-owl1-dl.rdf.

Separately, experience with drafting the documentation deployment system leads to the conclusion that maintaining independent sub-ontology versions is likely to cause significant confusion. In the UCO documentation, navigating to an IRI from sub-ontology X, from a sub-ontology Y with a later version, would make it impossible to navigate back to Y without resorting to the browser back button.

Requirements

Requirement 1

UCO and CASE should use a versioning mechanism defined by OWL 2 rather than attempt an independent versioning mechanism.

Requirement 2

Version practice for UCO should be that each owl:Ontology of UCO bump its versionInfo to match uco.ttl. Documentation navigation otherwise becomes a maze with non-reversible navigation.

Requirement 2.1

A scope clarification: The "Importer" UCO ontologies that define SHACL shapes for external ontologies should also use owl:versionInfo values matching uco.ttl.

Requirement 3

owl:priorVersion, owl:backwardCompatibleWith, and owl:incompatibleWith declarations should be used when declaring new versions.

Requirement 4

UCO adopters must be able to designate what version of UCO they use. This applies for "TBox"-level adoption (i.e. ontologies defining classes and properties) and "ABox"-level adoption (i.e. knowledge bases defining individuals and facts about individuals).

Requirement 5

core:specVersion should be considered for removal, as it appears to have served the purpose of a stop-gap version identifier.

Risk / Benefit analysis

Benefits

Risks

Competencies demonstrated

Competency 1

An investigation is run in isolation from other graphs, and stored in a knowledge base encoded as an ontology, anchored with this statement:

<urn:example:investigation-bb62979e-b719-4a1b-af36-e7b92be48eb6> 
    a owl:Ontology .

Competency Question 1.1

What version(s) of CASE were used in that investigation?

Result 1.1

PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT DISTINCT ?nCASEVersion
WHERE {
    <urn:example:investigation-bb62979e-b719-4a1b-af36-e7b92be48eb6> 
        owl:imports+/owl:versionIRI? ?nCASEVersion .
    <https://ontology.caseontology.org/case/case>
        owl:versionIRI ?nCASEVersion .
}

Competency 2

A tool is declaring the version of UCO it uses in its output. A user requests the tool generate data for the knowledge base with IRI http://example.org/kb.

Competency Question 2.1

How should the tool emit its target UCO version?

Result 2.1

The tool can emit these triples as a floating statement alongside its other graph output:

<http://example.org/kb>
    a owl:Ontology ;
    owl:imports <https://ontology.unifiedcyberontology.org/uco/uco/0.9.0> .

When the tool's outputs are staged for aggregation into the knowledge base, the owl:imports statement can be used to decide whether the UCO version used is compatible with the whole of the knowledge base.
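One way such a staging check could look is sketched below in Python. It assumes that the owl:imports targets have already been extracted from each graph as plain strings, that UCO versioned IRIs follow the .../uco/uco/<version> spelling shown above, and that "compatible" means the tool imports only UCO versions the knowledge base already imports. That strict policy, and the function names, are assumptions for illustration; OWL itself does not tie the version IRI spelling to a version number.

```python
import re

# Assumed spelling of the UCO root ontology's versioned IRI prefix.
UCO_ROOT_PREFIX = "https://ontology.unifiedcyberontology.org/uco/uco/"
# Assumed version format: three dot-separated integers, e.g. 0.9.0.
VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def uco_versions(import_iris):
    """Collect the UCO version strings named by a graph's owl:imports targets."""
    versions = set()
    for iri in import_iris:
        if iri.startswith(UCO_ROOT_PREFIX):
            suffix = iri[len(UCO_ROOT_PREFIX):]
            if VERSION_RE.fullmatch(suffix):
                versions.add(suffix)
    return versions


def is_compatible(kb_imports, tool_imports):
    """Hypothetical policy: the tool must declare at least one UCO version,
    and every version it declares must already be imported by the knowledge base."""
    tool = uco_versions(tool_imports)
    return bool(tool) and tool <= uco_versions(kb_imports)


if __name__ == "__main__":
    kb = ["https://ontology.unifiedcyberontology.org/uco/uco/0.9.0"]
    print(is_compatible(kb, kb))
    print(is_compatible(kb, ["https://ontology.unifiedcyberontology.org/uco/uco/1.0.0"]))
```

A looser policy could instead consult owl:backwardCompatibleWith declarations between versions; the sketch does not attempt that.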

Solution suggestion

Coordination

ajnelson-nist commented 2 years ago

A draft implementation of versionIRI support has been added, including extensive QC tests and some fixes to the first, untested drafts of versionIRI parsing in OWL.

The result of adding the versionIRI concept is demonstrated by making a "release" of 0.9.1. It is not actually stamped yet, but I request that it be stamped on an approving vote for Issue 437. This is done in this branch (archived on UCO-Archive in case we find reason to scrap the commits for better per-feature documentation):

https://github.com/ucoProject/UCO-Archive/tree/archive/Feature-Issue-437-v1-0.9.1

The new features of 0.9.1 include OWL 2 DL conformant usage of owl:versionIRI, which necessitates each ontology get an owl:ontologyIRI. (The owl:ontologyIRIs turn out to be reflexive.) A new test suite, implemented as SHACL shapes not designed for re-consumption outside of UCO's management process, is applied as part of the CI process.

0.9.1 institutes a mechanical constraint: The owl:versionInfo of UCO's uco.ttl "master"/"root" ontology file has, to date, only held the numeric version string. The QC shapes now use that version string to test that all UCO ontology version IRIs are in sync with one another.
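The sync constraint can be illustrated outside of SHACL with a short Python sketch. This is not the QC shapes themselves; it assumes the version IRIs have been collected into a dictionary keyed by ontology IRI, and that each version IRI is spelled as the ontology IRI plus a slash and the root version string. The function name is hypothetical.

```python
def out_of_sync(root_version_info, version_iris):
    """Return ontology IRIs whose version IRI does not end with the root version.

    root_version_info: the numeric version string from uco.ttl's owl:versionInfo.
    version_iris: mapping of ontology IRI -> owl:versionIRI (both as strings).
    """
    return sorted(
        ontology_iri
        for ontology_iri, version_iri in version_iris.items()
        if version_iri != f"{ontology_iri}/{root_version_info}"
    )


if __name__ == "__main__":
    base = "https://ontology.unifiedcyberontology.org/uco"
    observed = {
        f"{base}/action": f"{base}/action/0.9.1",
        f"{base}/core": f"{base}/core/0.9.0",  # deliberately stale
    }
    print(out_of_sync("0.9.1", observed))
```

An empty result means every ontology's version IRI agrees with the root version string.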

The result of incrementing the versionIRI, as would be done with each release, is demonstrated in this branch:

https://github.com/ucoProject/UCO-Archive/tree/archive/Feature-Issue-437-v1-1.0.0

An addition of owl:priorVersion adds an opportunity to test that we say what kind of "prior" relationship we have - backwards compatible, or backwards incompatible. This is a good workflow point for the OC Chair to evaluate whether new proposals can go into the next SEMVER minor release, or the next major release.
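A minimal Python sketch of that workflow decision follows. It assumes plain MAJOR.MINOR.PATCH version strings and a simple rule where a backwards-compatible change bumps the minor number and an incompatible one bumps the major number. The function name and the rule are illustrative assumptions, not the OC's actual release procedure.

```python
def next_release(prior_version, backward_compatible):
    """Pick the next version string and the OWL property describing the
    relationship to the prior version (hypothetical rule for illustration)."""
    major, minor, _patch = (int(part) for part in prior_version.split("."))
    if backward_compatible:
        # Compatible changes can go into the next minor release.
        return f"{major}.{minor + 1}.0", "owl:backwardCompatibleWith"
    # Incompatible changes wait for the next major release.
    return f"{major + 1}.0.0", "owl:incompatibleWith"


if __name__ == "__main__":
    print(next_release("1.0.0", True))
    print(next_release("1.0.0", False))
```

Whichever property is chosen would be declared alongside owl:priorVersion on the new release's ontology.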

Feedback is encouraged on the patches. There will likely be some more testing done by tomorrow morning's meeting, but the major design features I'd planned are in place.

ajnelson-nist commented 2 years ago

I will likely get to testing this feature for CASE tomorrow after the OCs meeting.

ajnelson-nist commented 1 year ago

@eoghanscasey, @sbarnum: I just used GitHub to request reviews from you on the six PRs remaining unchecked in the checklist. Yes, six, because this is handling two Git-flow-esque "Release" merges across UCO and CASE (for their 0.9.1 and 0.7.1, respectively), as well as implementing owl:priorVersion as part of the 1.0.0 releases for UCO and CASE.

Several tests were added to confirm that the usage of OWL ontology versioning features is (1) conformant with OWL in general, which mostly goes no stronger than "SHOULD"-level statements, and (2) conformant with more stringent CDO-scoped requirements for the sake of versioning quality control. Note that the CDO-specific shapes use urn:example:... namespaces as one way of indicating they are not meant for use outside of CDO review.

The tests are currently implemented to fail CI if anything even triggers a warning. This is not a practice I intend to defend; it's more of a quality canary for our adoption of future ontologies or other OWL mechanics. I'm happy for --allow-warnings to be restored on the pyshacl invocation when we find basically any use case.

You might also note a few "Editorial"-level changes were implemented, mainly IIRC about cutting extra owl:imports that were not necessary because of the transitive closure of what was already being imported.

In response to a bit of feedback I got this morning, inquiring why all of the versions of all UCO ontologies in this repository are locked to one value, the uco.ttl versionInfo: Our documentation engine cannot support both a total presentation of UCO as well as independent subontology versions that are imported by multiple "Future" versions. Also, in OWL, only an ontology gets a version. Individual concepts don't. And as I mentioned early on in this Issue posting, the IRI spelling of owl:versionIRI has no mechanical tie to its corresponding ontology's spelling, so no OWL-standard mechanism exists to support versioning of an individual concept except for housing that concept in its own standalone ontology. I suggest not going there.

With all that said, feedback is welcome, but I intend to move fast on cutting the release.

ajnelson-nist commented 1 year ago

Clarifications: