Disclaimer
Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.
Background
In the 2023-04-18 Ontology Committees meeting, the OCs discussed Issue 534, which in brief is about whether three observable:ObservableObject subclasses that are currently unrelated to one another could be used together to represent downloading a file from a URL with an expectation of certain hashes being computable.
One of the points that came out of the discussion was a general agreement that observable:File and observable:URL should be disjoint classes.
No commentary was made on how observable:ContentData relates or doesn't relate to either of those classes.
Nor was there a suggestion of whether some superclass of observable:File or observable:URL would be a more appropriate disjointedness target. However, the belief is that this specific disjointedness designation would be compatible with future modeling refinements.
Requirements
Requirement 1
UCO must prevent a user from designating a node as both an observable:File and observable:URL.
Risk / Benefit analysis
Benefits
This aligns with Ontology Committee members' intuitions.
Risks
New disjointedness designations would need to be added as SHACL shapes with sh:Warning severity for UCO 1.x.0, and could be designated sh:Violation severity only in a future major release. This is believed low-risk, as the practice is being exercised in other proposals currently.
This restriction on typing does nothing to resolve whether it is appropriate to continue duck-typing an individual node as both file-like and URL-like by giving the node an observable:FileFacet and an observable:URLFacet. UCO Facets still permit this, and no policy in English, OWL, or SHACL disallows it.
This proposal sidesteps the original question of how to associate "Expected" hashes with a URL that is expected to provide a file.
This restriction lacks a stated modeling rationale beyond the OCs' intuition. The discussion in the meeting included asides like "A URL is more an address, or locator, which a file isn't." While this aligns with intuition, for reasons unclear to the proposer, observable:URL is not currently a subclass of observable:Address. Was this an oversight? If so, is it appropriate to add these statements to UCO: observable:Address owl:disjointWith observable:File . and observable:URL rdfs:subClassOf observable:Address .?
Competencies demonstrated
Competency 1
A user is trying to represent a downloadable file. (This is compiled and excerpted from the same example data in #534.)
Competency Question 1.1
Is this conformant UCO data? Should it be?
Result 1.1
In UCO 1.2.0, yes, this is conformant; but per this proposal, no, it should not be, because the URL should not be considered to be a file. This situation can be flagged by adding this constraint to observable:File:
(That constraint would work, but in an oversimplified manner; the solution description section provides a fuller implementation and rationale.)
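For illustration only, the kind of node this proposal is meant to flag could be sketched as below. This is not the example data from #534: the kb: namespace and the node name are hypothetical, and only the double typing matters here.

```turtle
@prefix kb: <http://example.org/kb/> .
@prefix observable: <https://ontology.unifiedcyberontology.org/uco/observable/> .

# Hypothetical node, named for illustration only.
# It is asserted to be a member of both classes at once,
# which is the condition Requirement 1 asks UCO to prevent.
kb:hypothetical-download-1
    a
        observable:File ,
        observable:URL
        ;
    .
```

Under UCO 1.2.0 nothing objects to this typing; with the proposed disjointedness in place, a validator would be expected to report on this node.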
Competency Question 1.2
Before any download action takes place from that files.pythonhosted.org URL, what is the association between the hash daf617d... and the URL https://files.pythonhosted.org/packages/d4/f9/28260b...?
Result 1.2
The answer to this question is out of scope of this proposal.
Suggestions are welcome, but likely need to be part of future proposal(s). The proposer has in mind a potential solution based on Qualities that might also be of interest to the Adversary Engagement Ontology.
Solution suggestion
First, designate with OWL that observable:File and observable:URL are disjoint by adding this one triple:
observable:File owl:disjointWith observable:URL .
Then, add a new shape specialized to the pairwise disjointedness of observable:File and observable:URL:
observable:File-disjointWith-URL-shape
    a sh:NodeShape ;
    sh:message "observable:File and observable:URL are disjoint classes."@en ;
    sh:not [
        a sh:NodeShape ;
        sh:class observable:URL ;
    ] ;
    sh:targetClass observable:File ;
    .
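The Risks section notes that new disjointedness shapes would carry sh:Warning severity during UCO 1.x. As a sketch of how the shape above might look with that severity stated explicitly (not necessarily the form that would be committed), the fragment below adds one sh:severity statement. The prefix declarations are included only so that the fragment parses without other context.

```turtle
@prefix observable: <https://ontology.unifiedcyberontology.org/uco/observable/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

observable:File-disjointWith-URL-shape
    a sh:NodeShape ;
    sh:message "observable:File and observable:URL are disjoint classes."@en ;
    # A node that is a File must NOT also conform to "is a URL".
    sh:not [
        a sh:NodeShape ;
        sh:class observable:URL ;
    ] ;
    # Reported as a warning rather than a violation during UCO 1.x.
    sh:severity sh:Warning ;
    sh:targetClass observable:File ;
    .
```

If sh:severity is omitted, SHACL treats the shape's results as sh:Violation, so the explicit statement is what keeps the change backwards-compatible in the sense described under Risks.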
Solution discussion
There are two reasons for adding a shape specialized to the pair: (1) shape performance, and (2) deprecation management.
First, on shape performance: It is possible to use a general-purpose "Find all disjoint-set members" SPARQL query that would work across all OWL usage. One has been used in CASE-Corpora for some months, and it has helped find modeling errors while requiring only a single owl:disjointWith statement to be added to an ontology. However, using that shape requires some degree of inferencing (i.e., graph expansion), either RDFS- or OWL-based. Further, it relies on the performance of a SPARQL engine.
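To make the trade-off concrete, a general-purpose check of this kind could be sketched as a SHACL-SPARQL shape like the one below. This is not the CASE-Corpora shape; the ex: namespace and shape name are hypothetical, and the query is a minimal assumption of what such a check needs. It only sees types that are directly asserted in the data graph, which is why subclass membership would have to be materialized by inferencing before validation.

```turtle
@prefix ex: <http://example.org/shapes/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

# Hypothetical general-purpose shape: applies to every typed node.
ex:disjointWith-shape
    a sh:NodeShape ;
    sh:sparql [
        a sh:SPARQLConstraint ;
        sh:message "Node is typed with two classes that are declared disjoint."@en ;
        # Full IRIs are used so no prefix declarations are needed.
        # Both directions of owl:disjointWith are checked, because
        # the symmetric statement may not have been inferred.
        sh:select """
            SELECT $this ?value
            WHERE {
                $this a ?nClass1 .
                $this a ?value .
                {
                    ?nClass1 <http://www.w3.org/2002/07/owl#disjointWith> ?value .
                }
                UNION
                {
                    ?value <http://www.w3.org/2002/07/owl#disjointWith> ?nClass1 .
                }
            }
        """ ;
    ] ;
    sh:targetSubjectsOf rdf:type ;
    .
```

Because the shape targets every subject of rdf:type and joins against the ontology's owl:disjointWith statements, its cost grows with the whole graph, whereas the pair-specific shape only visits instances of observable:File.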
Second, on deprecation management: Recently, CDO shapes repositories have begun to explore potential concurrent usage of other ontologies with UCO. The Friend-of-a-Friend shapes repository, used in the UCO FOAF Profile, handles these disjointedness statements, which are all of the disjointWith occurrences in FOAF:
Note that not all the classes mentioned are disjoint with all of the other classes. For instance, it is conformant with FOAF to have a node that is both a foaf:Organization and foaf:Project, despite both those classes being disjoint with foaf:Document.
An initial draft of the shape to represent Documents being disjoint with Organizations and Projects looked like this:
sh-foaf:Document-disjointedness-shape
    a sh:NodeShape ;
    sh:message "foaf:Document is a disjoint class with foaf:Organization and foaf:Project."@en ;
    sh:not [
        a sh:NodeShape ;
        sh:or (
            [
                a sh:NodeShape ;
                sh:class foaf:Organization ;
            ]
            [
                a sh:NodeShape ;
                sh:class foaf:Project ;
            ]
        ) ;
    ] ;
    sh:targetClass foaf:Document ;
    .
(The nested sh:or is because SHACL requires that a single sh:NodeShape not have two values of sh:not.)
Instead, these shapes were implemented, copied here:
sh-foaf:Document-disjointWith-Organization-shape
    a sh:NodeShape ;
    sh:message "foaf:Document and foaf:Organization are disjoint classes."@en ;
    sh:not [
        a sh:NodeShape ;
        sh:class foaf:Organization ;
    ] ;
    sh:targetClass foaf:Document ;
    .
sh-foaf:Document-disjointWith-Project-shape
    a sh:NodeShape ;
    sh:message "foaf:Document and foaf:Project are disjoint classes."@en ;
    sh:not [
        a sh:NodeShape ;
        sh:class foaf:Project ;
    ] ;
    sh:targetClass foaf:Document ;
    .
The reasons were:
The shape with sh:or ends up obscuring which of the classes, Organization or Project, triggered the violation. An sh:message cannot be fruitfully embedded deeper in the sh:or tree. That is, the deeper message does not display in the SHACL validation report. (This was, at least, the proposer's experience using pyshacl. A functioning demonstration rebutting this belief of SHACL incapability is welcome.) So for specificity, the specialized shapes were implemented.
Related to not being able to nest sh:message: Piling the sh:not into a general shape targeting the class (such as UCO does) would leave disjointedness violations having a description message that is basically a repetition of the Turtle-encoded SHACL. It is likely to be a better user experience to provide a short natural-language sentence, especially versus a sh:not, around a sh:or, around several shapes describing classes and possibly complements of classes.
The specialized shapes also enable documenting, at an IRI, the deprecation of a pair's disjointedness (i.e., making it acceptable again for a node to be a member of both classes). With the sh:or style, the deprecation could be documented with an rdfs:comment, but nothing would require that comment to be recorded. A specialized shape would at least leave its IRI in place (per UCO policy on retaining IRIs), and the shape could explicitly do nothing. Which style is "better" is a fair debate, but the proposer believes separate IRIs are more compatible with UCO policy and historic record-keeping.
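As one possible reading of a shape that stays at its IRI but "explicitly does nothing", the sketch below deactivates the pair-specific shape with the standard SHACL property sh:deactivated and records the reason in an rdfs:comment. The proposal does not specify this mechanism, and the comment text is hypothetical.

```turtle
@prefix observable: <https://ontology.unifiedcyberontology.org/uco/observable/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

observable:File-disjointWith-URL-shape
    a sh:NodeShape ;
    # Hypothetical record of why the shape no longer applies.
    rdfs:comment "Deprecated: observable:File and observable:URL are no longer treated as disjoint."@en ;
    # A deactivated shape is skipped by SHACL processors,
    # so the IRI remains but produces no validation results.
    sh:deactivated true ;
    sh:message "observable:File and observable:URL are disjoint classes."@en ;
    sh:not [
        a sh:NodeShape ;
        sh:class observable:URL ;
    ] ;
    sh:targetClass observable:File ;
    .
```

With the combined sh:or style, withdrawing one pair would instead mean editing a list inside a shared shape, leaving no per-pair IRI to carry the record.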
In summary, these will be added:
The triple observable:File owl:disjointWith observable:URL .
The shape observable:File-disjointWith-URL-shape.
This shape will be at sh:severity sh:Warning level until UCO 2.0.0, unless requested for further delay.
Coordination
[...] develop for the next release
develop state with backwards-compatible implementation tracked by CASE develop branch (for prerelease delivery on CASE website)
develop state with backwards-compatible implementation merged into develop-2.0.0
[...] develop-2.0.0
develop-2.0.0 state with backwards-incompatible implementation tracked by CASE develop-2.0.0 branch (for prerelease delivery on CASE website)