ucoProject / UCO

This repository is for development of the Unified Cyber Ontology.
Apache License 2.0

EmailMessageFacet should allow 0 values for isMimeEncoded and isMultipart #383

Closed ajnelson-nist closed 2 years ago

ajnelson-nist commented 2 years ago

Background

Issue 380 included a few examples of reasons to generate "Stub" EmailMessage graph objects as candidate objects for constructing reply chains. Anyone who tried to validate those examples would get sh:Violation results from SHACL, reporting that the objects lack two properties: observable:isMimeEncoded and observable:isMultipart.

It appears that there was a logic error when UCO 0.7.0 did its conversion from OWL to SHACL. Those two properties are rightly described in OWL as having a cardinality of exactly 1: an email instance has exactly one value for isMimeEncoded. That value just might not yet be recorded in a graph. From OWL's perspective, it is fine for a graph to record 0 values for isMimeEncoded, or 1, just not 2. (And likewise for isMultipart.)

The SHACL conversion likely took this "Exactly 1" cardinality and turned it into a too-stringent data requirement: a graph must record a value for isMimeEncoded. (And likewise for isMultipart.)

This becomes a problem when an EmailMessage has to be generated as a placeholder for whatever reason: SHACL validation fails unless some value is asserted for both of those properties. This is not a correct requirement for UCO to put on applications, because those values are not inherently knowable without the email artifact in hand.
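The failure mode described above can be sketched in plain Python. This is not the pySHACL API and not UCO's actual shape file: `check_counts`, the dictionary layout, and the sample message ID are hypothetical stand-ins that mimic only the value counting done by sh:minCount and sh:maxCount.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for SHACL property shapes: property -> (minCount, maxCount).
# Only the counting behaviour of sh:minCount / sh:maxCount is modelled here.
Shape = Dict[str, Tuple[int, int]]

# Constraints as described in this issue (exactly one value required).
STRICT_SHAPE: Shape = {
    "observable:isMimeEncoded": (1, 1),
    "observable:isMultipart": (1, 1),
}
# Constraints as proposed (zero or one value permitted).
RELAXED_SHAPE: Shape = {
    "observable:isMimeEncoded": (0, 1),
    "observable:isMultipart": (0, 1),
}


def check_counts(facet: Dict[str, List[object]], shape: Shape) -> List[str]:
    """Return one message per property whose recorded value count is out of range."""
    violations = []
    for prop, (min_count, max_count) in shape.items():
        n = len(facet.get(prop, []))
        if n < min_count:
            violations.append(f"{prop}: fewer than {min_count} values")
        elif n > max_count:
            violations.append(f"{prop}: more than {max_count} values")
    return violations


# A "stub" facet that knows only a message ID (the ID value is made up).
stub_facet = {"observable:messageID": ["<stub-1@example.org>"]}

strict_result = check_counts(stub_facet, STRICT_SHAPE)
relaxed_result = check_counts(stub_facet, RELAXED_SHAPE)
print(len(strict_result), len(relaxed_result))
```

Under the strict shape the stub is reported once per missing property; under the relaxed shape it passes, while a facet recording two values for either property would still be rejected.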

Requirements

Requirement 1

Requirement 2

Risk / Benefit analysis

Benefits

Requirements 1 and 2 fit with a general usage pattern that it should be acceptable to define certain Facets even if that Facet houses no properties. This usage pattern applies in the cases where a "Stub" object needs to be defined, and for whatever reason the Facet might need to be defined (e.g. to store a graph identifier for downstream processes).

No other properties in EmailMessageFacet have a sh:minCount, so it seems especially odd for these two to have a non-0 minimum.

Risks

The submitter believes there are no risks associated with this change.

Competencies demonstrated

Competency 1

Issue 380 demonstrated a use case for generating EmailMessage objects with only one in-Facet property known, deriving an object from an In-Reply-To header.

Competency Question 1.1

Can an EmailMessage be defined housing only messageID?

Result 1.1

With this proposal, yes.

Solution suggestion

The solution is provided in PR 384.

Coordination

sbarnum commented 2 years ago

I have no objections to removing the min constraint on these two properties and making them optional.

That being said, I believe the explanation and justification in the CP should be corrected and clarified. This is a requested change to the cardinality of these two properties to support newly identified use cases. The current SHACL constraints are not an OWL->SHACL conversion error between 0.6.0 and 0.7.0. In 0.6.0, the constraints on these properties utilized a qualifiedCardinality of 1. This means that there must be exactly one instance of each property on an EmailMessageFacet class. This is semantically identical to the current SHACL constraints of minCount=1 and maxCount=1.

ajnelson-nist commented 2 years ago

> In 0.6.0, the constraints on these properties utilized a qualifiedCardinality of 1. This means that there must be exactly one instance of each property on an EmailMessageFacet class. This is semantically identical to the current SHACL constraints of minCount=1 and maxCount=1.

@sbarnum , I believe you are incorrect on this point.

Cardinality means something different in OWL versus in SHACL.

Cardinality 1 in OWL means there exists, uniquely, one value for the property. It does not mean that value must be specified in the data; it might be deducible by an inferencing engine. Also, if there are two values, cardinality 1 means those two values are further interpreted to be the same. (This tends to make sense only for object references, though there might be a weird case where this can happen with an RDF Literal value.)

Cardinality 1 in SHACL means there must be exactly one value, and if your data doesn't have exactly one recorded, you are in violation of a shape constraint.

So, cardinality of 1 in OWL translates to permitting 0 or 1 values in SHACL.
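The distinction drawn in this comment can be tabulated with a small sketch. The function names and result strings are hypothetical, no reasoner or SHACL engine is involved, and the OWL column is a simplified reading that assumes literal-valued properties whose recorded values are all distinct.

```python
def owl_exact_one(recorded: int) -> str:
    """Simplified open-world reading of an OWL 'exactly 1' restriction.

    Assumes a literal-valued property whose recorded values are all distinct.
    """
    if recorded == 0:
        return "consistent (value exists but is not recorded)"
    if recorded == 1:
        return "consistent"
    return "inconsistent (distinct literals cannot be merged)"


def shacl_count(recorded: int, min_count: int, max_count: int) -> str:
    """Closed-world reading: SHACL counts only the values present in the data graph."""
    return "conforms" if min_count <= recorded <= max_count else "violation"


# Columns: recorded values | OWL exactly 1 | SHACL 1..1 | SHACL 0..1
for n in (0, 1, 2):
    print(n, "|", owl_exact_one(n), "|", shacl_count(n, 1, 1), "|", shacl_count(n, 0, 1))
```

In this simplified model, the SHACL 0..1 column agrees with the OWL column for every count, whereas the 1..1 column disagrees at zero recorded values.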