casework / CASE

Cyber-investigation Analysis Standard Expression (CASE) Ontology
https://caseontology.org
Apache License 2.0

`InvestigativeAction`s should be required to produce at least one `ProvenanceRecord` #146

Open ajnelson-nist opened 10 months ago

Background

Discussion on CASE Issue 136 suggests that an InvestigativeAction should always result in the creation of at least one ProvenanceRecord.

Requirements

Requirement 1

CASE should enforce that an InvestigativeAction results in at least one ProvenanceRecord.

As an implementation note, this would be done with a qualified SHACL constraint.

Edited 2024-02-15: "Must" relaxed to "should".

Requirement 2

CASE should describe in a mechanically discoverable way that an InvestigativeAction is expected to always result in at least one ProvenanceRecord.

As an implementation note, this would be done with a qualified minimum cardinality in an OWL Restriction.

Risk / Benefit analysis

Benefits

  1. Requiring that a ProvenanceRecord always be generated ties the resultant objects of InvestigativeActions into a chain of custody during forensic processing.
  2. Reintroduction of OWL constructs will assist with OWL-specific review mechanisms that do not appear to be possible in SHACL, such as set-satisfiability (e.g. determining through set-theoretic analysis whether a class or restriction has accidentally ended up equating to the empty set, rendering usage conformant with the specification impossible).
    1. This is acknowledged to be a broader issue than this one proposal. However, a minimum cardinality restriction appears to the submitter to be a "safe" reintroduction in terms of complexity.

Risks

  1. Existing SHACL shapes require that a ProvenanceRecord always have one member UcoObject. Thus, this proposal would induce a significant requirement on InvestigativeActions: they must always result in something aside from the ProvenanceRecord.
    1. Note that an object being a result of an action does not necessarily imply that the object was created by the action. This stemmed from discussion on UCO Issue 558.
    2. It is possible the definition of ProvenanceRecord is too stringent. It is somewhat a separate concern that there might exist a class of InvestigativeActions that truly have no results. Perhaps: "This action found all files within this directory. There were none."
    3. NOTE: Risk 1 mitigated with resolution of UCO Issue 599. ProvenanceRecords may now be empty.
  2. Some Actions might be defined in a manner that attempts to restrict the results to a specific class, e.g., IP addresses. If such an action-class were introduced, it could never be an InvestigativeAction, because an InvestigativeAction would be required to include a ProvenanceRecord among its results. Hence, this proposal would end up inducing an upstream design constraint on UCO: action:result can never be constrained with owl:allValuesFrom, because UCO doesn't "know" about case-investigation:ProvenanceRecord.
  3. This proposal does not specify whether there must only be one ProvenanceRecord among the results. The discussion on CASE Issue 136 left this point unresolved, and the answer could depend on whether the committee decides a subaction's ProvenanceRecord should also be recorded in the parent action's results.
  4. This proposal suggests restoring OWL practices, starting with a description of at least one of the outputs for any InvestigativeAction. CASE and UCO previously abandoned OWL in UCO 0.7.0 / CASE 0.5.0. This proposal starts a disciplined reintroduction of OWL constructs, testing with the UCO-OWL syntax review mechanisms.
    1. UCO Change Proposal 23 housed discussion, though it appears that document was not exported from the access-controlled UCO Confluence space. (I don't think there is a reason it wasn't, aside from document exports only becoming a mandated part of the proposal process in later releases.)
    2. A test focused on the syntax used will be added in a separate proposal to UCO.
  5. Because this proposal needs SHACL qualified value shapes, the CASE testing infrastructure also needs to require pySHACL >= 0.24.0, which incorporates a resolution to pySHACL Issue 213.
  6. (Added 2024-02-15.) In information sharing situations, some data might be restricted from being shared or alluded to, e.g., from legally imposed redactions. If Org1 shares part of a graph with Org2, and includes some InvestigativeAction for, say, its timing and tool-use relevance, but doesn't share the identifier for the generated ProvenanceRecord, the shared data should by itself still be conformant to UCO, and should not impose UCO validation errors when folded into the receiving organization's knowledge base.
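
To make Risk 2 concrete, here is a rough sketch of the kind of class design that would conflict with this proposal. Everything in the `ex:` namespace is hypothetical and exists only for illustration, as is the `owl:disjointWith` axiom; a reasoner would only report the last class as unsatisfiable if the result class is known to be disjoint from ProvenanceRecord. The namespace IRIs are the ones I understand CASE and UCO to publish.

```turtle
@prefix ex: <http://example.org/ontology/> .
@prefix investigation: <https://ontology.caseontology.org/case/investigation/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix uco-action: <https://ontology.unifiedcyberontology.org/uco/action/> .

# Hypothetical action class whose results may only be IP addresses.
ex:IPAddressLookupAction
    a owl:Class ;
    rdfs:subClassOf
        uco-action:Action ,
        [
            a owl:Restriction ;
            owl:onProperty uco-action:result ;
            owl:allValuesFrom ex:IPAddress ;
        ] ;
    .

# Assumption for illustration: nothing is both an IP address and a
# ProvenanceRecord.
ex:IPAddress
    a owl:Class ;
    owl:disjointWith investigation:ProvenanceRecord ;
    .

# With the proposed minimum qualified cardinality on InvestigativeAction,
# a member of this class would need a ProvenanceRecord among its results,
# yet every result must be an ex:IPAddress.  The class can have no members.
ex:InvestigativeIPAddressLookupAction
    a owl:Class ;
    rdfs:subClassOf
        ex:IPAddressLookupAction ,
        investigation:InvestigativeAction ;
    .
```

This is also the sort of accidental empty-set class that the OWL-based satisfiability review mentioned under Benefits is meant to catch.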

Competencies demonstrated

Competencies are omitted from this proposal, as the effects are new restrictions on data, and hence do not enable new expressive abilities.

Solution suggestion

For CASE 1.x.0, add the following to investigation.ttl:

```turtle
investigation:InvestigativeAction
    # OWL: mechanically discoverable expectation (Requirement 2).
    rdfs:subClassOf [
        a owl:Restriction ;
        owl:onProperty uco-action:result ;
        owl:onClass investigation:ProvenanceRecord ;
        owl:minQualifiedCardinality "1"^^xsd:nonNegativeInteger ;
    ] ;
    # SHACL: validation-time check (Requirement 1), reported as a warning in 1.x.
    sh:property [
        sh:message "An InvestigativeAction should have a ProvenanceRecord among its results.  This will be a requirement in CASE 2.0.0."@en ;
        sh:path uco-action:result ;
        sh:qualifiedMinCount "1"^^xsd:integer ;
        sh:qualifiedValueShape [
            a sh:NodeShape ;
            sh:class investigation:ProvenanceRecord ;
        ] ;
        sh:severity sh:Warning ;
    ] ;
    .
```
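
As a minimal sketch of how the shape above would treat instance data, consider the graph below. The `kb:` namespace and all node names are made up for illustration. The empty ProvenanceRecord relies on the resolution of UCO Issue 599 noted under Risk 1, and other shapes in CASE or UCO may impose further requirements that this sketch ignores.

```turtle
@prefix investigation: <https://ontology.caseontology.org/case/investigation/> .
@prefix kb: <http://example.org/kb/> .
@prefix uco-action: <https://ontology.unifiedcyberontology.org/uco/action/> .

kb:provenance-record-1
    a investigation:ProvenanceRecord ;
    .

# Has a ProvenanceRecord among its results, so the qualified shape is
# satisfied.
kb:investigative-action-1
    a investigation:InvestigativeAction ;
    uco-action:result kb:provenance-record-1 ;
    .

# No results at all: expected to draw the sh:Warning under CASE 1.x.0.
# This is also what the redacted-sharing situation in Risk 6 would look like.
kb:investigative-action-2
    a investigation:InvestigativeAction ;
    .
```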

For CASE 2.0.0, remove the sh:message and sh:severity triples from the added sh:PropertyShape.
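
For reference, a sketch of how the SHACL portion would read after that removal, assuming nothing else about the shape changes. The OWL restriction is unaffected and is omitted here. With no sh:severity given, SHACL's default severity of sh:Violation applies, which is what turns the warning into a requirement.

```turtle
investigation:InvestigativeAction
    sh:property [
        sh:path uco-action:result ;
        sh:qualifiedMinCount "1"^^xsd:integer ;
        sh:qualifiedValueShape [
            a sh:NodeShape ;
            sh:class investigation:ProvenanceRecord ;
        ] ;
    ] ;
    .
```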

Coordination