ucoProject / UCO

This repository is for development of the Unified Cyber Ontology.
Apache License 2.0

Add time properties to PDF File Facet #421

Closed: ajnelson-nist closed this issue 2 years ago

ajnelson-nist commented 2 years ago

Background

UCO has the ability to represent certain timestamps based on sources like file systems or other metadata fields (e.g. from email). (See #420.) PDFs contain their own specialized timestamp fields, but UCO has no way of representing them yet.

The PDF specification provides for these timestamps:

Requirements

Requirement 1

UCO must be able to represent the internally embedded creation time of a PDF file.

Requirement 2

UCO must be able to represent the internally embedded modification time of a PDF file.

Requirement 3

UCO must be able to represent the internally embedded access time of a PDF file.

Risk / Benefit analysis

Benefits

UCO would improve its timeline capabilities by representing these times.

Risks

There is a naming-pattern discrepancy between UCO's current observable:modifiedTime and observable:createdTime and the names used in the PDF specification. Either one naming pattern must be selected for UCO, or the new properties should defer to the names in the PDF specification.

Otherwise, this is an additive proposal.

Competencies demonstrated

Competency 1

Suppose a PDF is found to have both an embedded creation timestamp and an embedded modification timestamp.

Competency Question 1.1

Do the creation and modification times follow a sensible temporal ordering, that is, does modification occur at or after creation?

Ask for all PDF files in our knowledge base that have a creation time that comes definitively after the modification time.

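# The prefix IRIs below are assumptions added so the query can run on its own:
# the uco-* IRIs follow the UCO 1.x namespace pattern, and drafting: is a
# placeholder namespace for the properties proposed in this issue.
PREFIX drafting: <http://example.org/ontology/drafting/>
PREFIX uco-core: <https://ontology.unifiedcyberontology.org/uco/core/>
PREFIX uco-observable: <https://ontology.unifiedcyberontology.org/uco/observable/>

SELECT ?nPdfFile
WHERE {
  ?nPdfFile
    a uco-observable:PDFFile ;
    uco-core:hasFacet ?nPdfFileFacet ;
    .

  ?nPdfFileFacet
    a uco-observable:PDFFileFacet ;
    drafting:pdfCreatedDate ?lCrDate ;
    drafting:pdfModDate ?lModDate ;
    .

  # Keep only files whose creation time is strictly later than their modification time.
  FILTER ( ?lCrDate > ?lModDate )
}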
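
For readers who want to try the same ordering check outside a triple store, here is a minimal Python sketch that works directly on raw PDF date strings. It assumes the date layout from ISO 32000-1, section 7.9.4 (D:YYYYMMDDHHmmSSOHH'mm'). The function names and the sample file names are hypothetical and not part of UCO. Treating a pair where only one value carries a UTC offset as "not definitive" is an assumption of this sketch, not something the proposal specifies.

```python
import re
from datetime import datetime, timedelta, timezone

# PDF date layout (ISO 32000-1, section 7.9.4): D:YYYYMMDDHHmmSSOHH'mm'
# Fields after the year are optional; this sketch requires the "D:" prefix.
_PDF_DATE = re.compile(
    r"^D:(?P<year>\d{4})(?P<month>\d{2})?(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?(?P<minute>\d{2})?(?P<second>\d{2})?"
    r"(?:(?P<sign>[+\-Z])(?:(?P<off_h>\d{2})'?(?:(?P<off_m>\d{2})'?)?)?)?$"
)


def parse_pdf_date(text: str) -> datetime:
    """Parse a PDF date string; the result is timezone-aware only if an offset is given."""
    match = _PDF_DATE.match(text.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {text!r}")
    f = match.groupdict()
    tz = None
    if f["sign"] == "Z":
        tz = timezone.utc
    elif f["sign"] in ("+", "-"):
        offset = timedelta(hours=int(f["off_h"] or 0), minutes=int(f["off_m"] or 0))
        tz = timezone(offset if f["sign"] == "+" else -offset)
    # Missing month and day default to 1; missing time fields default to 0.
    return datetime(
        int(f["year"]), int(f["month"] or 1), int(f["day"] or 1),
        int(f["hour"] or 0), int(f["minute"] or 0), int(f["second"] or 0),
        tzinfo=tz,
    )


def creation_after_modification(creation: str, modification: str) -> bool:
    """Return True only when creation definitively comes after modification."""
    created = parse_pdf_date(creation)
    modified = parse_pdf_date(modification)
    # Assumption: if only one value carries a UTC offset, the ordering is not definitive.
    if (created.tzinfo is None) != (modified.tzinfo is None):
        return False
    return created > modified


# Hypothetical records: file name -> (creation date, modification date).
records = {
    "ordinary.pdf": ("D:20220501100000Z", "D:20220502100000Z"),
    "suspicious.pdf": ("D:20220502000000Z", "D:20220501000000Z"),
}
flagged = [name for name, (c, m) in records.items() if creation_after_modification(c, m)]
```

Here `flagged` plays the role of the `?nPdfFile` result set: it collects the files whose embedded creation time is later than their modification time.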

Result 1.1

Solution suggestion

To the observable:PDFFileFacet

Coordination

ajnelson-nist commented 2 years ago

When @eoghanscasey and I discussed how to name these two properties, we agreed to use the names found in the PDF specification. He showed me PDF v1.7.

Per that specification, the name we should use is pdfCreationDate, not pdfCreatedDate. See e.g. Table 317 of this document:

https://opensource.adobe.com/dc-acrobat-sdk-docs/standards/pdfstandards/pdf/PDF32000_2008.pdf

found via the Library of Congress format page for PDF, Version 1.7, ISO 32000-1:2008.

CreationDate appears in the PDF specification, not CreatedDate. The implementation of the UCO property will use CreationDate.
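
The naming decision can be illustrated with a small Python sketch: the property's local name is formed by prepending "pdf" to the key exactly as the PDF specification spells it. The helper name `property_local_name` and the constant are hypothetical, and this only illustrates the convention discussed in this thread; it is not UCO's implementation.

```python
# Date-valued keys of the PDF document information dictionary
# (ISO 32000-1:2008, Table 317).
PDF_INFO_DATE_KEYS = ("CreationDate", "ModDate")


def property_local_name(info_key: str, prefix: str = "pdf") -> str:
    """Prepend a prefix to the specification's key, leaving the key's spelling unchanged."""
    return prefix + info_key


# Deferring to the specification's spelling gives pdfCreationDate, not pdfCreatedDate.
names = [property_local_name(key) for key in PDF_INFO_DATE_KEYS]
```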