ucoProject / UCO

This repository is for development of the Unified Cyber Ontology.
Apache License 2.0

Add JSON-LD context for serialized UCO content #423

Open sbarnum opened 1 year ago

sbarnum commented 1 year ago

Background

The chosen default serialization for UCO content is json-ld (see the json-ld spec).

json-ld is an officially supported serialization within the RDF ecosystem and is losslessly transformable by rdf tools to other rdf serializations.

The default fully expanded form of json-ld is referred to as "expanded" and contains full IRIs for all class types and properties. Throughout this CP, the "Device" example from the CASE examples repo is used for illustration (see the CASE Device example).

In expanded json-ld form this example would look like this:

{
    "@graph": [
        {
            "@id": "http://example.org/kb/organization-c240cf37-0556-439b-9a51-1ca41732010d",
            "@type": "https://ontology.unifiedcyberontology.org/uco/identity/Organization",
            "https://ontology.unifiedcyberontology.org/uco/core/name": "Dell"
        },
        {
            "@id": "http://example.org/kb/organization-cc0e0667-eadf-4b2e-9618-3f62b1bdae26",
            "@type": "https://ontology.unifiedcyberontology.org/uco/identity/Organization",
            "https://ontology.unifiedcyberontology.org/uco/core/name": "Microsoft"
        },
        {
            "@id": "http://example.org/kb/forensic_lab_computer1-uuid",
            "@type": "https://ontology.unifiedcyberontology.org/uco/observable/Device",
            "http://example.org/local#location": {
                "@id": "http://example.org/kb/forensic_lab1-uuid"
            },
            "https://ontology.unifiedcyberontology.org/uco/core/hasFacet": [
                {
                    "@type": "https://ontology.unifiedcyberontology.org/uco/observable/DeviceFacet",
                    "https://ontology.unifiedcyberontology.org/uco/observable/deviceType": "Computer",
                    "https://ontology.unifiedcyberontology.org/uco/observable/manufacturer": {
                        "@id": "http://example.org/kb/organization-c240cf37-0556-439b-9a51-1ca41732010d"
                    },
                    "https://ontology.unifiedcyberontology.org/uco/observable/model": "Inspiron 5000",
                    "https://ontology.unifiedcyberontology.org/uco/observable/serialNumber": "D1234567"
                },
                {
                    "@type": "https://ontology.unifiedcyberontology.org/uco/observable/OperatingSystemFacet",
                    "https://ontology.unifiedcyberontology.org/uco/core/name": "Windows 7 Ultimate Edition",
                    "https://ontology.unifiedcyberontology.org/uco/observable/manufacturer": {
                        "@id": "http://example.org/kb/organization-cc0e0667-eadf-4b2e-9618-3f62b1bdae26"
                    },
                    "https://ontology.unifiedcyberontology.org/uco/observable/version": "6.1.7601 Service Pack 1 Build 7601",
                    "https://ontology.unifiedcyberontology.org/uco/observable/installDate": {
                        "@type": "http://www.w3.org/2001/XMLSchema#dateTime",
                        "@value": "2019-07-10T16:33:42Z"
                    }
                },
                {
                    "@type": "https://ontology.unifiedcyberontology.org/uco/observable/ComputerSpecificationFacet",
                    "https://ontology.unifiedcyberontology.org/uco/observable/biosVersion": "E1762IMS.10M",
                    "https://ontology.unifiedcyberontology.org/uco/observable/cpuFamily": "Intel Pentium i7",
                    "https://ontology.unifiedcyberontology.org/uco/observable/totalRam": 4294967296
                },
                {
                    "@type": "https://ontology.unifiedcyberontology.org/uco/observable/DomainNameFacet",
                    "https://ontology.unifiedcyberontology.org/uco/observable/value": "dfl.local",
                    "https://ontology.unifiedcyberontology.org/uco/observable/isTLD": false
                },
                {
                    "@type": "https://ontology.unifiedcyberontology.org/uco/observable/IPv4AddressFacet",
                    "https://ontology.unifiedcyberontology.org/uco/observable/addressValue": "192.168.1.145"
                },
                {
                    "@type": [
                        "http://custompb.acme.org/core#InventoryComputerFacet",
                        "https://ontology.unifiedcyberontology.org/uco/core/Facet"
                    ],
                    "http://custompb.acme.org/core#name": "DFL-03",
                    "http://custompb.acme.org/core#inventoryNumber": "10503"
                }
            ]
        }
    ]
}

JSON-LD provides a mechanism called a "context" that maps short terms and prefixes to full IRIs, so that a related body of json-ld content can be compacted to a more concise form. Compaction and expansion against a context are lossless and can be applied to any degree.

The Device example as it exists in the CASE examples repo currently applies a simple json-ld context to avoid having to repeatedly express IRI path detail for every object in the content. Using this simple level of compaction yields the example in this form:

{
    "@context": {
        "@vocab": "http://example.org/local#",
        "kb": "http://example.org/kb/",
        "acme": "http://custompb.acme.org/core#",
        "draft": "http://example.org/draft#",
        "uco-core": "https://ontology.unifiedcyberontology.org/uco/core/",
        "uco-identity": "https://ontology.unifiedcyberontology.org/uco/identity/",
        "uco-location": "https://ontology.unifiedcyberontology.org/uco/location/",
        "uco-observable": "https://ontology.unifiedcyberontology.org/uco/observable/",
        "xsd": "http://www.w3.org/2001/XMLSchema#"
    },
    "@graph": [
        {
            "@id": "kb:organization-c240cf37-0556-439b-9a51-1ca41732010d",
            "@type": "uco-identity:Organization",
            "uco-core:name": "Dell"
        },
        {
            "@id": "kb:organization-cc0e0667-eadf-4b2e-9618-3f62b1bdae26",
            "@type": "uco-identity:Organization",
            "uco-core:name": "Microsoft"
        },
        {
            "@id": "kb:forensic_lab_computer1-uuid",
            "@type": "uco-observable:Device",
            "location": {
                "@id": "kb:forensic_lab1-uuid"
            },
            "uco-core:hasFacet": [
                {
                    "@type": "uco-observable:DeviceFacet",
                    "uco-observable:deviceType": "Computer",
                    "uco-observable:manufacturer": {
                        "@id": "kb:organization-c240cf37-0556-439b-9a51-1ca41732010d"
                    },
                    "uco-observable:model": "Inspiron 5000",
                    "uco-observable:serialNumber": "D1234567"
                },
                {
                    "@type": "uco-observable:OperatingSystemFacet",
                    "uco-core:name": "Windows 7 Ultimate Edition",
                    "uco-observable:manufacturer": {
                        "@id": "kb:organization-cc0e0667-eadf-4b2e-9618-3f62b1bdae26"
                    },
                    "uco-observable:version": "6.1.7601 Service Pack 1 Build 7601",
                    "uco-observable:installDate": {
                        "@type": "xsd:dateTime",
                        "@value": "2019-07-10T16:33:42Z"
                    }
                },
                {
                    "@type": "uco-observable:ComputerSpecificationFacet",
                    "uco-observable:biosVersion": "E1762IMS.10M",
                    "uco-observable:cpuFamily": "Intel Pentium i7",
                    "uco-observable:totalRam": 4294967296
                },
                {
                    "@type": "uco-observable:DomainNameFacet",
                    "uco-observable:value": "dfl.local",
                    "uco-observable:isTLD": false
                },
                {
                    "@type": "uco-observable:IPv4AddressFacet",
                    "uco-observable:addressValue": "192.168.1.145"
                },
                {
                    "@type": [
                        "acme:InventoryComputerFacet",
                        "uco-core:Facet"
                    ],
                    "acme:name": "DFL-03",
                    "acme:inventoryNumber": "10503"
                }
            ]
        }
    ]
}
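To make the relationship between the two forms above concrete, here is a minimal Python sketch of how a prefix or `@vocab` entry in a context turns a compacted name back into a full IRI. It is a simplified illustration, not a JSON-LD processor: the function name `expand_term` is hypothetical, and it only handles string prefix definitions and `@vocab`, as they apply to property keys and type names.

```python
# A subset of the inline context from the Device example above.
CONTEXT = {
    "@vocab": "http://example.org/local#",
    "kb": "http://example.org/kb/",
    "uco-core": "https://ontology.unifiedcyberontology.org/uco/core/",
    "uco-observable": "https://ontology.unifiedcyberontology.org/uco/observable/",
}


def expand_term(term: str, context: dict) -> str:
    """Expand a compact IRI (prefix:suffix) or a bare term (via @vocab)."""
    if term.startswith("@"):
        return term  # JSON-LD keywords are left untouched
    prefix, sep, suffix = term.partition(":")
    if not sep:
        # No colon: a bare term, resolved against @vocab if one is defined.
        return context.get("@vocab", "") + term
    base = context.get(prefix)
    if isinstance(base, str) and not suffix.startswith("//"):
        return base + suffix  # known prefix: concatenate base and suffix
    return term  # already an absolute IRI, or an unknown prefix


expanded_name = expand_term("uco-core:name", CONTEXT)
expanded_location = expand_term("location", CONTEXT)
print(expanded_name)
print(expanded_location)
```

Here the bare key `location` resolves through `@vocab`, which is why it appears as `http://example.org/local#location` in the expanded form.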

JSON-LD contexts can also support things like

This example specifies the context inline, alongside the rest of the json-ld content in the file, and is limited to the prefixes used by that content.

JSON-LD supports the specification of context inline, as a separate file referenced from the json-ld content file, or potentially a combination of both.
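The three options can be sketched as plain Python dictionaries. The remote URL below is the one proposed later in this CP and is used here only as a placeholder; `context_entries` is a hypothetical helper that normalizes the `@context` value to a list so the three shapes can be handled uniformly.

```python
import json

# Placeholder: the context URL proposed in this CP, assumed for illustration.
UCO_CONTEXT_URL = "https://ontology.unifiedcyberontology.org/uco/uco-ld-context-minimal.json"
LOCAL_CONTEXT = {
    "kb": "http://example.org/kb/",
    "acme": "http://custompb.acme.org/core#",
}


def context_entries(document: dict) -> list:
    """Return the document's @context as a list of entries, in order."""
    context = document.get("@context", [])
    return context if isinstance(context, list) else [context]


inline_only = {"@context": LOCAL_CONTEXT, "@graph": []}
remote_only = {"@context": UCO_CONTEXT_URL, "@graph": []}
combined = {"@context": [UCO_CONTEXT_URL, LOCAL_CONTEXT], "@graph": []}

for document in (inline_only, remote_only, combined):
    # A string entry is a reference to a context file; a dict is inline.
    kinds = ["remote" if isinstance(entry, str) else "inline"
             for entry in context_entries(document)]
    print(json.dumps(kinds))
```

In the combined form the entries are processed in order, so a term defined in a later entry overrides the same term from an earlier one.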

For consistent and more concise use of serialized UCO json-ld content by the adopting community, each version of UCO needs a full json-ld context that is available both for remote online reference and for local deployment and reference by json-ld serialized content.

Requirements

Requirement 1

json-ld context to support compaction of all IRI base paths through defined prefixes

Requirement 2

json-ld context to support compaction of all property type assertions

Requirement 3

json-ld context to support assertion of properties with potential cardinalities >1 as set arrays

Requirement 4

json-ld context to support compaction of json-ld specific key strings @id, @type, @value and @graph to simple json key strings id, type, value, and graph such that the body of content can be viewed as simple json and the context can be utilized to expand it into fully codified json-ld

Requirement 5

json-ld context to support compaction of class type names and property names to prefixless names where possible (where the base names without prefixes are uniquely defined in UCO). For any base name defined in UCO that is non-unique when prefixes are removed or for any custom (not defined in UCO) class types or properties defined by the content producer, the prefixed name would be used.

This requirement is only necessary for the "concise" version of the json-ld context.

Requirement 6

Ability to autogenerate full json-ld context for each UCO release

Requirement 7

Ability to autogenerate full json-ld context for any interim UCO version

Requirement 8

Ability to publish json-ld context for each UCO release online such that produced UCO content can effectively reference and use it for json-ld processing

Requirement 9

Ability for a producer to pull down an online published json-ld context and utilize it locally with their defined content

Requirement 10

Ability for UCO content producer to specify both a reference to a remote official json-ld context for UCO and a local inline json-ld context for any custom (not defined in UCO) class types or properties defined by the content producer in their content
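Requirement 3 could plausibly be met with the JSON-LD `"@container": "@set"` term option, which tells a processor to always represent a property's values as an array in compacted output. The Python sketch below imitates that one behaviour on already-compacted content. It is an assumption about how the requirement might be satisfied, `normalize_sets` is a hypothetical helper, and marking `uco-core:hasFacet` as a set is for illustration only.

```python
# Illustrative term definition: hasFacet is assumed to allow more than one value.
CONTEXT = {
    "uco-core:hasFacet": {"@type": "@id", "@container": "@set"},
    "uco-core:name": {"@type": "xsd:string"},
}


def normalize_sets(node, context):
    """Wrap single values in a list for every term declared as a @set."""
    if isinstance(node, list):
        return [normalize_sets(item, context) for item in node]
    if not isinstance(node, dict):
        return node  # scalars are returned unchanged
    result = {}
    for key, value in node.items():
        value = normalize_sets(value, context)
        definition = context.get(key)
        is_set = isinstance(definition, dict) and definition.get("@container") == "@set"
        if is_set and not isinstance(value, list):
            value = [value]  # a lone value becomes a one-element array
        result[key] = value
    return result


device = {
    "uco-core:name": "forensic_lab_computer1",
    "uco-core:hasFacet": {"type": "uco-observable:DeviceFacet"},
}
normalized = normalize_sets(device, CONTEXT)
print(normalized)
```

With this option a consumer can always iterate over the property's value without first checking whether it is a single object or an array.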

Risk / Benefit analysis

Benefits

Consistency of serialized content produced, exchanged and consumed by UCO community adopters.

Smaller and more concise serialized UCO content.

Ability for producers to treat UCO content as simple JSON while yielding the significant benefits of JSON-LD.

Risks

All existing UCO and CASE examples should be updated to utilize the new context and compact form.

Competencies demonstrated

Competency 1

Compaction and expansion of json-ld serialized content

Competency Question 1.1

What is the fully compacted form of a given body of json-ld serialized UCO content?

Result 1.1

The fully compacted and concise form of the json-ld serialized UCO content

Competency Question 1.2

What is the fully expanded form of a given body of json-ld serialized UCO content?

Result 1.2

The fully expanded and verbose form of the json-ld serialized UCO content

Solution suggestion

Implement code to autogenerate two different (minimal and concise) json-ld contexts for any given version of UCO. Persistently publish online the two json-ld contexts for each official release of UCO. Temporarily publish somewhere online the two json-ld contexts for each interim version of UCO.

The "minimal" json-ld context would be considered the default and would support Requirements 1 - 4. Using the scope of the Device example to illustrate what such a context would look like (the actual full context would contain details for ALL prefixes and properties in UCO), the context would look something like:

{
    "@context": {
        "uco-core": "https://ontology.unifiedcyberontology.org/uco/core/",
        "uco-identity": "https://ontology.unifiedcyberontology.org/uco/identity/",
        "uco-location": "https://ontology.unifiedcyberontology.org/uco/location/",
        "uco-observable": "https://ontology.unifiedcyberontology.org/uco/observable/",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        "uco-core:hasFacet": {
          "@type": "@id"
        },
        "uco-observable:manufacturer": {
          "@type": "@id"
        },
        ...
        "uco-core:name": {
          "@type": "xsd:string"
        },
        "uco-observable:deviceType": {
          "@type": "xsd:string"
        },
        "uco-observable:model": {
          "@type": "xsd:string"
        },
        "uco-observable:serialNumber": {
          "@type": "xsd:string"
        },
        "uco-observable:version": {
          "@type": "xsd:string"
        },
        "uco-observable:installDate": {
          "@type": "xsd:dateTime"
        },
        "uco-observable:biosVersion": {
          "@type": "xsd:string"
        },
        "uco-observable:cpuFamily": {
          "@type": "xsd:string"
        },
        "uco-observable:totalRam": {
          "@type": "xsd:integer"
        },
        "uco-observable:value": {
          "@type": "xsd:string"
        },
        "uco-observable:isTLD": {
          "@type": "xsd:boolean"
        },
        "uco-observable:addressValue": {
          "@type": "xsd:string"
        },
        "id": "@id",
        "type": "@type",
        "graph": "@graph"
    }
}

Utilizing this context combined with a local inline context for the custom (non-UCO-defined) content in the body, the Device example content would look like this:

{
    "@context": [
      "https://ontology.unifiedcyberontology.org/uco/uco-ld-context-minimal.json",
      {
        "@vocab": "http://example.org/local#",
        "kb": "http://example.org/kb/",
        "acme": "http://custompb.acme.org/core#",
        "draft": "http://example.org/draft#"
      }
    ],
    "graph": [
        {
            "id": "kb:organization-c240cf37-0556-439b-9a51-1ca41732010d",
            "type": "uco-identity:Organization",
            "uco-core:name": "Dell"
        },
        {
            "id": "kb:organization-cc0e0667-eadf-4b2e-9618-3f62b1bdae26",
            "type": "uco-identity:Organization",
            "uco-core:name": "Microsoft"
        },
        {
            "id": "kb:forensic_lab_computer1-uuid",
            "type": "uco-observable:Device",
            "uco-core:hasFacet": [
                {
                    "type": "uco-observable:DeviceFacet",
                    "uco-observable:deviceType": "Computer",
                    "uco-observable:manufacturer": "kb:organization-c240cf37-0556-439b-9a51-1ca41732010d",
                    "uco-observable:model": "Inspiron 5000",
                    "uco-observable:serialNumber": "D1234567"
                },
                {
                    "type": "uco-observable:OperatingSystemFacet",
                    "uco-core:name": "Windows 7 Ultimate Edition",
                    "uco-observable:manufacturer": "kb:organization-cc0e0667-eadf-4b2e-9618-3f62b1bdae26",
                    "uco-observable:version": "6.1.7601 Service Pack 1 Build 7601",
                    "uco-observable:installDate": "2019-07-10T16:33:42Z"
                },
                {
                    "type": "uco-observable:ComputerSpecificationFacet",
                    "uco-observable:biosVersion": "E1762IMS.10M",
                    "uco-observable:cpuFamily": "Intel Pentium i7",
                    "uco-observable:totalRam": 4294967296
                },
                {
                    "type": "uco-observable:DomainNameFacet",
                    "uco-observable:value": "dfl.local",
                    "uco-observable:isTLD": false
                },
                {
                    "type": "uco-observable:IPv4AddressFacet",
                    "uco-observable:addressValue": "192.168.1.145"
                },
                {
                    "type": [
                        "acme:InventoryComputerFacet",
                        "uco-core:Facet"
                    ],
                    "acme:name": "DFL-03",
                    "acme:inventoryNumber": "10503"
                }
            ]
        }
    ]
}
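The reason `uco-observable:manufacturer` and `uco-observable:installDate` can be written as plain strings above is the type coercion declared for them in the context. A minimal Python sketch of that step follows, assuming only scalar values and string prefix definitions; `expand_prefix` and `coerce` are hypothetical names and this is not a full JSON-LD expansion algorithm.

```python
# A subset of the "minimal" context entries relevant to the two properties.
CONTEXT = {
    "kb": "http://example.org/kb/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "uco-observable": "https://ontology.unifiedcyberontology.org/uco/observable/",
    "uco-observable:manufacturer": {"@type": "@id"},
    "uco-observable:installDate": {"@type": "xsd:dateTime"},
}


def expand_prefix(compact: str, context: dict) -> str:
    """Expand prefix:suffix when the prefix is defined as a string."""
    prefix, sep, suffix = compact.partition(":")
    base = context.get(prefix)
    if sep and isinstance(base, str) and not suffix.startswith("//"):
        return base + suffix
    return compact


def coerce(term: str, value, context: dict):
    """Apply the term's declared @type to a scalar value."""
    definition = context.get(term)
    if not isinstance(definition, dict) or "@type" not in definition:
        return value  # no coercion declared for this term
    if isinstance(value, (dict, list)):
        return value  # nested objects and arrays are out of scope here
    declared = definition["@type"]
    if declared == "@id":
        # The string is an IRI reference, not a literal.
        return {"@id": expand_prefix(value, context)}
    # Otherwise the string is a literal with the declared datatype.
    return {"@type": expand_prefix(declared, context), "@value": value}


manufacturer = coerce(
    "uco-observable:manufacturer",
    "kb:organization-c240cf37-0556-439b-9a51-1ca41732010d",
    CONTEXT,
)
install_date = coerce("uco-observable:installDate", "2019-07-10T16:33:42Z", CONTEXT)
print(manufacturer)
print(install_date)
```

The two results correspond to the `{"@id": ...}` and `{"@type": ..., "@value": ...}` objects in the expanded form of the Device example.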

The "minimal" json-ld context could be created by a coding implementation of the following pseudo-code:

[Image: pseudo-code for generating the "minimal" json-ld context]
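One possible shape for such a generator, as a minimal Python sketch: it takes the prefixes, object properties, and datatype properties as in-memory inputs and emits a context covering Requirements 1, 2, and 4. The function name `build_minimal_context` and the input structures are assumptions made for illustration; a real implementation would read these values from the UCO ontology files, which is not shown here.

```python
import json

# Requirement 4: plain JSON keys that stand in for JSON-LD keywords.
KEYWORD_ALIASES = {"id": "@id", "type": "@type", "graph": "@graph"}


def build_minimal_context(prefixes, object_properties, datatype_properties):
    """Assemble a context from prefixes and per-property type coercions."""
    context = dict(prefixes)  # Requirement 1: prefix definitions
    for name in sorted(object_properties):
        context[name] = {"@type": "@id"}  # values are IRI references
    for name, datatype in sorted(datatype_properties.items()):
        context[name] = {"@type": datatype}  # values are typed literals
    context.update(KEYWORD_ALIASES)
    return {"@context": context}


# Hypothetical inputs, limited to a few names from the Device example.
prefixes = {
    "uco-core": "https://ontology.unifiedcyberontology.org/uco/core/",
    "uco-observable": "https://ontology.unifiedcyberontology.org/uco/observable/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}
object_properties = ["uco-core:hasFacet", "uco-observable:manufacturer"]
datatype_properties = {
    "uco-core:name": "xsd:string",
    "uco-observable:installDate": "xsd:dateTime",
}

minimal = build_minimal_context(prefixes, object_properties, datatype_properties)
print(json.dumps(minimal, indent=4))
```

Sorting the property names keeps the generated file stable between runs, which makes differences between UCO versions easier to review.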

The "concise" json-ld context would be considered optional for those who desire a very concise form and would support Requirements 1 - 5. Using the scope of the Device example to illustrate what such a context would look like (the actual full context would contain details for ALL prefixes and properties in UCO), the context would look something like:

{
    "@context": {
        "uco-core": "https://ontology.unifiedcyberontology.org/uco/core/",
        "uco-identity": "https://ontology.unifiedcyberontology.org/uco/identity/",
        "uco-location": "https://ontology.unifiedcyberontology.org/uco/location/",
        "uco-observable": "https://ontology.unifiedcyberontology.org/uco/observable/",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        ...
        "ComputerSpecificationFacet": "uco-observable:ComputerSpecificationFacet",
        "Device": "uco-observable:Device",
        "DeviceFacet": "uco-observable:DeviceFacet",
        "DomainNameFacet": "uco-observable:DomainNameFacet",
        "Facet": "uco-core:Facet",
        "IPv4AddressFacet": "uco-observable:IPv4AddressFacet",
        "OperatingSystemFacet": "uco-observable:OperatingSystemFacet",
        "Organization": "uco-identity:Organization",
        ...
        "hasFacet": {
          "@id": "uco-core:hasFacet",
          "@type": "@id"
        },
        "manufacturer": {
          "@id": "uco-observable:manufacturer",
          "@type": "@id"
        },
        ...
        "name": {
          "@id": "uco-core:name",
          "@type": "xsd:string"
        },
        "deviceType": {
          "@id": "uco-observable:deviceType",
          "@type": "xsd:string"
        },
        "model": {
          "@id": "uco-observable:model",
          "@type": "xsd:string"
        },
        "serialNumber": {
          "@id": "uco-observable:serialNumber",
          "@type": "xsd:string"
        },
        "version": {
          "@id": "uco-observable:version",
          "@type": "xsd:string"
        },
        "installDate": {
          "@id": "uco-observable:installDate",
          "@type": "xsd:dateTime"
        },
        "biosVersion": {
          "@id": "uco-observable:biosVersion",
          "@type": "xsd:string"
        },
        "cpuFamily": {
          "@id": "uco-observable:cpuFamily",
          "@type": "xsd:string"
        },
        "totalRam": {
          "@id": "uco-observable:totalRam",
          "@type": "xsd:integer"
        },
        "uco-observable:value": {
          "@type": "xsd:string"
        },
        "isTLD": {
          "@id": "uco-observable:isTLD",
          "@type": "xsd:boolean"
        },
        "addressValue": {
          "@id": "uco-observable:addressValue",
          "@type": "xsd:string"
        },
        "id": "@id",
        "type": "@type",
        "graph": "@graph"
    }
}

Utilizing this context combined with a local inline context for the custom (non-UCO-defined) content in the body, the Device example content would look like this:

{
    "@context": [
      "https://ontology.unifiedcyberontology.org/uco/uco-ld-context-concise.json",
      {
        "@vocab": "http://example.org/local#",
        "kb": "http://example.org/kb/",
        "acme": "http://custompb.acme.org/core#",
        "draft": "http://example.org/draft#"
      }
    ],
    "graph": [
        {
            "id": "kb:organization-c240cf37-0556-439b-9a51-1ca41732010d",
            "type": "Organization",
            "name": "Dell"
        },
        {
            "id": "kb:organization-cc0e0667-eadf-4b2e-9618-3f62b1bdae26",
            "type": "Organization",
            "name": "Microsoft"
        },
        {
            "id": "kb:forensic_lab_computer1-uuid",
            "type": "Device",
            "hasFacet": [
                {
                    "type": "DeviceFacet",
                    "deviceType": "Computer",
                    "manufacturer": "kb:organization-c240cf37-0556-439b-9a51-1ca41732010d",
                    "model": "Inspiron 5000",
                    "serialNumber": "D1234567"
                },
                {
                    "type": "OperatingSystemFacet",
                    "name": "Windows 7 Ultimate Edition",
                    "manufacturer": "kb:organization-cc0e0667-eadf-4b2e-9618-3f62b1bdae26",
                    "version": "6.1.7601 Service Pack 1 Build 7601",
                    "installDate": "2019-07-10T16:33:42Z"
                },
                {
                    "type": "ComputerSpecificationFacet",
                    "biosVersion": "E1762IMS.10M",
                    "cpuFamily": "Intel Pentium i7",
                    "totalRam": 4294967296
                },
                {
                    "type": "DomainNameFacet",
                    "uco-observable:value": "dfl.local",
                    "isTLD": false
                },
                {
                    "type": "IPv4AddressFacet",
                    "addressValue": "192.168.1.145"
                },
                {
                    "type": [
                        "acme:InventoryComputerFacet",
                        "Facet"
                    ],
                    "acme:name": "DFL-03",
                    "acme:inventoryNumber": "10503"
                }
            ]
        }
    ]
}
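The `id`, `type`, and `graph` keys in this form are aliases for JSON-LD keywords (Requirement 4). As a rough illustration of what a processor does with them during expansion, the Python sketch below renames the aliases back to their keywords throughout a document. The helper name `restore_keywords` is hypothetical, and the alias table is hard-coded here rather than read from the context as a real processor would do.

```python
ALIASES = {"id": "@id", "type": "@type", "graph": "@graph"}


def restore_keywords(node):
    """Rename aliased keys to JSON-LD keywords, leaving @context untouched."""
    if isinstance(node, list):
        return [restore_keywords(item) for item in node]
    if isinstance(node, dict):
        return {
            # The context itself defines the aliases, so it is copied as-is.
            ALIASES.get(key, key): value if key == "@context" else restore_keywords(value)
            for key, value in node.items()
        }
    return node  # scalars are returned unchanged


document = {
    "@context": {"id": "@id", "type": "@type", "graph": "@graph"},
    "graph": [
        {
            "id": "kb:forensic_lab_computer1-uuid",
            "type": "Device",
            "hasFacet": [{"type": "DeviceFacet", "model": "Inspiron 5000"}],
        }
    ],
}
restored = restore_keywords(document)
print(restored)
```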

The "concise" json-ld context could be created by a coding implementation of the following pseudo-code:

[Image: pseudo-code for generating the "concise" json-ld context]
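A minimal Python sketch of one way to implement the uniqueness rule from Requirement 5 is given below. The function `build_concise_context` and its inputs are assumptions for illustration, and the `ex-a` and `ex-b` prefixes are hypothetical namespaces invented to show a name collision. The sketch also keeps a name prefixed if its short form would collide with a keyword alias or a prefix, which is an added assumption rather than something stated in the requirement.

```python
from collections import Counter

KEYWORD_ALIASES = {"id": "@id", "type": "@type", "graph": "@graph"}


def local_name(prefixed: str) -> str:
    """Return the part after the prefix, e.g. 'uco-core:name' -> 'name'."""
    return prefixed.split(":", 1)[1]


def build_concise_context(prefixes, classes, properties):
    """properties maps a prefixed name to '@id' or a datatype such as 'xsd:string'."""
    counts = Counter(local_name(name) for name in list(classes) + list(properties))
    reserved = set(KEYWORD_ALIASES) | set(prefixes)

    def can_shorten(name: str) -> bool:
        short = local_name(name)
        return counts[short] == 1 and short not in reserved

    context = dict(prefixes)
    for name in sorted(classes):
        if can_shorten(name):
            context[local_name(name)] = name  # prefixless class term
    for name, coerced in sorted(properties.items()):
        if can_shorten(name):
            context[local_name(name)] = {"@id": name, "@type": coerced}
        else:
            context[name] = {"@type": coerced}  # ambiguous: stays prefixed
    context.update(KEYWORD_ALIASES)
    return {"@context": context}


prefixes = {
    "uco-core": "https://ontology.unifiedcyberontology.org/uco/core/",
    "uco-observable": "https://ontology.unifiedcyberontology.org/uco/observable/",
    "ex-a": "http://example.org/a#",  # hypothetical namespace
    "ex-b": "http://example.org/b#",  # hypothetical namespace
}
classes = ["uco-observable:Device", "uco-core:Facet"]
properties = {
    "uco-core:name": "xsd:string",
    "uco-core:hasFacet": "@id",
    "ex-a:label": "xsd:string",
    "ex-b:label": "xsd:string",
}
concise = build_concise_context(prefixes, classes, properties)
print(concise["@context"])
```

Classes whose short names are ambiguous need no entry at all, since their prefixed form already expands through the prefix definitions.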

Coordination