w3c / wot

Web of Things
http://www.w3.org/WoT/IG/
[TF-TD] mapping to RDF #283

Closed draggett closed 4 years ago

draggett commented 7 years ago

I've implemented an online testbed for translating thing descriptions to RDF triples. Note that this demo uses the JSON based format I have been exploring in my NodeJS project and differs from the one given in the current practices.

The idea is to describe the object model exposed to applications in a way that should be easily understandable for web developers, and which also provides a straightforward mapping to RDF. It is emphatically not a goal to be able to express all the richness of RDF. This simplifies the format compared to JSON-LD. You can link to rich semantic models expressed in other formats, but this isn't something that constrained devices need to retrieve and interpret.

Context declarations using "@context" map short names to absolute URIs and can be given as a link to a file or inlined directly. "@context" declarations can be given at multiple levels, and the inner scope overrides definitions in outer scopes. You can define a term as null if you want it to be translated as a string literal and need to block inheritance from outer scopes. Re-usable type declarations can be defined with "types", see the "ECG+Accelerometer" example.

One open question is whether the names of properties, actions and events can be mapped to URIs via @context or whether it is safer to leave the binding to semantic terms to separate declarations, for instance, by metadata asserting that this model conforms to some semantic model.

I think that it is reasonable to assume that properties are writeable by default. You can indicate that a property is readonly using "writeable:false". The "latency" metadata term defines the expected interval between updates, and can be used together with the "source" metadata that gives the URI for the source of a stream of updates, see the "ECG" demo. The "sink" metadata term is used where the application streams updates to a thing or to a property of a thing. We also need a term for a URI that acts as both source and sink, any suggestions? The "protocol" can be used to provide additional information, e.g. for how to use web sockets or server sent events.

The "Color choices" example in the demo illustrates how to support arrays and choices. A set of choices is defined as a list of values. Likewise, a union is defined as a list of types, something which could be useful for late bound types. Note that choices are unordered, unlike enumerations, which strictly speaking are usually defined as an ordered set of choices. See "Shopping basket" for the example illustrating unions.

Google's Protocol Buffers provide a text based schema format for messages. Each field (and enumerated value) is associated with a unique integer. This allows servers to support different versions of messages through the use of different integers to identify different interpretations. Do we need to allow these numbers to be explicitly provided as part of thing descriptions? In principle, thing descriptions could be mapped to protocol buffer message schemas for events, property updates, action requests and responses. The field numbers could be generated automatically. A comparison of thing descriptions could allow a processor to work out when the protocol buffer field numbers can be re-used or when new values are needed because the meaning is different. This suggests the need for modelling a sequence of versions of a thing or set of things.

More generally when there is a need to define thing descriptions for existing services, it may be appropriate for the thing description to link to a schema for the given service, using a platform specific schema language. This may entail the need for compound identifiers such as paths, e.g. when mapping a thing property to a resource path on a REST based server.

Some programming languages allow multiple versions of the same function name with different signatures (number and types of arguments), but other languages like JavaScript do not, so such declarations would cause problems in generating the corresponding object. The JSON representation for actions uses an object rather than an array to avoid this problem.

The current translation doesn't represent the ordering of arguments for action requests and responses. Given that arguments are named, this isn't critical. However, arguments may be assigned a unique integer number using the metadata term "number". This would allow for passing arguments based upon their position. Arguments can also be marked as required or optional (e.g. "required":true). Should the default be required or optional? Default values can be declared if needed for optional arguments. Arguments can also be marked as ordered or unordered collections.
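To make the positional idea concrete, here is a rough sketch of how a client might turn named arguments into a positional list using the "number" metadata term. The function name toPositional and the "default" key are my own placeholders (the text above does not name the term for default values), and treating unmarked arguments as optional is just one possible answer to the open question.

```javascript
// Hypothetical sketch: order named arguments by their "number" metadata.
// requestSpec maps argument names to objects such as
//   { number: 1, required: true }  or  { number: 3, default: 1 }
// The "default" key is an assumed name, not one defined in this thread.
function toPositional(requestSpec, args) {
  return Object.entries(requestSpec)
    .sort(([, a], [, b]) => a.number - b.number)   // ascending position
    .map(([name, spec]) => {
      if (Object.prototype.hasOwnProperty.call(args, name)) return args[name];
      if (Object.prototype.hasOwnProperty.call(spec, "default")) return spec.default;
      if (spec.required) throw new Error("missing required argument: " + name);
      return undefined;                            // optional and not supplied
    });
}

const moveSpec = {
  y: { number: 2, required: true },
  x: { number: 1, required: true },
  scale: { number: 3, default: 1 }
};
const positional = toPositional(moveSpec, { x: 3, y: 4 });
```

Because the order comes from "number" rather than from the JSON object's key order, the description stays valid even if a serializer reorders the keys.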

Types can be declared and referenced multiple times, however, I am not quite sure how to model this accurately in RDF. For now I state that the given type is a subclass of type.

In principle, the representation could adopt further features from JSON-LD, but perhaps not in the first version.

The demo is available at: https://www.w3.org/WoT/demos/td2ttl/

p.s. the demo currently generates RDF as either N-triples or JSON-LD. Future work is planned on support for Turtle. Note that I have deliberately used example.org and example.com for the context declarations as a way to indicate that these are for example purposes only. They would need to be replaced by whatever standard URIs the Working Group agrees to.

draggett commented 7 years ago

RDF model of Thing Descriptions

This is an informal definition and will be formalised as an ontology or as a shape grammar

Datatypes

Metadata

Note that this platform to platform contract URI should be dereferenceable as Linked Data.

Imports

draggett commented 7 years ago

I am working on a JavaScript based testbed for validating thing descriptions in RDF. This is based upon shape grammars which select nodes in an RDF graph that match the root of the grammar and traverse the links from the root node applying a sequence of filters and constraints. The W3C RDF Shapes Working Group is standardizing a representation of such grammars in RDF, see SHACL.

draggett commented 7 years ago

A very promising approach is based upon augmented transition networks - an approach that I am calling the Shape Rule Language (SHRL), pronounced as "shurl". See diagram below for example:

[Diagram: shape rule language]

Here is what the rule looks like in Turtle:

ex:rule1
    a sh:Shape ;
    sh:targetClass td:thing ;
    sh:and ex:rule2  .
ex:rule2
    sh:rel td:property ;
    sh:and ex:rule3 , ex:rule2 .
ex:rule3
    rdfs:comment "Every property must have exactly one name" ;
    sh:rel td:name ;
    sh:minCount 1 ;
    sh:maxCount 1 .

More information is available in this slide deck: shape rule language.pdf. SHRL has been implemented as an open source JavaScript library and test suite.
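For readers who want a feel for what the rule above checks, here is a hand-written sketch of the same constraint over an array of triples. It is not the SHRL library and does not interpret rules at all; the function name validateProperties and the [subject, predicate, object] string-array representation are assumptions made for illustration only.

```javascript
// Hypothetical sketch: follow td:property links from a node (like ex:rule2)
// and require exactly one td:name on each property (like ex:rule3).
// Triples are [subject, predicate, object] arrays of strings.
function validateProperties(triples, node, errors = [], seen = new Set()) {
  if (seen.has(node)) return errors;   // guard against cycles in the graph
  seen.add(node);
  for (const [s, p, o] of triples) {
    if (s !== node || p !== "td:property") continue;
    const count = triples.filter(([s2, p2]) => s2 === o && p2 === "td:name").length;
    if (count !== 1) {
      errors.push(o + " has " + count + " td:name values, expected exactly 1");
    }
    validateProperties(triples, o, errors, seen);  // nested properties
  }
  return errors;
}

const goodGraph = [
  ["ex:thing", "td:property", "_:1"],
  ["_:1", "td:name", "\"lower\""]
];
// A nested property with no name should be reported.
const badGraph = goodGraph.concat([["_:1", "td:property", "_:2"]]);
const goodErrors = validateProperties(goodGraph, "ex:thing");
const badErrors = validateProperties(badGraph, "ex:thing");
```

The recursion mirrors the way ex:rule2 refers back to itself, so nested properties are held to the same constraint as top-level ones.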

draggett commented 7 years ago

Shaun Howell did some work on automatically generating a diagram of a specific thing from its RDF representation. He says:

I've spent some time on visualising TD models as an extension of your demo. Attached is the source code and below is a screenshot. I wrote a simple library for representing Graphviz [1] objects and added functions to your 'td2rdf.js' which convert the N-triple representation to Graphviz DOT format and then update an SVG on the page. There are some bugs but you can see where it's heading from this demo. URLs are hyperlinked, and shortened using prefixes. It may be useful to use color to differentiate properties, events, and actions.

[Screenshot: water]

I've now extended the JavaScript port of GraphViz (viz.js) to function as a web worker and applied it to creating diagrams for thing descriptions in RDF and for shape rules in SHRL.
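As a rough idea of what the triples to DOT step involves, here is a minimal sketch. It is not Shaun's code nor the code in the demo; the function name triplesToDot and the array-of-strings triple representation are assumptions, and it leaves out prefix shortening, hyperlinks and colouring.

```javascript
// Hypothetical sketch: serialise [subject, predicate, object] triples as a
// Graphviz DOT digraph, one labelled edge per triple.
function triplesToDot(triples) {
  // Inside a DOT double-quoted string, an embedded quote is written as \"
  const quote = (s) => '"' + s.replace(/"/g, '\\"') + '"';
  const lines = ["digraph td {"];
  for (const [s, p, o] of triples) {
    lines.push("  " + quote(s) + " -> " + quote(o) + " [label=" + quote(p) + "];");
  }
  lines.push("}");
  return lines.join("\n");
}

const dot = triplesToDot([["ex:thing", "td:property", "_:1"]]);
```

The resulting string can be handed to any Graphviz renderer, for example a viz.js instance running in a web worker as described above.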

draggett commented 7 years ago

JSON Thing Descriptions and translation to RDF

This is a formalisation of JSON thing descriptions and their mapping to RDF using a context free grammar.

JSON is a text based format for structured data that can be processed in a variety of programming languages and is very popular on the Web.

RDF can be used for graphs of nodes connected by directed labelled arcs. Each arc is modelled as a triple consisting of its subject, predicate and object, where the predicate acts as the arc's label. The subject and predicate are named nodes. The object is a named node or RDF literal. Named nodes are either URIs acting as globally unique identifiers or are blank nodes, which are identifiers that are locally scoped to a given graph. RDF nodes identified by URIs should be dereferenceable to RDF graphs. RDF literals include strings, numbers, true and false. There are multiple serialisation formats for RDF, and the most common are N-triples and Turtle (a superset of N-triples). For greater legibility, URIs are often abbreviated using a prefix:suffix notation where the prefix names a URI to be concatenated with the suffix. An RDF graph may be associated with a set of such prefixes.
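The prefix:suffix abbreviation can be sketched in a few lines. The helper name expandPrefixed is a placeholder of mine, and the prefix map reuses three of the prefixes declared later in this comment.

```javascript
// Hypothetical helper: expand "prefix:suffix" into a full URI by
// concatenating the URI bound to the prefix with the suffix.
const prefixes = {
  rdf: "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
  td: "http://www.w3.org/ns/td#",
  ex: "http://example.com/"
};

function expandPrefixed(name, prefixMap) {
  const colon = name.indexOf(":");
  if (colon < 0) return name;                    // no prefix present
  const prefix = name.slice(0, colon);
  const suffix = name.slice(colon + 1);
  // Leave absolute URIs such as "http://..." and unknown prefixes untouched
  const known = Object.prototype.hasOwnProperty.call(prefixMap, prefix);
  if (suffix.startsWith("//") || !known) return name;
  return prefixMap[prefix] + suffix;
}

const expanded = expandPrefixed("td:property", prefixes);
```

Blank node labels such as _:1 pass through unchanged because "_" is not a declared prefix.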

The things in the Web of things stand for physical or abstract entities and are exposed to applications as objects with properties, actions, events and other metadata. Each thing has a URI that can be dereferenced to obtain the thing's description. Thing descriptions can be transformed into RDF graphs, and to object models for specific programming languages. The thing description acts as a contract between the application developer and the application platform developer. A thing description may also include information that allows the application platform to identify contracts between platforms, so that the application platform can utilise the appropriate software drivers for communicating with other platforms.

Thing descriptions can be represented in JSON as a concise and easy to understand format for application developers. Valid JSON thing descriptions can be formalised as a context free grammar, where the grammar rules are annotated with instructions that define the mapping to RDF. The starting point is the URI for the thing description and its representation as a JSON object (see above). Within the JSON thing description and its nested objects, all names starting with the @ character are reserved.

THING ⇒ {
    CONTEXT?
    "types":TYPES?
    "properties" : PROPERTIES?
    "actions" : ACTIONS?
    "events" : EVENTS?
    META*
}
CONTEXT ⇒ URI  // reference to CONTEXTOBJECT
CONTEXTOBJECT ⇒ { PREFIXOBJECT? NAME : URI* }
PREFIXOBJECT ⇒ { PREFIX: URI* }
TYPES ⇒ { TNAME : TYPE* }
PROPERTIES ⇒ { NAME : TYPE* }
ACTIONS ⇒ { NAME : ACTION* }
ACTION ⇒ { CONTEXT? "request" : REQUEST "response" : RESPONSE? }
REQUEST ⇒ { NAME : TYPE* }
RESPONSE ⇒ { NAME : TYPE* }
EVENTS ⇒ { NAME : TYPE* }
META ⇒ NAME : VALUE
VALUE ⇒ NAME | NUMBER | true | false | URI | [ VALUE* ]
TYPE ⇒ TYPENAME | [ NAME? ] | TYPEOBJECT
TYPENAME ⇒ null | boolean | integer | number | string | thing | TNAME
TYPEOBJECT ⇒ {
    CONTEXT?
    "type" : TYPENAME
    "enum" : [ NAME? ]?
    "union" : [ TYPE+ ]?
    "vector" : [ NAME? ]?
    "collection" : ("ordered" | "unordered") ?
    "properties" : PROPERTIES?
    "units" : UNITS?
    "min" : NUMBER?
    "max" : NUMBER?
    "minLength" : NUMBER?
    "maxLength" : NUMBER?
    "rate" : NUMBER?
    "latency" : NUMBER?
    "service" : URI
    "source" : URI
    "sink" : URI
    "platform" : URI
    "import" : URI
    "optional" : true | false
    "writeable" : true | false
    META*
}

A context free grammar for JSON Thing descriptions. Note that the postfix operators ? + and * take their usual meaning, but apply to name value pairs as a whole rather than just to the value part, as the elision of the round grouping brackets increases legibility.
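As an illustration of the TYPE rule, the following sketch sorts a JSON value into the three alternatives of the grammar. The function name classifyType and the returned labels are placeholders of mine. It assumes that type names appear as JSON strings, that JSON null stands for the null type, and that a bare array is a set of choices, none of which the grammar states explicitly.

```javascript
// Hypothetical sketch of TYPE ⇒ TYPENAME | [ NAME? ] | TYPEOBJECT.
const PREDEFINED = new Set(["null", "boolean", "integer", "number", "string", "thing"]);

function classifyType(value, declaredTypes = {}) {
  if (value === null) return "predefined";             // assumed: JSON null = null type
  if (typeof value === "string") {
    if (PREDEFINED.has(value)) return "predefined";
    // TNAME: must have been declared in the "types" object
    if (Object.prototype.hasOwnProperty.call(declaredTypes, value)) return "named";
    throw new Error("unknown type name: " + value);
  }
  if (Array.isArray(value)) return "choices";           // [ NAME? ]
  if (typeof value === "object") return "typeobject";   // { "type": ..., ... }
  throw new Error("not a valid TYPE");
}

const kinds = [
  classifyType("number"),
  classifyType(["red", "green", "blue"]),
  classifyType({ type: "number", min: 0 }),
  classifyType("point", { point: { type: "number" } })
];
```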

Translating to RDF

This section annotates each grammar rule with instructions for generating the corresponding RDF triples. The examples make use of the following prefixes:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>
@prefix sh: <http://www.w3.org/ns/shrl#>
@prefix td: <http://www.w3.org/ns/td#>
@prefix ex: <http://example.com/>

The blank node identifiers in the example code are for illustrative purposes only. In practice they would be named in such a way as to avoid conflicts with previously defined blank nodes.

Thing description object:

The JSON object for a thing description also reserves the names "properties", "actions", "events" and "types". The value for each of these must be a JSON object where the names declare names for the thing's properties, actions, events and types. Non reserved names for the thing description object are treated as metadata and mapped to RDF as defined below for the META grammar rule.

THING ⇒ {
    CONTEXT?
    "types":TYPES?
    "properties" : PROPERTIES?
    "actions" : ACTIONS?
    "events" : EVENTS?
    META*
}

The URI for the thing description, e.g. ex:thing, is used to generate the following triple:

ex:thing rdf:type td:type .

JSON contexts

CONTEXTOBJECT ⇒ { PREFIXOBJECT? NAME : URI* }
PREFIXOBJECT ⇒ { PREFIX: URI* }

"@context" is used with either a context object, a string that defines a URI that dereferences to a context object, or an array of such strings or context objects. The context object maps names to URIs for RDF named nodes. Except where otherwise stated, names and string values in JSON objects that don't match the URI pattern should be mapped to URIs via the context chain. "@context" may be used with any object in the hierarchy of objects defining a JSON thing description. The bindings are searched starting from the current object, then its parent object, its parent's parent and so forth. The URI is taken as the first definition for a given name that is found in this iterative process. If there are conflicting definitions in context objects in the same scope, the lexically latest one wins.

Contexts may declare RDF prefix bindings to URIs. This is valuable for improved legibility when serializing RDF Graphs to Turtle. Note that it is invalid for a given prefix to be bound to more than one URI for any given RDF graph.

Note that the context grammar rules do not themselves generate any RDF, but rather assist in mapping shorthand names in JSON into RDF named nodes.
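The lookup through the context chain might look roughly like this. The names mergeScope and resolveName are placeholders, the contexts are assumed to be already dereferenced and inlined as plain objects, and the example.org and example.com URIs are for illustration only.

```javascript
// Hypothetical sketch of context chain lookup.
// Several context objects in one scope are merged so that the lexically
// latest definition wins.
function mergeScope(contextObjects) {
  return Object.assign({}, ...contextObjects);
}

// contextChain lists the merged scopes from the current object outwards.
// A binding of null blocks inheritance, so the name stays a string literal.
function resolveName(name, contextChain) {
  for (const scope of contextChain) {
    if (Object.prototype.hasOwnProperty.call(scope, name)) {
      const uri = scope[name];
      return uri === null
        ? { kind: "literal", value: name }
        : { kind: "uri", value: uri };
    }
  }
  return { kind: "literal", value: name };   // no binding found anywhere
}

const outerScope = mergeScope([
  { temperature: "http://example.org/temperature", unit: "http://example.org/unit" }
]);
const innerScope = mergeScope([
  { temperature: "http://example.com/t1" },
  { temperature: "http://example.com/temp", unit: null }   // later entry wins
]);
const resolvedTemp = resolveName("temperature", [innerScope, outerScope]);
const resolvedUnit = resolveName("unit", [innerScope, outerScope]);
```

Here "temperature" resolves to the inner definition, while "unit" is forced back to a string literal even though the outer scope binds it.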

Reusable type declarations

TYPES ⇒ { TNAME : TYPE* }

The "types" object is used to define named types analogous to typedef in the C programming language. This allows a single definition to be given and referred to in multiple places in the thing description. There are a number of predefined type names: null, boolean, integer, number, string and thing. In addition, there is support for enumerations, unions, vectors and collections. Types may be associated with additional metadata. TNAME is any valid field name, as per NAME.

Each named type defined in the types section is mapped into triples using a freshly coined RDF blank node. If the name can be resolved using the context chain into an RDF named node, the object of the td:name triple is the corresponding URI, otherwise it is an RDF string literal for the name. This applies generally to names for types, properties, actions, and events, including the properties carried by action requests and responses.

ex:thing td:typedef _:1 .
_:1 td:name "acceleration" .

Properties:

PROPERTIES ⇒ { NAME : TYPE* }

Each named property is mapped into a couple of triples, using a freshly coined RDF blank node, e.g. for a property named "lower":

ex:thing td:property _:2 .
_:2 td:name "lower" .

For nested properties, the thing's URI is replaced by the blank node for the parent property.
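A rough sketch of this rule, including the nested case, follows. The names makeBlankNodeFactory and propertiesToTriples are placeholders, triples are represented as arrays of strings, and the sketch ignores type information and context resolution, so every name becomes a string literal.

```javascript
// Hypothetical sketch: emit the td:property and td:name triples for a
// "properties" object, recursing into nested properties.
function makeBlankNodeFactory(start = 0) {
  let counter = start;
  return () => "_:" + (++counter);     // fresh label on every call
}

function propertiesToTriples(parent, properties, newBlankNode, triples = []) {
  for (const [name, type] of Object.entries(properties)) {
    const node = newBlankNode();
    triples.push([parent, "td:property", node]);
    triples.push([node, "td:name", JSON.stringify(name)]);
    // For nested properties the parent is the property's own blank node
    const isTypeObject = type !== null && typeof type === "object" && !Array.isArray(type);
    if (isTypeObject && type.properties) {
      propertiesToTriples(node, type.properties, newBlankNode, triples);
    }
  }
  return triples;
}

const flat = propertiesToTriples("ex:thing", { lower: "number" }, makeBlankNodeFactory(1));
const nested = propertiesToTriples(
  "ex:thing",
  { range: { properties: { lower: "number" } } },
  makeBlankNodeFactory()
);
```

Starting the counter at 1 reproduces the _:2 label used in the example above.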

Actions

The value of each action must be a JSON object which must include the name "request" and may define the name "response". The request and response must be JSON objects that declare names for the data passed in the request or response.

ACTIONS ⇒ { NAME : ACTION* }
ACTION ⇒ { CONTEXT? "request" : REQUEST "response" : RESPONSE? }
REQUEST ⇒ {  NAME : TYPE* }
RESPONSE ⇒ { NAME : TYPE* }

Each named action is mapped into triples using a freshly coined RDF blank node. The request and response are also assigned new blank nodes and linked from the action node. The following example is for an action named "weather" with a request and a response.

ex:thing td:action _:3 .
_:3 td:name "weather" .
_:3 td:request _:4 .
_:3 td:response _:5 .
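The same mapping can be sketched in code. The name actionToTriples and the blank node factory are placeholders of mine, and the fields inside the request and response are left out to keep the sketch short.

```javascript
// Hypothetical sketch: emit the triples for one named action.
function makeBlankNodeFactory(start = 0) {
  let counter = start;
  return () => "_:" + (++counter);
}

function actionToTriples(parent, name, action, newBlankNode) {
  const isObject = action !== null && typeof action === "object";
  if (!isObject || !Object.prototype.hasOwnProperty.call(action, "request")) {
    throw new Error('action "' + name + '" must include "request"');
  }
  const node = newBlankNode();
  const triples = [
    [parent, "td:action", node],
    [node, "td:name", JSON.stringify(name)],
    [node, "td:request", newBlankNode()]
  ];
  // "response" is optional in the grammar
  if (Object.prototype.hasOwnProperty.call(action, "response")) {
    triples.push([node, "td:response", newBlankNode()]);
  }
  return triples;
}

const weatherTriples = actionToTriples(
  "ex:thing",
  "weather",
  { request: { city: "string" }, response: { forecast: "string" } },
  makeBlankNodeFactory(2)    // start after _:2 so labels match the example
);
```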

Events

EVENTS ⇒ { NAME : TYPE* }

Each named event is mapped into the following triples, using a freshly coined RDF blank node. Here is an example for an event named "high_water":

ex:thing td:event _:6 .
_:6 td:name "high_water" .

Metadata

META ⇒ NAME : VALUE
VALUE ⇒ NAME | NUMBER | true | false | URI | [ VALUE* ]

The name must not belong to the reserved set. For metadata declared on the thing object itself, the reserved set includes: "properties", "actions", "events", "types", and any name beginning with @. For metadata defined on type objects, the reserved set additionally includes: "type", "enum", "union", "vector", "collection", "units", "min", "max", "minLength", "maxLength", "rate", "latency", "service", "source", "sink", "platform", "import", "optional" and "writeable".

Each metadata item is mapped into a single triple where the subject is the parent URI, the predicate is resolved from the metadata name, and the object from the value. For metadata declared on the thing object itself, the parent URI is that of the thing description, e.g. ex:thing. Otherwise it is the blank node for the parent property, action, event or type. Here are some examples:

ex:thing ex:owner "Jane Smith" .
_:12 ex:updated "2015-02-04 22:45:00" .
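Here is a minimal sketch of the metadata rule for the thing object. The names metaToTriples and toLiteral are placeholders, and the resolve callback stands in for the context chain lookup. Strings are always kept as literals here, and array values are rejected because the text above does not say how they map to triples.

```javascript
// Hypothetical sketch: map non-reserved names on the thing object to triples.
const THING_RESERVED = new Set(["properties", "actions", "events", "types"]);

function isReserved(name, reserved) {
  return name.startsWith("@") || reserved.has(name);
}

function toLiteral(value) {
  const kind = typeof value;
  if (kind === "string" || kind === "number" || kind === "boolean") {
    return JSON.stringify(value);      // strings are quoted, others bare
  }
  throw new Error("unsupported metadata value in this sketch");
}

function metaToTriples(parent, obj, reserved, resolve) {
  const triples = [];
  for (const [name, value] of Object.entries(obj)) {
    if (isReserved(name, reserved)) continue;   // handled by other rules
    triples.push([parent, resolve(name), toLiteral(value)]);
  }
  return triples;
}

const metaTriples = metaToTriples(
  "ex:thing",
  { "@context": "http://example.com/context", properties: {}, owner: "Jane Smith" },
  THING_RESERVED,
  (name) => "ex:" + name               // assumed stand-in for context lookup
);
```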

Type information

The value for each named property, request field, response field, event and type takes the following form:

"properties" is used for a properties object for nested properties. "type" must be used for the name of a predefined type or a type name defined in the types object. "enum" is used with an array of string literals as an enumeration of names. "union" is used with an array of type names or type objects. "vector" is used with an array of string literals. "collection" must be either "ordered" or "unordered". "service", "source", "sink", "platform" and "import" are used with string literals for relative or absolute URIs. "min", "max", "minLength", "maxLength", "rate", "latency" must be used with numeric values. "optional" and "writeable" must be used with boolean values (true or false).

TYPE ⇒ TYPENAME | [ NAME? ] | TYPEOBJECT
TYPENAME ⇒ null | boolean | integer | number | string | thing | TNAME
TYPEOBJECT ⇒ {
    CONTEXT?
    "type" : TYPENAME
    "enum" : [ NAME? ]?
    "union" : [ TYPE+ ]?
    "vector" : [ NAME? ]?
    "collection" : ("ordered" | "unordered") ?
    "properties" : PROPERTIES?
    "units" : UNITS?
    "min" : NUMBER?
    "max" : NUMBER?
    "minLength" : NUMBER?
    "maxLength" : NUMBER?
    "rate" : NUMBER?
    "latency" : NUMBER?
    "service" : URI
    "source" : URI
    "sink" : URI
    "platform" : URI
    "import" : URI
    "optional" : true | false
    "writeable" : true | false
    META*
}

By default, these are translated to RDF by resolving the field name to a predicate and resolving the value to an RDF named node or literal. Exceptions include type, enum, union, vector and properties. Properties are translated as described earlier on.

Type is translated into RDF by resolving the name to either the RDF named node for a predefined type or value, or to a blank node generated for a named type declaration in the types object, e.g.

_:22 td:type td:boolean .
_:23 td:type _:3 .

If type is given as null, this is translated as the absence of a triple with type as its predicate.

For enums, the value for type is ignored and td:type's object recorded as td:enum. Each enumerated name is recorded with the predicate td:item. For example, here is a property named "rank" that is defined as an enumeration over the set of names "low", "medium" and "high".

ex:colours td:property _:1 .
_:1 td:name "rank" .
_:1 td:type td:enum .
_:1 td:item "low" .
_:1 td:item "medium" .
_:1 td:item "high" .

For unions, the type is recorded with the object td:union. For each type in the union, a triple is recorded with the predicate td:item and the object as the RDF named node for predefined types, and a blank node for types declared in the thing description. Here is an example that refers to several types declared in the types section of the thing description:

_:1 td:type td:union .
_:1 td:item _:2 .
_:1 td:item _:3 .
_:1 td:item _:4 .

For vectors, the value for type is translated using the predicate td:itemType, and the td:type is set to td:vector. Each named item defined for the vector is assigned a new blank node and recorded with its name and zero based numeric position in the array, e.g. the following declares a type named "point" for a numeric vector with "x", "y", "z" axes.

thingURI td:typedef _:1 .
_:1 td:name "point" .
_:1 td:type td:vector .
_:1 td:item _:2 .
_:2 td:order 0 .
_:2 td:name "x" .
_:1 td:item _:3 .
_:3 td:order 1 .
_:3 td:name "y" .
_:1 td:item _:4 .
_:4 td:order 2 .
_:4 td:name "z" .
_:1 td:itemType td:number .
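The vector translation can be sketched as follows. The name vectorToTriples is a placeholder, triples are arrays of strings, and the item type is assumed to be one of the predefined type names so that it can be written as td: plus the name.

```javascript
// Hypothetical sketch: emit the triples for a named vector type.
function makeBlankNodeFactory(start = 0) {
  let counter = start;
  return () => "_:" + (++counter);
}

function vectorToTriples(parent, name, itemType, axes, newBlankNode) {
  const node = newBlankNode();
  const triples = [
    [parent, "td:typedef", node],
    [node, "td:name", JSON.stringify(name)],
    [node, "td:type", "td:vector"]
  ];
  axes.forEach((axis, index) => {
    const item = newBlankNode();
    triples.push([node, "td:item", item]);
    triples.push([item, "td:order", String(index)]);   // zero based position
    triples.push([item, "td:name", JSON.stringify(axis)]);
  });
  // Assumes a predefined item type such as "number"
  triples.push([node, "td:itemType", "td:" + itemType]);
  return triples;
}

const pointTriples = vectorToTriples(
  "ex:thing", "point", "number", ["x", "y", "z"], makeBlankNodeFactory()
);
```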

Import

The import declaration is used when you want to compose a thing description in terms of externally defined thing descriptions. The value of the "import" name value pair is a URI for the external thing description. Each import statement maps to a single RDF triple with the subject as this thing's URI, the predicate as td:import and the object as the URI for the referenced thing's description, e.g.

ex:colorControl td:import ex:binarySwitch .
ex:colorControl td:import ex:colorSaturation .
ex:colorControl td:import ex:rampTime .

The referenced RDF graph may be imported into the referencing graph, subject to renaming imported blank nodes to avoid any conflicts. The resulting composition must be a valid thing description.
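The blank node renaming step could be sketched like this. The name importGraph and the "_:i" label scheme are placeholders. The sketch renames every imported blank node rather than only the conflicting ones, which is simpler and still safe, and it does not check that the result is a valid thing description.

```javascript
// Hypothetical sketch: merge an imported graph into a target graph,
// giving imported blank nodes fresh labels that are unused in the target.
// Triples are [subject, predicate, object] arrays of strings.
function importGraph(target, imported) {
  const used = new Set();
  for (const triple of target) {
    for (const term of triple) {
      if (term.startsWith("_:")) used.add(term);
    }
  }
  const renamed = new Map();
  let counter = 0;
  const rename = (term) => {
    if (!term.startsWith("_:")) return term;     // URIs and literals unchanged
    if (!renamed.has(term)) {
      let fresh;
      do { fresh = "_:i" + (++counter); } while (used.has(fresh));
      used.add(fresh);
      renamed.set(term, fresh);
    }
    return renamed.get(term);
  };
  return target.concat(imported.map((triple) => triple.map(rename)));
}

const merged = importGraph(
  [["ex:colorControl", "td:property", "_:1"]],
  [["ex:binarySwitch", "td:property", "_:1"], ["_:1", "td:name", "\"on\""]]
);
```

Both graphs use the label _:1 for different nodes, so without the renaming the merged graph would wrongly attach the imported name to the colorControl property.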

gkellogg commented 7 years ago

Changes to the JSON-LD specs to support this are in json-ld/json-ld.org#449. This is also supported in my Ruby implementation.

I expect changes will make it into the editors drafts of JSON-LD and JSON-LD-API in about a week.

vcharpenay commented 7 years ago

The serialization of RDF graphs in DOT is actually a good idea. Since it is a textual format, the WG could use it to design the abstract Thing model. It is also possible to serialize RDFS/OWL ontologies in DOT, using simple SPARQL CONSTRUCT queries, which means we could more easily compare your work with the ontologies for WoT that have been proposed: from Siemens and from UPM.

Is your code versioned somewhere? In particular, I think the community would benefit from aligning the libraries rdf.js and rdf2dot.js with the W3C RDF Interfaces specification. There exist several npm modules implementing it that are equivalent to your rdf.js, and I have added an extension for DOT serialization similar to rdf2dot.js (see https://www.npmjs.com/package/rdf-dot). I hope to extend it in the near future to render the same output as in your demo.

draggett commented 7 years ago

I did start by looking at the W3C Note for RDF Interfaces, and copied some interfaces from that, but soon found that the rest weren't particularly useful for the kinds of tasks needed for the SHRL shape rules validator. I aim to keep working on the validator to extend the suite of tests to cover different aspects of thing descriptions, as well as semantic interoperability of compositions of things. I will eventually put all of the code on Github, but for now it is available via the demos on the W3C server.

The following diagram shows most of the linked data model for the object model for things that I developed in 2016 based upon a survey of IoT platforms. I am continuing to expand that survey and aim to publish it as a W3C Note. I am currently considering use cases for Linked Data streams and the potential requirement for RDF triples and graphs as first class data types. This focuses on the Web of things and Linked Data as an abstraction layer for integration within and between enterprises. In essence the Web of things isn't just about constrained devices and the network edge.

Different purposes generally call for different diagrams. In Santa Clara, we discussed the role of UML entity relationship diagrams as a potential means to abstract away from Linked Data. When we want to be specific about Linked Data, then shape grammars are perhaps easier to understand. Click on the diagram below for a full sized version.

[Diagram: wot-ld-model]

egekorkan commented 4 years ago

Given that TDs are JSON-LD based, this is doable with any JSON-LD library as far as I know. I am closing this, but it can be reopened in the wot-thing-description repository.