Nested Named Graphs (NNG)

Overview

Nested Named Graphs (NNG) is a proposal to the RDF 1.2 Working Group [0]. It provides a simple facility to enable annotations on RDF in RDF.

The proposal doesn't require any changes or additions to the abstract syntax of RDF [1] and can be deployed in quad stores that support RDF 1.1 datasets [2]. It is realized as a combination of syntactic sugar added to the popular TriG syntax [3] and a vocabulary to ensure sound semantics [4]. Mappings to triple-based approaches are provided. The proposal can be tested in a publicly accessible prototype implementation with SPARQL [5] support.

After an initial version of this proposal was presented to the RDF 1.2 WG (see the "Attic" section below), a few concerns and requests were voiced: more examples, shorter examples, formalization, shorter presentation, etc. Some of these requests obviously contradict each other, and some are hard to meet at this point. This small site tries to provide both a short introduction in this section and more detailed discussions in separate sections (see links below).

Please be aware that annotations in RDF are a pretty complex topic and that the RDF-star proposal, although apparently simple, fails to address those complexities in meaningful ways. A standard that takes shortcuts without properly thinking through the consequences will not do anyone a favour. A useful annotation mechanism has to be simple at its core, but the way to develop it obviously isn't. The NNG proposal addresses a lot of concerns that RDF-star glosses over but that shouldn't be ignored. So please take the necessary time to consider this proposal.

Concept

Nested Named Graphs aim to integrate into existing applications without getting in the way of less involved use cases:

The design of Nested Named Graphs aims for the least surprise. Annotated or not, they are always:

Syntax

The main component of the proposal is a syntactic extension to TriG that adds the ability to nest named graphs inside each other. The following short example may give a first impression of its various virtues:

prefix :    <http://ex.org/>
prefix nng: <http://nng.io/>
:G1 {
    :G2 {
        :Alice :buys :Car .
        :G2 nng:domain [ :age 20 ] ;           # Alice, not the car, is 20 years old
            nng:relation [ :payment :Cash ] ;  
            nng:range nng:Interpretation ,     # Alice buys a car, not a website
                       [ :color :black ].  
    } :source :Denis ;                         # an annotation on the graph
      :purpose :JoyRiding .                    # sloppy, but not too ambiguous
    :G3 {    
        [] {                                   # graphs may be named by blank nodes 
            :Alice :buys :Car .                # probably a different car buying event
            THIS nng:domain [ :age 28 ] .      # self reference
        } :source :Eve .    
    } :todo :AddDetail .                       # add detail
}                                              # then remove this level of nesting
                                               # without changing the data topology

The same as N-Quads:

<http://ex.org/G1>    <http://nng.io/transcludes> <http://ex.org/G2>                                <http://ex.org/G1> .
<http://ex.org/Alice> <http://ex.org/buys>        <http://ex.org/Car>                               <http://ex.org/G2> .
<http://ex.org/G2>    <http://nng.io/subject>     _:o-37                                            <http://ex.org/G2> .
_:o-37                <http://ex.org/age>         "20"^^<http://www.w3.org/2001/XMLSchema#integer>  <http://ex.org/G2> .
<http://ex.org/G2>    <http://nng.io/predicate>   _:o-38                                            <http://ex.org/G2> .
_:o-38                <http://ex.org/payment>     <http://ex.org/Cash>                              <http://ex.org/G2> .
<http://ex.org/G2>    <http://nng.io/object>      <http://nng.io/Interpretation>                    <http://ex.org/G2> .
<http://ex.org/G2>    <http://nng.io/object>      _:o-39                                            <http://ex.org/G2> .
_:o-39                <http://ex.org/color>       <http://ex.org/black>                             <http://ex.org/G2> .
<http://ex.org/G2>    <http://ex.org/source>      <http://ex.org/Denis>                             <http://ex.org/G1> .
<http://ex.org/G2>    <http://ex.org/purpose>     <http://ex.org/JoyRiding>                         <http://ex.org/G1> .
<http://ex.org/G1>    <http://nng.io/transcludes> <http://ex.org/G3>                                <http://ex.org/G1> .
<http://ex.org/G3>    <http://nng.io/transcludes> _:b41                                             <http://ex.org/G3> .
_:b41                 <http://nng.io/subject>     _:o-42                                            _:b41 .
<http://ex.org/Alice> <http://ex.org/buys>        <http://ex.org/Car>                               _:b41 .
_:o-42                <http://ex.org/age>         "28"^^<http://www.w3.org/2001/XMLSchema#integer>  _:b41 .
_:b41                 <http://ex.org/source>      <http://ex.org/Eve>                               <http://ex.org/G3> .
<http://ex.org/G3>    <http://ex.org/todo>        <http://ex.org/AddDetail>                         <http://ex.org/G1> .
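
The nesting structure can be recovered from the quads alone by following the nng:transcludes statements. The following is a minimal Python sketch, not part of the proposal or its prototype: the helper names (parse_quads, nesting, outline) are made up for illustration, and splitting lines on whitespace is a shortcut that only works for simple quads like the ones above, not a general N-Quads parser.

```python
from collections import defaultdict

TRANSCLUDES = "<http://nng.io/transcludes>"

# A subset of the N-Quads listing above (fragment annotations omitted).
NQUADS = """\
<http://ex.org/G1> <http://nng.io/transcludes> <http://ex.org/G2> <http://ex.org/G1> .
<http://ex.org/Alice> <http://ex.org/buys> <http://ex.org/Car> <http://ex.org/G2> .
<http://ex.org/G2> <http://ex.org/source> <http://ex.org/Denis> <http://ex.org/G1> .
<http://ex.org/G1> <http://nng.io/transcludes> <http://ex.org/G3> <http://ex.org/G1> .
<http://ex.org/G3> <http://nng.io/transcludes> _:b41 <http://ex.org/G3> .
<http://ex.org/Alice> <http://ex.org/buys> <http://ex.org/Car> _:b41 .
_:b41 <http://ex.org/source> <http://ex.org/Eve> <http://ex.org/G3> .
"""


def parse_quads(text):
    """Split each line into (subject, predicate, object, graph).

    Assumption: terms contain no whitespace and every line ends in ' .'.
    """
    quads = []
    for line in text.splitlines():
        if not line.strip():
            continue
        s, p, o, g, _dot = line.split()
        quads.append((s, p, o, g))
    return quads


def nesting(quads):
    """Map each graph name to the graphs it directly transcludes."""
    children = defaultdict(list)
    for s, p, o, _g in quads:
        if p == TRANSCLUDES:
            children[s].append(o)
    return dict(children)


def outline(children, root, depth=0):
    """Render the nesting below `root` as indented lines."""
    lines = ["  " * depth + root]
    for child in children.get(root, []):
        lines.extend(outline(children, child, depth + 1))
    return lines


if __name__ == "__main__":
    tree = nesting(parse_quads(NQUADS))
    print("\n".join(outline(tree, "<http://ex.org/G1>")))
```

Run on the subset above, the outline shows :G2 and :G3 nested in :G1, and the blank-node graph nested in :G3, which mirrors the bracket structure of the TriG-based example.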

For a more extensive set of simple examples, check out the Introduction by Example.

A complementary syntactic extension to JSON-LD remains TBD.

Mappings to triple-based formats like Turtle and N-Triples are provided (or worked on). A mapping to RDF/XML so far isn't planned, but might be based on RDF/XML's syntactic sugar for RDF standard reification.

See also an example of a BNF for the NNG syntax, which is close, though not identical, to the version actually deployed in the prototype notebook (see below).

Fragments and Identification Semantics

The introductory example makes use of a fragment identification vocabulary to annotate individual terms of a statement. This can be helpful e.g. to faithfully represent Labeled Property Graphs in RDF, as it makes it possible to clearly separate e.g. provenance annotations on the whole graph from qualifications of the relation type.

Fragment identification also comes in handy when identification semantics need to be disambiguated, e.g. to clarify whether an IRI is used to refer to a web resource or to the entity that the web resource describes. Another possible application is the disambiguation of graph naming semantics when using the graph name in a statement, e.g. in an annotation on the graph.

Configurable Semantics

RDF standard reification is often used to document "unasserted assertions", that is, statements that are to be documented but not endorsed (although that interpretation of the reification vocabulary is not supported by the RDF specification). The RDF-star CG report favors a related form of citation, but with stricter semantics. The NNG proposal supports both use cases with specific syntactic sugar; see the section on citation semantics.

The underlying mechanism of configurable inclusion of graph literals can be used for much more elaborate configurable semantics use cases, e.g. to support closed world semantics on selected parts of the data.

Querying

The discussion of matters related to querying is not finished yet. Simple querying tasks are pretty straightforward. Querying for statements with non-standard semantics is straightforward as well.
For some of the more complicated questions w.r.t. graph nesting and query traversal of nested graphs, see an example walk-through and the accompanying shell script. See also the comparison to RDFn. However, for querying of nested graphs to become as easy as authoring them, the traversal of chains of nested graphs has to become easier than it is now. SPARQL's lack of support for queries across graphs is a problem here. The example walk-through provides a solution, but it's not easy enough yet. We are currently investigating what to do about that.
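
To illustrate what traversing a chain of nested graphs involves, here is a minimal Python sketch that works on an in-memory list of quads rather than on a SPARQL endpoint. It is not the solution from the walk-through: the function names are hypothetical, the quads are abbreviated with prefixed names from the introductory example, and it assumes that each graph is transcluded by at most one other graph.

```python
# Quads as (subject, predicate, object, graph), abbreviated from the
# N-Quads version of the introductory example.
QUADS = [
    (":G1", "nng:transcludes", ":G2", ":G1"),
    (":Alice", ":buys", ":Car", ":G2"),
    (":G2", ":source", ":Denis", ":G1"),
    (":G2", ":purpose", ":JoyRiding", ":G1"),
    (":G1", "nng:transcludes", ":G3", ":G1"),
    (":G3", "nng:transcludes", "_:b41", ":G3"),
    (":Alice", ":buys", ":Car", "_:b41"),
    ("_:b41", ":source", ":Eve", ":G3"),
    (":G3", ":todo", ":AddDetail", ":G1"),
]


def parent_of(graph, quads):
    """Return the graph that transcludes `graph`, or None at the top level."""
    for s, p, o, g in quads:
        if p == "nng:transcludes" and o == graph and s == g:
            return s
    return None


def enclosing_chain(graph, quads):
    """List `graph` followed by all graphs it is nested in, innermost first."""
    chain, seen = [], set()
    while graph is not None and graph not in seen:  # guard against cycles
        chain.append(graph)
        seen.add(graph)
        graph = parent_of(graph, quads)
    return chain


def annotations_along_chain(graph, quads):
    """Collect annotations made from outside on each graph in the chain."""
    found = []
    for name in enclosing_chain(graph, quads):
        for s, p, o, g in quads:
            if s == name and g != name and not p.startswith("nng:"):
                found.append((name, p, o))
    return found


if __name__ == "__main__":
    for annotation in annotations_along_chain("_:b41", QUADS):
        print(annotation)
```

Each step up the chain requires looking into a different graph, which is the cross-graph traversal that is awkward to express in SPARQL today. Whether an annotation on an outer graph should be taken to apply to statements in an inner graph is a modelling decision that this sketch does not settle.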

Public Notebook

A prototype implementation in the Dydra graph store [6], including an appropriate extension to SPARQL, provides a public notebook that can be used to explore, test and play around with the proposal.
Be aware, however, of a few caveats. The notebook is not multi-user enabled: if two users play with it at the same time, they may overwrite each other's sample data. It also doesn't yet support all syntactic sugar; support of syntactic sugar for graph literals in particular is still sketchy. And last but not least, it is not helpful w.r.t. syntax errors: it won't point out where you forgot a punctuation mark.

Details

Semantics

Two different approaches to the semantics of Nested Named Graphs are possible. One may either define the semantics via a mapping to triples or via the provision of means to describe the semantics of named graphs.

Mappings to triples can go three ways. They can follow the singleton properties approach [8], which provides a still quite usable surface syntax. They can follow a fluents-based approach [9], which has more favorable entailment properties but tends to alienate non-expert users (whereas user studies have found that expert users do indeed like its straightforwardness). Finally, they can go the way of n-ary relations [11]. These mappings all stay close to standard RDF and are discussed in detail in a separate section.
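
As a rough illustration of the first option, the following Python sketch shows the general singleton properties pattern: the statement's predicate is replaced by a fresh, statement-specific property, which is then linked to the original property and carries the annotations. This is the generic pattern only, not necessarily the mapping the proposal defines. The function name and the "#suffix" naming scheme for the minted property are assumptions, and rdf:singletonPropertyOf is the property name used in the singleton properties literature [8], not a term of the standard RDF vocabulary.

```python
# Property name as used in the singleton properties literature;
# it is not defined by the RDF specifications themselves.
SINGLETON_PROPERTY_OF = (
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#singletonPropertyOf"
)


def to_singleton_property(statement, annotations, suffix):
    """Rewrite one annotated statement into plain triples.

    statement   -- (subject, predicate, object)
    annotations -- list of (predicate, object) pairs about the statement
    suffix      -- caller-supplied token that makes the minted property
                   unique (hypothetical naming scheme)
    """
    s, p, o = statement
    singleton = f"{p}#{suffix}"
    triples = [
        (s, singleton, o),                      # the statement itself
        (singleton, SINGLETON_PROPERTY_OF, p),  # link to the original property
    ]
    # Annotations attach to the statement-specific property.
    triples.extend((singleton, ap, ao) for ap, ao in annotations)
    return triples


if __name__ == "__main__":
    triples = to_singleton_property(
        ("http://ex.org/Alice", "http://ex.org/buys", "http://ex.org/Car"),
        [("http://ex.org/source", "http://ex.org/Denis")],
        suffix="1",
    )
    for triple in triples:
        print(triple)
```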

Defining Nested Named Graphs as an extension of named graphs is syntactically straightforward, but has two downsides. First, it requires support of quads in an RDF store, which, while certainly quite common, is neither the norm nor ubiquitous. Second, the semantics of named graphs is undefined, an issue that the RDF 1.1 WG failed to resolve. Our proposal does not impose a more definite semantics on everybody, but provides a means to specify a clearer semantics on demand. To that end we provide a small vocabulary to solidly define the semantics of named graphs (nested or not), either as a default arrangement per dataset or per graph individually via the SPARQL dataset description vocabulary. No change whatsoever to already deployed named graphs is required.

Instead, using the proposed nesting mechanism implicitly fixes the semantics. However, this fixing applies only to the context of nesting, just as addressing a graph in a WHERE clause unambiguously makes the name address the graph without any further consequences for the naming semantics of named graphs in general (e.g. when using the graph name outside a WHERE clause). Our proposal defines the semantics of nested graphs in a way that reflects users' intuitions, bridging the gap between the abstract definitions in RDF and actual practice:

The Architecture and Politics of Semantics

We would like to be quite clear about our approach to semantics: the proposed semantics doesn't change anything for anybody in practice, it just reflects realities that nobody can escape anyway. It does, however, reject all approaches that prolong the mismatch between the abstract set-based type semantics of RDF and its predominantly token-based reality. In that respect NNG takes a strong stance opposite to the semantics proposed by the RDF-star CG and to any other approach that insists on understanding named graphs as opaque types. In our opinion, such approaches make the logically safe formalistic tail wag the very practical semantic web dog.

We've spoken to many SemWeb'ers, rather pedestrian users and implementors as well as well-respected academics, who couldn't care less about the model-theoretic semantics of RDF anyway, but see RDF's virtues mainly as an interchange syntax supplemented by a rough consensus about meaning via shared vocabularies. That is the basis on which this approach designs the semantics of Nested Named Graphs. Ignoring those people's stance will only make the already bad standing of formal semantics on the semantic web even worse. We do, however, claim that their intuitions can be matched and incorporated into the semantic foundations of RDF without breaking anything, by adding an additional layer of interpretation as a means to separate concerns.

Note that this proposal does not dismiss the more abstract logic-oriented applications like reasoning over graph entailments. To that end it provides a separate mechanism tailored to use cases that need to describe and reason about graph types: graph literals. Graph literals not only provide a sound means to describe and reason about graphs as abstract types, they also provide the basic primitive to implement non-standard semantics via an additional inclusion mechanism. Syntactic sugar is provided for popular use cases like unasserted assertions and referentially opaque quotation, but the mechanism itself is extensible as desired, and an example vocabulary to support such extensions is provided.

Design Considerations

Metamodelling in RDF (annotating, contextualizing and reifying simple statements to better deal with complex knowledge representation needs) has been a focus of work for as long as RDF itself has existed. For an extensive treatment of the topic, see the 300+ page "Between Facts and Knowledge - Issues of Representation on the Semantic Web" (PDF). One thing we learned from this huge corpus of work is that the one magic trick that resolves all the problems around complex modelling tasks in RDF most probably doesn't exist: the needs and expectations w.r.t. metamodelling in RDF are so diverse that probably only a clever combination of techniques can meet them all reasonably well. Consequently we need to get creative, and we need to break some rules:

Use Cases

The RDF 1.2 WG is still consolidating use cases [7]. On an abstract level the NNG approach has concentrated on meeting the following demands:

The examples illustrate how Nested Named Graphs meet those demands.

Implementation

For a rough impression of one way to implement this, check the note and the diffs in the Dydra directory.

Attic

The Semantic/Nested Named Graphs proposal was presented to the RDF 1.2 WG by means of

Discussions in the WG led to some modifications that resulted in this version here, which is completely conformant to the existing RDF 1.1 model and abstract syntax, at the expense of a bit of semantic rigidity. Some aspects like the inclusion/transclusion mechanism have undergone some changes too, so consult those older texts with caution.

References

[0] RDF 1.2 Working Group
[1] RDF 1.1 Concepts and Abstract Syntax
[2] RDF 1.1: On Semantics of RDF Datasets
[3] TriG
[4] RDF 1.1 Semantics
[5] SPARQL 1.1
[6] Dydra graph store
[7] RDF 1.2 WG Use Cases
[8] Singleton Properties
[9] NdFluents
[10] Pat Hayes' mail to the RDF 1.1 WG (many thanks to Niklas for unearthing this!)
[11] W3C Note Defining N-ary Relations on the Semantic Web