plt-tud / r43ples

Revision Management for the Semantic Web

R43ples

R43ples (Revision for triples) is an open source Revision Management Tool for the Semantic Web.

It provides different revisions of named graphs via a SPARQL interface. All information about revisions, changes, commits, branches and tags is stored in additional named graphs beside the original graph in an attached external triple store.


This project provides an enhanced SPARQL endpoint for revision management of named graphs. R43ples uses an internal Jena TDB or is attached to an existing SPARQL endpoint of a triplestore, and acts as another endpoint both for normal SPARQL queries and for revision-enhanced SPARQL queries, named R43ples queries. The R43ples endpoint allows you to specify, for each named graph used inside a SPARQL query, which revision should be queried. All revision information is stored in additional graphs in the attached Jena TDB.

The website of R43ples contains further project information, including Javadocs of the develop branch. A running test server should be available at http://eatld.et.tu-dresden.de:9998/r43ples/sparql

Getting Started

Dependencies

Compiling

Maven is used for compiling:

mvn compile exec:java

Packages (JAR with dependencies for the webservice) can be built with:

mvn package

Running

R43ples runs with a standalone web server:

java -jar target/r43ples.jar

Releases

Releases are stored on GitHub.

Stable and latest Docker images are also available:

docker pull plttud/r43ples

Run R43ples with the default configuration via Docker:

docker run -p 9998:9998 plttud/r43ples

Run with a specific configuration file mounted into the container:

docker run -p 9998:9998 -v $PWD/r43ples.conf:/r43ples.conf plttud/r43ples

Configuration

There is a configuration file named resources/r43ples.conf. The most important settings are the following:

The logging configuration is stored in resources/log4j.properties

Interfaces

Extended SPARQL endpoint

The SPARQL endpoint is available at:

[uri]:[port]/r43ples/sparql

The endpoint directly accepts SPARQL queries with HTTP GET or HTTP POST parameters for query and format:

[uri]:[port]/r43ples/sparql?query=[]&format=[]
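As a minimal sketch of the GET form described above, the following Python snippet only builds the request URL and does not contact a server. The host `http://localhost:9998` and the format value `application/json` are assumptions for illustration; the helper name `build_sparql_url` is hypothetical and not part of R43ples.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_sparql_url(base, query, fmt):
    """Build a GET URL for the endpoint pattern [uri]:[port]/r43ples/sparql.

    `base` is the [uri]:[port] part; `query` and `fmt` are sent as the
    `query` and `format` parameters and are percent-encoded here.
    """
    params = urlencode({"query": query, "format": fmt})
    return f"{base}/r43ples/sparql?{params}"


# Assumed values for illustration only.
sparql = "SELECT * WHERE { ?s ?p ?o } LIMIT 5"
url = build_sparql_url("http://localhost:9998", sparql, "application/json")

# Decoding the URL again yields the original parameter values.
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded["query"][0])
```

Encoding the query with `urlencode` matters because SPARQL text contains characters such as `{`, `?` and spaces that are not valid in a raw URL.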

Supported Formats

The format can be specified as the URL parameter format, as the HTTP POST parameter format, or via the HTTP header Accept:
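The header variant can be sketched as follows. The snippet prepares a request object with an `Accept` header instead of a `format` parameter and does not send it. The endpoint address and the media type `application/sparql-results+json` are assumptions (the list of formats actually supported is not reproduced here), and `build_request_with_accept` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request_with_accept(endpoint, query, accept):
    """Prepare a GET request that selects the result format via Accept."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": accept})


# Assumed endpoint and media type, for illustration only.
req = build_request_with_accept(
    "http://localhost:9998/r43ples/sparql",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 5",
    "application/sparql-results+json",
)
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
```

Passing the prepared object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a server.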

R43ples keywords

There are some additional keywords which extend SPARQL and can be used to control the revisions of graphs:

Query Rewriting option

R43ples offers a query rewriting option that improves performance. With this option, the required revision is no longer generated temporarily. Instead, the SPARQL query is rewritten so that the branch and the change sets are joined directly inside the query, taking the order of the change sets into account. The option is still under development and subject to further research.

The option can be enabled by passing the additional parameter "query_rewriting=true".
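A possible way to pass the parameter in an HTTP POST is sketched below. It assumes that `query_rewriting` is sent as a form parameter next to `query` and `format`; the endpoint address, the format value and the helper name `build_rewriting_request` are likewise assumptions. The request is only prepared, not sent.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request


def build_rewriting_request(endpoint, query, fmt):
    """Prepare a POST request with query rewriting switched on."""
    form = {"query": query, "format": fmt, "query_rewriting": "true"}
    body = urlencode(form).encode("utf-8")
    return Request(endpoint, data=body, method="POST")


# Assumed endpoint and format, for illustration only.
req = build_rewriting_request(
    "http://localhost:9998/r43ples/sparql",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 5",
    "application/json",
)
sent = parse_qs(req.data.decode("utf-8"))
print(sent["query_rewriting"][0])
```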

It currently supports:

For more details, have a look into the doc/ directory.

Debug SPARQL endpoint

R43ples forwards queries sent to the debug endpoint directly to the attached triplestore, without any revision handling. This endpoint can therefore be used for debugging purposes.

[uri]:[port]/r43ples/debug

API

R43ples additionally exposes some functionality via an external API, even though all information can also be queried directly from the triplestore.

Concept of R43ples

Extended SPARQL proxy

R43ples itself does not store any information. All information in the revised graphs and about the revised graphs is stored in the attached triplestore. R43ples acts only as a proxy which evaluates additional revision information in the SPARQL queries.

System Structure

Revision information

All information about the revision history of all named graphs is stored in the named graph http://eatld.et.tu-dresden.de/r43ples-revisions (as long as not configured otherwise in the configuration file).
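Because the revision history lives in an ordinary named graph, it can be inspected with a plain SPARQL query. The sketch below builds such a query and a URL for the debug endpoint; it assumes the default graph name given above, a local server on port 9998, and that the debug endpoint accepts a `query` parameter in the same way as the main endpoint. Both helper names are hypothetical.

```python
from urllib.parse import urlencode

# Default revision graph name, unless changed in the configuration file.
REVISION_GRAPH = "http://eatld.et.tu-dresden.de/r43ples-revisions"


def revision_graph_query(graph_uri=REVISION_GRAPH, limit=20):
    """Return a SPARQL query listing triples of the revision graph."""
    return (
        f"SELECT ?s ?p ?o WHERE {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }} "
        f"LIMIT {int(limit)}"
    )


def debug_url(base, query):
    """Build a GET URL for the debug endpoint (assumed `query` parameter)."""
    return f"{base}/r43ples/debug?{urlencode({'query': query})}"


query = revision_graph_query()
url = debug_url("http://localhost:9998", query)
print(query)
print(url)
```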

Here, the Revision Management Ontology (RMO) is used to model revisions, branches and tags. Furthermore, commits are stored which connect each revision, tag and branch with its prior revision.

The RMO is derived from the PROV ontology: RMO example

An exemplary revision graph is shown here: ![RMO example](./doc/revision management description/r43ples-creategraph.png)

HTTP Header information

Each response carries the revision information of the graphs specified in the request in the r43ples-revisiongraph HTTP header field. This information follows the RMO and is transferred as a Turtle serialization.

Clients can also pass this information in R43ples update queries to the R43ples server via the r43ples-revisiongraph HTTP header attribute. The server will check if the client is aware of the most recent version of the involved revised graphs. If this is not the case, the update query will be rejected.
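The round trip described above can be sketched on the client side as follows: read the header from a previous response and attach it to the next update request. The snippet does not contact a server. It assumes that updates are sent as the `query` POST parameter and that the Turtle value fits on a single header line; the endpoint address, the placeholder header value, the example update and both helper names are hypothetical, and R43ples-specific update keywords are not shown.

```python
from urllib.parse import urlencode
from urllib.request import Request

REVISION_HEADER = "r43ples-revisiongraph"


def extract_revision_info(response_headers):
    """Return the revision information from response headers, or None.

    Header names are compared case-insensitively, as HTTP requires.
    """
    for name, value in response_headers.items():
        if name.lower() == REVISION_HEADER:
            return value
    return None


def build_update_request(endpoint, update, revision_info):
    """Prepare a POST update that echoes the known revision information."""
    body = urlencode({"query": update}).encode("utf-8")
    req = Request(endpoint, data=body, method="POST")
    if revision_info is not None:
        req.add_header(REVISION_HEADER, revision_info)
    return req


# Placeholder standing in for the Turtle returned by an earlier response.
previous_headers = {"R43ples-Revisiongraph": "# Turtle placeholder"}
info = extract_revision_info(previous_headers)

# Plain SPARQL update used only as an example payload.
update = "INSERT DATA { GRAPH <http://example.org/g> { <http://example.org/s> <http://example.org/p> 'o' } }"
req = build_update_request("http://localhost:9998/r43ples/sparql", update, info)
print(req.header_items())
```

If the server reports that the passed revision information is outdated, a client would typically re-read the affected graphs, obtain the current header value, and retry the update.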

Used libraries and frameworks

The following libraries are used in R43ples:

Literature

The following articles describe the functionality and rationale of R43ples: