kbss-cvut / s-pipes

Tool for execution of RDF-based pipelines.
GNU Lesser General Public License v3.0

SPipes

SPipes is a tool for managing semantic pipelines defined in RDF, inspired by the SPARQLMotion language. Each node in a pipeline represents a stateless transformation of data. SPipes Editor is a tool for viewing, editing, executing, and debugging SPipes scripts.

Basic concepts

All terms defined in this section refer to SPipes terminology:

SPipes Features

Loading of Pipelines

SPipes loads pipelines by recursively traversing the configured directories and searching for ontology files, which are identified by the .ttl suffix. Global scripts are identified by the suffix .sms.ttl. A script is identified by the IRI of the ontology in which it is defined. Ontology imports (using the OWL property owl:imports) can be used to modularize scripts into multiple files. A script defines the set of pipelines found in its ontology import closure.
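The file discovery described above can be sketched in a few lines of Python. This is only an illustration of the suffix rules, not the loader used by the SPipes engine; the function name find_script_files is hypothetical, and the sketch neither parses the files nor follows owl:imports.

```python
from pathlib import Path


def find_script_files(root):
    """Recursively collect .ttl files under root (hypothetical helper).

    Returns two sorted lists: global scripts (*.sms.ttl) and all other
    ontology files (*.ttl).
    """
    global_scripts, other_ontologies = [], []
    for path in sorted(Path(root).rglob("*.ttl")):
        if not path.is_file():
            continue
        # Check the longer suffix first: "x.sms.ttl" also ends with ".ttl".
        if path.name.endswith(".sms.ttl"):
            global_scripts.append(path)
        else:
            other_ontologies.append(path)
    return global_scripts, other_ontologies
```

Calling find_script_files on a configured script directory returns the files that would be candidates for loading, with the global scripts separated from the remaining ontology files.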

Examples

SPipes script construction, execution, and execution history tracking are explained in the Hello world example. Script debugging is explained in the skosify example. Working with an RDF4J repository is explained in the rdf4j example. Constraint validation is described in the constraint validation example.

Structure of Maven Submodules

SPipes Core

The Maven module SPipes Core provides the core functionality of the SPipes engine, the ontology manager, and auditing. It contains the configuration file config-core.properties, which configures the directories from which scripts are loaded.

SPipes Web

Web user interface for SPipes that allows executing any function defined in global scripts. A function can be called with an HTTP GET request:

$WEB_APP_URL/service?_pId=$FUNCTION&$PARAM_NAME_1=$PARAM_VALUE_1&$PARAM_NAME_2=$PARAM_VALUE_2..., where

Example call: https://localhost:8080/s-pipes/service?_pId=my-function&repositoryName=myRepository

In addition, there is a list of reserved parameter names.
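As a minimal sketch, the request URL in the pattern above can be assembled with Python's standard library. The helper name build_service_url is hypothetical, and the base URL and parameter values are taken from the example call; the sketch only builds the URL and does not send the request.

```python
from urllib.parse import urlencode


def build_service_url(web_app_url, function_id, **params):
    """Build the GET URL that calls a function exposed by SPipes Web (hypothetical helper)."""
    # _pId selects the function; the remaining pairs are passed as its parameters.
    query = urlencode({"_pId": function_id, **params})
    return f"{web_app_url.rstrip('/')}/service?{query}"


url = build_service_url(
    "https://localhost:8080/s-pipes",
    "my-function",
    repositoryName="myRepository",
)
print(url)
# prints https://localhost:8080/s-pipes/service?_pId=my-function&repositoryName=myRepository
```

Using urlencode keeps parameter values with spaces or special characters correctly escaped in the query string.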

SPipes CLI

The Maven module SPipes CLI provides a command-line interface to the SPipes engine. The script directories configured in config-core.properties can be overridden with the environment variable SPIPES_ONTOLOGIES_PATH. For example, in a UNIX shell, the following command can be used:

export SPIPES_ONTOLOGIES_PATH="/home/someuser/s-pipes-scripts"

SPipes Modules Registry

Defines the dependencies on all specific module types that are used by both the Web and CLI interfaces.

SPipes Modules

Contains specific SPipes module types.

SPipes Modules Utils

Contains developer tools for working with SPipes module types. Specifically:

SPipes Model

Defines the Java model used to serialize metadata about pipeline executions. It is based on JOPA (Java OWL Persistence API), which is used to access the OWL ontologies where this metadata is stored.

Development Environment Setup

The following software needs to be installed on the system for development:

Dockerization

Building the Docker Image

The Docker image of the SPipes backend can be built using the following command:

docker build -t s-pipes-engine .

Running the Docker Container

SPipes web can be run and exposed at port 8080 with the following command:

docker run -p 8080:8080 s-pipes-engine:latest

The endpoint will be available at http://localhost:8080/s-pipes

Configuration of SPipes scripts

By default, scripts are loaded from the directory /scripts within the filesystem of the s-pipes-engine image. This directory already contains the necessary definitions of reusable modules, so new scripts must be added under it. All subdirectories of /scripts are searched recursively. A good practice is to mount local scripts into a subdirectory such as /scripts/root:

docker run -v ./my-scripts:/scripts/root -p 8080:8080 s-pipes-engine:latest

Another option is to redefine where the SPipes engine searches for scripts using CONTEXTS_SCRIPTPATHS:

docker run -e CONTEXTS_SCRIPTPATHS=/my/special/path -p 8080:8080 s-pipes-engine:latest

This is particularly useful for sharing the same path between the host filesystem and the Docker image, as explained in the following section.

Aligning file paths between the docker service and host system

For your SPipes script files, you can align file paths between Docker services and your host system using mounting. This allows a directory to be accessible from both Docker services and the host filesystem, ensuring that file paths remain the same. Consequently, you can copy an absolute path to a file from a Docker service and open it on the host filesystem, and vice versa.

The directory typically shared this way is /home on Linux and /host_mnt/c on Windows, because Docker on Windows replaces C: with /host_mnt/c. When Docker on Windows is run from within a WSL distribution, C: is accessible through /mnt/c.

To mount a directory from your host machine to the Docker container, use one of the following commands:

Linux:

docker run -v /home:/home -p 8080:8080 s-pipes-engine:latest

Windows:

docker run -v /host_mnt/c:/host_mnt/c -p 8080:8080 s-pipes-engine:latest

Windows, but running inside WSL:

docker run -v /mnt/c:/mnt/c -p 8080:8080 s-pipes-engine:latest
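To illustrate the Windows mappings above, the following sketch translates an absolute Windows path into the form it takes under those mounts. The function name to_container_path is hypothetical. The text above only describes the C: drive, so applying the same lowercase-letter rule to other drive letters is an assumption.

```python
from pathlib import PureWindowsPath


def to_container_path(windows_path, wsl=False):
    """Translate an absolute Windows path to its mounted form (hypothetical helper).

    C: becomes /host_mnt/c, or /mnt/c when Docker is run from within WSL.
    Other drive letters are handled the same way, which is an assumption.
    """
    path = PureWindowsPath(windows_path)
    # Require a plain drive letter such as "C:" followed by a root separator.
    if len(path.drive) != 2 or not path.root:
        raise ValueError(f"expected an absolute path with a drive letter: {windows_path}")
    drive_letter = path.drive.rstrip(":").lower()
    prefix = "/mnt" if wsl else "/host_mnt"
    # path.parts[0] is the anchor ("C:\\"); the rest are directory and file names.
    return "/".join([prefix, drive_letter, *path.parts[1:]])


print(to_container_path(r"C:\Users\me\scripts\hello.sms.ttl"))
# prints /host_mnt/c/Users/me/scripts/hello.sms.ttl
```

On Linux no translation is needed, because /home is mounted to the same path inside the container.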

Swagger

Swagger documents the REST API. The Swagger UI can be opened at SPIPES_URL/swagger-ui.html.

Licenses of Reused Software Components

In addition to the software dependencies included through Maven, see the list of reused software components and their licenses.