KNowledgeOnWebScale / walder

Walder offers an easy way to set up a website or Web API on top of decentralized knowledge graphs. Knowledge graphs incorporate data together with the meaning of that data. This makes it possible to combine data from multiple knowledge graphs, even if different, independent parties maintain or host them. Knowledge graphs can be hosted via Solid PODs, SPARQL endpoints, Triple Pattern Fragments interfaces, RDF files, and so on. Using content negotiation, Walder makes the data in these knowledge graphs available to clients via HTML, RDF, and JSON-LD. Users define in a configuration file which data Walder uses and how it processes this data. Find out which APIs are built with Walder here.

Installation

Install Walder globally via yarn global add walder.

For development, follow these steps:

  1. Clone the repository.
  2. Install dependencies via yarn install.
  3. Install Walder globally via $ yarn global add file:$(pwd).

Usage

Walder is available as a CLI and JavaScript library.

CLI

Usage: walder [options]

Options:
  -v, --version              output the version number
  -c, --config <configFile>  YAML configuration file input
  -p, --port [portNumber]    server port number (default: 3000)
  -l, --log [level]          enable logging and set logging level (one of [error, warn, info, verbose, debug]) (default: "info")
  --no-cache                 disable Comunica default caching
  --lenient                  turn Comunica errors on invalid data into warnings
  -h, --help                 output usage information

Library

// From the root directory
const Walder = require('.');

const configFilePath = 'path/to/configfile';
const port = 9000; // Defaults to 3000
const logging = 'info'; // Defaults to 'info'
const cache = false; // Defaults to true
const lenient = true; // Defaults to false
const cwd = process.cwd(); // Assumption: the directory against which Walder resolves relative paths

const walder = new Walder(configFilePath, {port, logging, cache, lenient, cwd});

walder.activate();    // Starts the server
walder.deactivate();  // Stops the server

Config file structure

You write a config file in YAML following the OpenAPI 3.0 specification. The config file must have the following structure:

openapi: 3.0.2
info:  # OpenAPI metadata
  title: 'Example site'
  version: 0.1.0
x-walder-resources:  # Directories used by Walder - OPTIONAL
  root:  # Path to the root folder of the directories used by Walder (absolute or relative to the directory containing the config file) - OPTIONAL (default: .)
  views:  # Path to directory containing view template files (absolute or relative to the root folder) - OPTIONAL (default: views)
  pipe-modules:  # Path to directory containing local pipe modules (absolute or relative to the root folder) - OPTIONAL (default: pipe-modules)
  public:  # Path to directory containing all files that should be available statically (e.g. stylesheets) (absolute or relative to the root folder) - OPTIONAL (default: public)
  layouts: # Path to directory containing all layout template files that can be used by view template files (absolute or relative to the root folder) - OPTIONAL (default: layouts)
x-walder-datasources:  # Default list of data sources
  - ...  # E.g. link to SPARQL endpoint or a GraphQL-LD query
paths:  # List of path entries
  path-entry-1:
    ...
  path-entry-2:
    ...
x-walder-errors: # Default error page views - status codes with files containing the HTML view template (absolute path or relative to the views directory)
  404: ...
  500: ...
  ...

Resources

The x-walder-resources key of the config file contains paths to directories used by Walder. This key and its values are optional. If a user does not give paths, Walder uses the following default values relative to the directory of the config file.

root: .
views: views
pipe-modules: pipe-modules
public: public
layouts: layouts

To prevent Walder from making the wrong files public, Walder handles a missing public path as follows: it looks for a directory named public in the current working directory, creates that directory if it does not exist, and uses it.
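
The defaults and the "relative to" rules of this section can be sketched in JavaScript as follows. This is a minimal illustration and not Walder's own code: the function name resolveResources and its arguments are hypothetical, and the special handling of a missing public path is left out.

```javascript
const path = require('path');

// Default values listed above.
const DEFAULTS = {
  root: '.',
  views: 'views',
  'pipe-modules': 'pipe-modules',
  public: 'public',
  layouts: 'layouts',
};

// Hypothetical helper: turn the (possibly partial or missing)
// x-walder-resources object into absolute directory paths.
function resolveResources(configFile, resources = {}) {
  const pick = (key) => (resources && resources[key]) || DEFAULTS[key];
  // root is relative to the directory that contains the config file.
  const root = path.resolve(path.dirname(configFile), pick('root'));
  const resolved = { root };
  for (const key of ['views', 'pipe-modules', 'public', 'layouts']) {
    // path.resolve keeps absolute paths and resolves relative ones against root.
    resolved[key] = path.resolve(root, pick(key));
  }
  return resolved;
}
```

For example, resolveResources('/srv/site/config.yaml', { views: 'templates' }) would place the views directory at /srv/site/templates and keep the other directories at their defaults under /srv/site.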

Path entry

A path entry defines a route and has the following structure:

path:  # The path linked to this query
  request:  # The HTTP request type (GET, POST, etc.)
    summary: ...  # Short description
    parameters:  # Path variables/Query parameters
      - in: ...  # 'path' or 'query'
        name: ...  # Name of the parameter
        schema:
          type: ... # Type of the parameter
        description: ...  # Description of the parameter
    x-walder-query:
      graphql-query: ...  # One or more GraphQL queries
      json-ld-context: ...  # The JSON-LD context corresponding to the GraphQL query
      sparql-query: ... # One or more SPARQL queries
      json-ld-frame: ... # A JSON-LD frame that should be applied to the result of a SPARQL query 
      options: # Global options that will be applied to all the graphql-queries of this path (OPTIONAL)
      datasources:  # Query specific datasources (OPTIONAL)
        additional: ...  # Boolean stating that the following datasources are meant to be used on top of the default ones
        sources:  # List of query specific datasources
          - ...  # E.g. link to SPARQL endpoint
    x-walder-postprocessing:  # The (list of) pipe modules used for postprocessing
      module-id:  # Identifier of the pipe module
        source: ...  # Path leading to source code of the pipe module (absolute path or relative to the pipe-modules directory)
        parameters: # the parameters for the pipe module (OPTIONAL)
          - _data # (DEFAULT) this gives all the data, but all paths in the data object are supported (e.g. _data.0.employee)
          - ... # Additional parameters if your function supports those (OPTIONAL)
    responses:  # Status codes with files containing the HTML view template (absolute path or relative to the views directory)
      200: ...  # (REQUIRED)
      500: ...  # (OPTIONAL)
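
To make the x-walder-postprocessing entry more concrete, here is a sketch of what a local pipe module file might look like. It rests on assumptions that the text above does not confirm: that a pipe module is a CommonJS file exporting a function named after the module-id, that Walder passes the values listed under parameters (by default the full query result, _data) as arguments, and that the return value replaces the data. The name sortByTitle and the title property are hypothetical.

```javascript
// pipe-modules/sortByTitle.js (hypothetical file name)

// Sort the query results by their `title` property and optionally keep
// only the first `limit` entries. The input array is not modified.
function sortByTitle(data, limit) {
  const items = Array.isArray(data) ? [...data] : [];
  items.sort((a, b) => String(a.title).localeCompare(String(b.title)));
  return typeof limit === 'number' ? items.slice(0, limit) : items;
}

// Assumption: Walder looks up the exported function by the module-id.
module.exports = { sortByTitle };
```

Under the same assumptions, the matching config entry would use sortByTitle as module-id, sortByTitle.js as source, and _data as the first parameter.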

Example

The following command starts a server on port 3000 (default) using an example config file.

$ walder -c example/config.yaml

This will start a server on localhost:3000 with the following routes:

Options for GraphQL-LD queries

In the path entry above, options is an optional, global setting that Walder applies to every query of that path. Two options are available, sort and remove-duplicates, with the following syntax:

options:
  sort: # Enable sorting on the data (OPTIONAL)
    object: # JSONPath to the object you want to sort for
    selectors: # The values inside the object over which you want to sort
      - ... # The default option when you want ascending order, just give the value (JSONPath notation supported for further nesting)
      - value: ...  # When you want descending order, specify the value/order (JSONPath notation supported for further nesting)
        order: desc
  remove-duplicates: # Enable the removal of duplicates of the data (OPTIONAL)
    object: ... # The JSONPath to the object that you want to compare
    value: ... # The value that has to be compared to determine whether it's duplicate (JSONPath notation is also supported for further nesting)
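
The effect of these two options can be illustrated with plain JavaScript. The sketch below is not Walder's implementation and does not use JSONPath; it only shows, on a hypothetical list of movies, what sorting with ascending and descending selectors and removing duplicates on one value amount to.

```javascript
// Sort a copy of `items` by a list of selectors. A selector is either a
// property name (ascending) or { value, order: 'desc' } (descending),
// mirroring the shape of the `selectors` list above.
function sortBySelectors(items, selectors) {
  const normalized = selectors.map((s) =>
    typeof s === 'string' ? { value: s, order: 'asc' } : s
  );
  return [...items].sort((a, b) => {
    for (const { value, order } of normalized) {
      if (a[value] === b[value]) continue; // tie: try the next selector
      const cmp = a[value] < b[value] ? -1 : 1;
      return order === 'desc' ? -cmp : cmp;
    }
    return 0;
  });
}

// Keep only the first item for each distinct value of `key`.
function removeDuplicates(items, key) {
  const seen = new Set();
  return items.filter((item) => {
    if (seen.has(item[key])) return false;
    seen.add(item[key]);
    return true;
  });
}

// Hypothetical data: sort by title ascending, then by year descending,
// and afterwards keep one entry per title.
const movies = [
  { title: 'B', year: 2001 },
  { title: 'A', year: 2001 },
  { title: 'A', year: 1999 },
];
const sorted = sortBySelectors(movies, ['title', { value: 'year', order: 'desc' }]);
const unique = removeDuplicates(sorted, 'title');
```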

If you do not want options to be global for the whole path, you can define options per query.

path:  # The path linked to this query
  request:  # The HTTP request type (GET, POST, etc.)
    summary: ...  # Short description
    parameters:  # Path variables/Query parameters
      - in: ...  # 'path' or 'query'
        name: ...  # Name of the parameter
        schema:
          type: ... # Type of the parameter
        description: ...  # Description of the parameter
    x-walder-query:
      graphql-query: ...  # One or more GraphQL queries
        name:
          query: ... # The GraphQL query
          options: # options that will be applied only to this specific graphql-query (OPTIONAL)
...

The following command starts a server using this config file.

$ walder -c example/config-sorting-duplicates.yaml

This will start a server on localhost:3000 with the following routes:

Multiple config files

You can split a config file into multiple files using the $ref keyword. Walder follows the OpenAPI 3.0 specification, which explains how to use references.

The first reference must use a path relative to the directory of the config file. If the referenced file contains references itself, those can use paths relative to its own location, as shown below.

The actual config file referencing its paths

openapi: 3.0.2
info:
  title: 'Example site'
  version: 0.1.0
x-walder-resources:
  root: ./
  views: views
  pipe-modules: pipeModules
  public: public
x-walder-datasources:
  - http://fragments.dbpedia.org/2016-04/en
paths:
  /music/{musician}:
    $ref: './paths/music_musician.yaml'
  /movies/{actor}:
    $ref: './paths/movies_actor.yaml'
x-walder-errors:
  404:
    description: page not found error
    x-walder-input-text/html: error404.html
  500:
    description: internal server error
    x-walder-input-text/html: error500.html

Below you see ./example/paths/movies_actor.yaml, which contains a reference whose path is relative to its own location:

get:
  summary: Returns a list of all the movies the given actor stars in
  parameters:
    - in: path
      name: actor
      required: true
      schema:
        type: string
      description: The target actor
  x-walder-query:
    $ref: '../walderQueryInfo/movies_actor_info.yaml'
  responses:
    200:
      description: list of movies
      x-walder-input-text/html: movies.pug
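
The way the relative paths in the two files above chain together can be sketched with Node's path module. The helper resolveRef and the config file name example/config.yaml are hypothetical; the sketch only illustrates the path arithmetic, not how Walder resolves references internally.

```javascript
const path = require('path');

// Hypothetical helper: resolve a $ref value against the file that contains it.
function resolveRef(referencingFile, ref) {
  return path.resolve(path.dirname(referencingFile), ref);
}

// The main config file references a path file relative to its own directory.
const pathFile = resolveRef('example/config.yaml', './paths/movies_actor.yaml');

// That path file references the query info relative to *its* directory,
// so '../' steps out of example/paths and back into example.
const queryInfoFile = resolveRef(pathFile, '../walderQueryInfo/movies_actor_info.yaml');
```

Here queryInfoFile ends up at example/walderQueryInfo/movies_actor_info.yaml below the current working directory.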

Content negotiation

Using content negotiation, Walder makes the following output formats available:

RDF

Walder uses graphql-ld-comunica to execute the GraphQL queries and @comunica/query-sparql to execute SPARQL queries. The result of a GraphQL-LD query is JSON data. Walder first converts it into JSON-LD, which enables conversion to other RDF formats during content negotiation. The result of a SPARQL query is an array of quads. If a JSON-LD frame is specified, Walder converts the quads to JSON-LD. Due to the importance of content negotiation, only CONSTRUCT queries are supported.
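
From the client side, content negotiation means sending an Accept header with the desired media type. The sketch below uses only Node's built-in http module. The helper names are hypothetical, the media types are the standard ones for HTML, JSON-LD, and Turtle, and which RDF serializations a given Walder version actually serves is an assumption.

```javascript
const http = require('http');

// Standard media types for the formats mentioned above.
const MEDIA_TYPES = {
  html: 'text/html',
  jsonld: 'application/ld+json',
  turtle: 'text/turtle',
};

// Build the options for an HTTP GET that asks for one specific format.
function requestOptions(route, format, port = 3000) {
  return {
    host: 'localhost',
    port,
    path: route,
    headers: { Accept: MEDIA_TYPES[format] },
  };
}

// Fetch a route in the requested format and resolve with the response body.
function fetchAs(route, format) {
  return new Promise((resolve, reject) => {
    http
      .get(requestOptions(route, format), (res) => {
        let body = '';
        res.setEncoding('utf8');
        res.on('data', (chunk) => (body += chunk));
        res.on('end', () => resolve(body));
      })
      .on('error', reject);
  });
}
```

With a Walder server running on localhost:3000, calling fetchAs with one of its routes and 'jsonld' would request the JSON-LD representation of that route, while 'html' would request the rendered view.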

HTML templates

Walder uses consolidate to automatically retrieve the corresponding engine for a given template. This means that the supported template engines are dependent on consolidate.

You can use different template engines for different routes, e.g., pug renders one route's HTML, while handlebars renders another route's HTML. Walder does this all by looking at the file extension of the given template.
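
Selecting an engine by file extension can be sketched as follows. The function engineFor is hypothetical and only illustrates the idea; how Walder itself maps extensions to engines, and how it treats plain .html files, is not shown here.

```javascript
const path = require('path');

// Derive the template engine name from the template's file extension,
// e.g. 'movies.pug' -> 'pug', 'songs.handlebars' -> 'handlebars'.
function engineFor(templateFile) {
  const ext = path.extname(templateFile).slice(1).toLowerCase();
  if (!ext) {
    throw new Error(`Cannot determine template engine for "${templateFile}"`);
  }
  return ext;
}

// With consolidate, the engine could then be looked up by that name,
// for example: consolidate[engineFor(file)](file, data)
```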

Templates can be used in views as well as in layouts. To distinguish between them, we call them view templates and layout templates.

Accessing query results in view templates

The results of the queries, specified in the configuration file for a route, are available for rendering in view templates as data.

In the case of a GraphQL-LD query, each object will be an array, unless the query was singularized. songs.handlebars is an example of the consumption of the result of the single query in the route /music/{musician} in this configuration file. songs_movies.handlebars is an example of the consumption of the results of the two queries in the route /artist/{artist} in this configuration file.

In the case of a SPARQL query, each object is an array of quads if no JSON-LD frame is specified. Otherwise, it is a JSON-LD object.

Using layouts in view templates

Using layouts is a great way to avoid repetition in route-specific view templates. Reusable HTML structures such as headers, footers, navigation bars, and other content meant to appear in multiple routes are preferably specified in layout files.

A layout template file can be specified in a view template file, by means of front-matter metadata field layout. It should contain a filename, available at the layouts location defined in the configuration file. It may contain a relative path in front of the filename.

Example view template file specifying a layout:

---
layout: my-layout.pug
---
// view template continues here

Walder puts the inner HTML contents generated from the view template file into the data forwarded to the layout template file as an object named content.

The layout template file is yet another template. It usually expands these inner HTML contents at the position of its choice.

A simple pug example (mind the !{content}):

doctype html
html(lang="en")
    head
        title I'm based on a layout
    body !{content}

Accessing front-matter metadata in view templates and layout templates

In addition to query results, Walder adds front-matter metadata, specified in view templates, as additional attributes to the data.

Each additional attribute's name is equal to the metadata field name provided. The following metadata fields are reserved: layout, content, data, and the names assigned to queries in routes having multiple queries (see above).

These attributes are available to the view template and to the layout template it refers to, if any.

Example view template file specifying a front-matter metadata field and reading that field (mind the #{a1}):

---
a1: Value for FrontMatter attribute a1!
---

doctype html
html(lang="en")
    body
        main a1: #{a1}

Example view template file, specifying a layout template and another front-matter metadata field:

---
layout: layout-fm.pug
a2: Value for FrontMatter attribute a2!
---

main Lorem ipsum

Example corresponding layout template (layout-fm.pug) reading that field (mind the #{a2}):

doctype html
html(lang="en")
    head
        if a2
            title #{a2}
    body !{content}
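
The flow described in this section, from a view template with front-matter to the data that the layout template receives, can be sketched in plain JavaScript. This is a simplified illustration, not Walder's code: the helper names are hypothetical, the hand-written parser only understands flat key: value lines, and the rendered view is passed in as a ready-made string.

```javascript
// Split a template source into front-matter attributes and body.
// Simplification: only flat `key: value` lines are understood.
function splitFrontMatter(source) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) return { attributes: {}, body: source };
  const attributes = {};
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(':');
    if (idx > 0) {
      attributes[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
    }
  }
  return { attributes, body: match[2] };
}

// Build the data object handed to a layout template: front-matter
// attributes, query results under `data`, rendered view under `content`.
// Spreading the attributes first keeps the reserved names intact.
function layoutData(attributes, queryResults, renderedView) {
  return { ...attributes, data: queryResults, content: renderedView };
}

const view = [
  '---',
  'layout: layout-fm.pug',
  'a2: Value for FrontMatter attribute a2!',
  '---',
  '',
  'main Lorem ipsum',
].join('\n');

const { attributes, body } = splitFrontMatter(view);
const forLayout = layoutData(attributes, [], '<main>Lorem ipsum</main>');
```

In this sketch, forLayout.a2 holds the front-matter value that the layout template reads as #{a2}, and forLayout.content holds the HTML that it expands with !{content}.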

Input validation

While parsing the config file, Walder also validates the correctness and completeness of the input. When Walder has parsed the whole config file and found errors, Walder returns all errors and exits.

At the moment, Walder validates the following:

Error handling

Walder binds error pages to a certain HTTP status code. You can define default error pages, but also path specific error pages by adding them to the responses key in the corresponding path entry.

Errors

Global

Pipe modules

GraphQL-LD

Example

When you run Walder using the following command:

$ walder -c example/config-errors.yaml

the following paths lead to errors:

The following config file excerpt will use the path specific moviesServerError.handlebars view template on errors leading to status code 500 when navigating to /movies.

When the required query parameter actor is not passed, Walder returns the status code 404. Walder will use the default error404.html file since the config file has no path-specific HTML view template for the corresponding status.

...
paths:
  /movies:
    get:
      summary: Returns a list of all the movies the given actor stars in
      parameters:
        - in: query
          name: actor
          schema:
            type: string
            minimum: 0
          description: The actor from whom the movies are requested
          required: true
      x-walder-query:
        graphql-query: >
          {
            id @single
            ... on Film {
              starring(label: $actor) @single
            }
          }
        json-ld-context: >
          {
            "@context": {
              "Film": "http://dbpedia.org/ontology/Film",
              "label": { "@id": "http://www.w3.org/2000/01/rdf-schema#label", "@language": "en" },
              "starring": "http://dbpedia.org/ontology/starring"
            }
          }
      responses:
        200:
          description: list of movies
          x-walder-input-text/html: movies.pug
        500:
          description: internal movie server error
          x-walder-input-text/html: moviesServerError.handlebars
x-walder-errors:
  404:
    description: page not found error
    x-walder-input-text/html: error404.html
  500:
    description: internal server error
    x-walder-input-text/html: error500.html
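
The precedence between path-specific and default error pages in the excerpt above can be sketched as a small lookup. The function errorTemplateFor is hypothetical and only mirrors the rule described in this section; the two objects repeat the relevant parts of the config excerpt.

```javascript
// Pick the HTML view template for a status code: a path-specific entry in
// `responses` wins, otherwise the default from x-walder-errors is used.
function errorTemplateFor(status, pathResponses = {}, defaultErrors = {}) {
  const key = 'x-walder-input-text/html';
  const entry = pathResponses[status] || defaultErrors[status];
  return entry ? entry[key] : undefined;
}

// `responses` of the /movies path entry.
const responses = {
  200: { 'x-walder-input-text/html': 'movies.pug' },
  500: { 'x-walder-input-text/html': 'moviesServerError.handlebars' },
};

// Defaults from x-walder-errors.
const defaults = {
  404: { 'x-walder-input-text/html': 'error404.html' },
  500: { 'x-walder-input-text/html': 'error500.html' },
};

const on500 = errorTemplateFor(500, responses, defaults);
const on404 = errorTemplateFor(404, responses, defaults);
```

Here on500 is the path-specific moviesServerError.handlebars, while on404 falls back to the default error404.html.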

Developing your website

While developing your website, you probably want it to reload whenever you change config.yaml. You can do this with npm-watch. The package.json snippet below shows how to set this up.

{
  "watch": {
    "run": "config.yaml"
  },
  "scripts": {
    "run": "walder -c config.yaml --no-cache",
    "watch": "npm-watch"
  },
  "dependencies": {
    "walder": "^2.0.1"
  },
  "devDependencies": {
    "npm-watch": "^0.7.0"
  }
}

Run npm run watch and Walder reloads on every change to config.yaml.

Dependencies

Library              License
@comunica            MIT
accepts              MIT
axios                MIT
chai                 MIT
commander            MIT
consolidate          MIT
debug                MIT
express              MIT
front-matter         MIT
fs-extra             MIT
graphql-ld           MIT
graphql-ld-comunica  MIT
handlebars           MIT
is-html              MIT
jade-to-handlebars   MIT
json-refs            MIT
jsonld               BSD-3-Clause
jsonpath             MIT
markdown-it          MIT
mocha                MIT
morgan               MIT
n3                   MIT
object-path          MIT
pug                  MIT
supertest            MIT
tmp                  MIT
winston              MIT
yaml                 ISC

Tests

Built with Walder

Did you build something with Walder and want to add it to the list? Please create a pull request!

License

This code is copyrighted ©2019–2020 by Ghent University – imec and released under the MIT license.