#+OPTIONS: ':nil *:t -:t ::t <:t H:5 \n:nil ^:{} arch:headline author:t broken-links:nil
#+OPTIONS: c:nil creator:nil d:(not "LOGBOOK") date:t e:t email:nil f:t inline:t num:nil
#+OPTIONS: p:nil pri:nil prop:nil stat:t tags:t tasks:t tex:t timestamp:nil title:t toc:5
#+OPTIONS: todo:t |:t
#+OPTIONS: html-link-use-abs-url:nil html-postamble:auto html-preamble:t html-scripts:t
#+OPTIONS: html-style:t html5-fancy:nil tex:nil
#+STARTUP: nonum
#+TITLE: RDF by Example: rdfpuml for True RDF Diagrams, rdf2rml for R2RML Generation
#+DATE: <2022-09-20>
#+AUTHOR: Vladimir Alexiev
#+EMAIL: vladimir.alexiev@ontotext.com
#+LANGUAGE: en
#+CREATOR: Emacs 25.3.1 (Org mode 9.1.13)
#+TODO: TODO INPROGRESS | DONE CANCELED
#+HTML_DOCTYPE: xhtml-strict
#+HTML_CONTAINER: div
#+DESCRIPTION:
#+KEYWORDS: RDF, visualization, PlantUML, R2RML, generation, model-driven, RDF by Example, rdfpuml, rdf2rml, rdf2sparql, rdf2tarql, rdf2ontorefine


RDF is a graph data model, so the best way to understand RDF data schemas (ontologies, application profiles, RDF shapes) is with a diagram. Many RDF visualization tools exist, but they either focus on large graphs (where the details are not easily visible), or the visualization results are not satisfactory, or manual tweaking of the diagrams is required.

If the example instances include embedded source field names, they can describe a mapping precisely. I've implemented a few more tools to generate transformations:

See http://twitter.com/hashtag/rdfpuml for news, diagrams and announcements.

** License and Citation
This work is covered by the [[https://www.perlfoundation.org/artistic-license-20.html][Artistic-2.0]] license.

If you use this software, please cite it as shown above.

** Documentation

** Related Work

The following works use or mention this software:

** Docker Image
If you prefer to work with Docker so you don't need to install software manually, you can use this [[https://docker-registry.ontotext.com/#browse/search=keyword%3Drdf2rml][rdf2rml image]] from the public Nexus (Docker Registry) of Ontotext. To run it, use:

: docker run -v <local-dir>:/files --rm docker-registry.ontotext.com/rdf2rml:latest

Where ~<local-dir>~ is the local directory holding your ~.ttl~ files. The image was built on 31 May 2023 and uses the following versions:

Note: [[https://github.com/VladimirAlexiev/rdf2rml/pull/7][pull request 7]] of 17 Sep 2019 by Jem Rayfield (~@jazzyray~) dockerizes the installation and makes extra changes related to input/output and configuration. However, it has not been merged yet.

** Near-term

*** Modularize and Package Better

*** Regression Tests

*** rdf2rml: disentangle inverse edge
In the case ~Y-P-X~ described above:

*** Release on CPAN

*** Add Unicode tests
Add ttl with non-ASCII chars: Accented, Cyrillic, French, etc.

*** Prefixes

**** Allow specifying the prefixes file
See https://github.com/VladimirAlexiev/rdf2rml/pull/7

**** Eliminate Curie.pm
[[./lib/RDF/Prefixes/Curie.pm]] remembers ~@base~ and uses that for URL shortening. Once [[https://github.com/kasei/perlrdf/issues/131][perlrdf#131]] is fixed, eliminate this dependency (local module).

**** Remember prefixes from input file
~rdfpuml~ shortens URLs using prefixes only from ~prefixes.ttl~, but should also use prefixes defined in the individual input file.

**** Support more RDF Formats
Now it only supports Turtle, because it concatenates ~prefixes.ttl~ to the main file. If it can collect all prefixes from RDF files, such concatenation won't be needed.
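As a rough illustration of collecting prefixes from the input file itself, here is a minimal Python sketch. It is not how ~rdfpuml~ (a Perl tool) works today; the names ~collect_prefixes~ and ~merge_prefixes~ are hypothetical, and a regex scan is assumed to be good enough for illustration (a real implementation would take the namespace map from the RDF parser, which also avoids false matches inside comments or string literals).

#+BEGIN_SRC python
import re

# Matches Turtle "@prefix p: <iri> ." and SPARQL-style "PREFIX p: <iri>".
# Assumption: one declaration per line; this is a sketch, not a Turtle parser.
PREFIX_RE = re.compile(
    r"^\s*(?:@prefix|PREFIX)\s+([A-Za-z][\w.-]*)?:\s*<([^>]*)>\s*\.?\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def collect_prefixes(turtle_text):
    """Return {prefix: namespace IRI} for the declarations found in the text."""
    return {m.group(1) or "": m.group(2) for m in PREFIX_RE.finditer(turtle_text)}


def merge_prefixes(*maps):
    """Merge prefix maps; later maps override earlier ones."""
    merged = {}
    for prefix_map in maps:
        merged.update(prefix_map)
    return merged
#+END_SRC

With this approach, the shared prefixes and the per-file prefixes could be merged as ~merge_prefixes(shared, local)~, so that declarations in the individual input file take precedence and no file concatenation is needed.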

*** Batch Processing
Issue [[https://github.com/VladimirAlexiev/rdf2rml/issues/1][#1]]: PlantUML is slow to start up, so we'd like to process a bunch of ~puml~ files at once. The best way is to have a smarter script or ~Makefile~ that uses the following http://plantuml.com/command-line features:

**** "Manual" Batching
Before I discovered the ~-checkmetadata~ option, I had the idea that ~rdfpuml~ could put several diagrams in one ~puml~ file:

#+BEGIN_EXAMPLE
@startuml file1.png
made from file1.ttl
@enduml
@startuml file2.png
made from file2.ttl
@enduml
#+END_EXAMPLE

However, this interferes with ~make~ processing that regenerates only ~png~ for changed ~ttl~ files, and makes things less modular overall.
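One way to keep one ~puml~ file per diagram and still pay the PlantUML start-up cost only once is a small driver script. The following is a minimal sketch, not part of this repository: it assumes a ~plantuml~ launcher is on the ~PATH~, the function names are hypothetical, and it selects files by comparing modification times (as ~make~ does) rather than relying on ~-checkmetadata~.

#+BEGIN_SRC python
import subprocess
from pathlib import Path


def stale_puml_files(directory):
    """Return .puml files whose .png is missing or older than the source."""
    stale = []
    for puml in sorted(Path(directory).glob("*.puml")):
        png = puml.with_suffix(".png")
        if not png.exists() or png.stat().st_mtime < puml.stat().st_mtime:
            stale.append(puml)
    return stale


def render_batch(directory, plantuml_cmd=("plantuml",)):
    """Render all stale diagrams with a single PlantUML invocation."""
    stale = stale_puml_files(directory)
    if stale:
        # One process start-up for the whole batch, instead of one per file.
        subprocess.run([*plantuml_cmd, "-tpng", *map(str, stale)], check=True)
    return stale
#+END_SRC

If PlantUML is only available as a jar, ~plantuml_cmd~ could be passed as ~("java", "-jar", "plantuml.jar")~ instead.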

** Mid-Term

*** Upgrade to use Attean
[[https://github.com/kasei/perlrdf][Trine (Perl RDF)]] is end of life. [[https://github.com/kasei/attean][Attean]] is the new generation.

*** Integrate in Emacs ~org-mode~
Write Turtle, see diagram (easy to do)

*** Node colors, icons, tooltips
See [[./ideas]]

*** More arrow types and styles

[[./ideas/arrows.png]] [[./ideas/arrows-2.png]]

*** Extra Layout Options
Local layout options are described in [[http://wiki.plantuml.net/site/class-diagram#help_on_layout][Help on Layout]]:

Global options include (eg see [[http://www.plantuml.com/plantuml/uml/bP8nQmCn38Lt_mfnoq7XGZgrGoYXMJeqIpfqTkwKdeXi7xRI4kYFBvSORCSGg8OGdlJfFPbR1z5UJePLsuuq8FJaUqPr-OzcaZCOD7lq8PUqYAVzIJ2eS2GxQQyDC5cKyuJWl8mkQuHH3-w7x1SSD0TKRMfjoMvOX_19WupmjCnxrWqOS8BdGlNQ7gEg55b1Vz0zmlOIyfs2e4LVDNQECHFVDFC7-c_giHfLgct18siXPmEqhL8R9hG2LNNTIodaUyj4QMRrs-N8TNTbqJmsLuleq2mNYuS6ydDKvXQfsY66kacJzdM5NnoUVnAVtzj16MVdd56pK3350IMmSLQyOyOXldQTB8AhsIsl2arl0RVtH_G-MK2HlC_DvwPsdXN-mQMw-NxYzBruXT6hauYiqGudmty0][this diagram]]):

#+begin_plantuml
skinparam Linetype ortho
skinparam NodeSep 80
skinparam RankSep 80
skinparam Padding 5
skinparam MinClassWidth 40
skinparam SameClassWidth true
#+end_plantuml

And there are a lot more undocumented features: https://forum.plantuml.net/7095

*** Custom Reification
Ability to describe custom reification situations using the Property Reification Vocabulary (PRV)

*** Use MindMap/WBS for Hierarchies
PlantUML now has [[http://plantuml.com/mindmap-diagram][MindMap]] and [[http://plantuml.com/wbs-diagram][WBS (or OBS)]] diagrams that use a simple bulleted syntax to draw hierarchies.

It would be nice to use this to draw hierarchies of individuals, in particular taxonomies.

Here are examples of the two styles:
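To make the bulleted syntax concrete, here is a minimal Python sketch of how a hierarchy (for example a taxonomy) could be turned into MindMap or WBS source. It is only an illustration: the function ~to_plantuml_tree~ and its parameters are hypothetical, and it assumes the hierarchy is a tree (no cycles, one parent per node).

#+BEGIN_SRC python
def to_plantuml_tree(children, root, kind="mindmap"):
    """Render a hierarchy as PlantUML MindMap or WBS source text.

    children: dict mapping each node label to a list of child labels.
    kind: "mindmap" or "wbs" (selects the @start/@end keywords).
    """
    lines = [f"@start{kind}"]

    def walk(node, depth):
        # Both diagram types use one asterisk per nesting level.
        lines.append(f"{'*' * depth} {node}")
        for child in children.get(node, []):
            walk(child, depth + 1)

    walk(root, 1)
    lines.append(f"@end{kind}")
    return "\n".join(lines)
#+END_SRC

The two styles differ only in the ~@startmindmap~ / ~@startwbs~ keywords, so the same traversal serves both; the ~children~ map could be built from, say, ~skos:broader~ or ~rdfs:subClassOf~ triples.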

** Long-Term

*** rdf2soml to Generate Semantic Object Models
A new tool ~rdf2soml~ to generate Ontotext Platform SOML from RDF examples.

What's missing? Most importantly: property cardinality and virtual inverses.

PlantUML can show arrow cardinalities, and this simple and natural [[http://www.plantuml.com/plantuml/uml/SoWkIImgAStDuSh8J4bLICuiIiv9XR1JSmjAAXLoKqioybEAaOKIIqgACfDAIrABkI8Kb0oi39KKT7DIqqfqxHIK3ArobHGY5QmK2eho2_HZyZBpoWA0B2w7rBmKe2q0][PlantUML code]]:

#+BEGIN_SRC plantuml
X "0:1" -left-> "1:m" Y : prop/\ninvProp
#+END_SRC

Is depicted as follows:

[[./ideas/cardinality-and-inverse.png]]

There are two options for expressing this in triples:

**** Cardinality With RDF*

#+BEGIN_SRC turtle
# model triples
:X :prop :Y.

# puml triples
<< :X :prop :Y >>
  puml:arrow puml:left;          # direction
  puml:min 1; puml:max puml:inf; # cardinality
  puml:inverseAlias [puml:min 0; puml:max 1; puml:name "invProp"]. # virtual inverse
#+END_SRC

**** Cardinality With Blank Node

#+BEGIN_SRC turtle
# model triples
:X :prop :Y.

# puml triples
:X puml:left :Y. # direction
:X :prop [
  # a puml:Cardinality; # may need this marker class to skip the node from the diagram
  puml:min 1; puml:max puml:inf; # cardinality
  puml:object :Y; # only needed if X has several relations "prop" and they need different annotations
  puml:inverseAlias [puml:min 0; puml:max 1; puml:name "invProp"] # virtual inverse
].
#+END_SRC
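Whichever encoding is chosen, the annotations have to end up as the PlantUML arrow shown earlier. The following Python sketch illustrates that last step under stated assumptions: it is not the ~rdfpuml~ implementation, the function and parameter names are hypothetical, and an unbounded maximum (~puml:inf~) is represented as ~None~ and printed as ~m~.

#+BEGIN_SRC python
def cardinality_label(min_count, max_count):
    """Format a cardinality as 'min:max', writing an unbounded max as 'm'."""
    upper = "m" if max_count is None else str(max_count)
    return f"{min_count}:{upper}"


def arrow_line(subject, prop, obj, direction, forward, inverse, inverse_name):
    """Build one PlantUML arrow with cardinalities and a virtual inverse.

    forward and inverse are (min, max) pairs; max=None means unbounded.
    """
    near_subject = cardinality_label(*inverse)  # how many subjects per object
    near_object = cardinality_label(*forward)   # how many objects per subject
    # PlantUML reads the two characters backslash + n as a line break in a label.
    label = f"{prop}/\\n{inverse_name}"
    return f'{subject} "{near_subject}" -{direction}-> "{near_object}" {obj} : {label}'
#+END_SRC

For the example above, ~arrow_line("X", "prop", "Y", "left", (1, None), (0, 1), "invProp")~ yields the same arrow text as the hand-written PlantUML code: the forward cardinality is placed next to the object and the inverse cardinality next to the subject.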

*** rdf2shape to Describe & Generate RDF Shapes

*** Visualize RDF Shapes (SHACL and ShEx)
Issue [[https://github.com/VladimirAlexiev/rdf2rml/issues/8][#8]]: discussion with Thomas Francart of Sparna

I developed this SHACL to PlantUML converter, in Java, based on TopQuadrant SHACL lib, and the result is at https://shacl-play.sparna.fr/play/draw and code at https://github.com/sparna-git/shacl-play/tree/master/shacl-diagram

I don't have a strong opinion on the example you provide, an alternative idea that comes to my mind is

#+begin_src turtle
:node1 :link [ rdf:value :node2; puml:min 1 ; puml:max 2 ; ]
#+end_src

But this changes the structure of the example graph itself, which might not be convenient

*** Generate transformations for other than relational sources
R2RML works great for RDBMS, but how about other sources? Extend rdf2rml to generate: