weso / shex-lite

Scala implementation of a compiler for a subset of the Shape Expressions Compact Syntax.

[SLI-0121] Generate java objects from shex-lite expressions #121

Open thewillyhuman opened 4 years ago

thewillyhuman commented 4 years ago

Introduction

Within the Hercules - ASIO project, the need has arisen to transform a model described by an ontology into a Java object model. To meet it, this proposal adds to the shex-lite system the ability to generate these Java objects through a code generation module specialized in Java.

Motivation

Within the Hercules - ASIO project, an ontology is used to give semantic value to the instances of research data from the different Spanish universities. A very simple way to validate that the instances comply with the semantic schema provided by the ontology is to use shape expressions.

However, modeling the ontology in an object-oriented programming language like Java requires a technology that does not currently exist.

For this reason, it is proposed to collaborate with the shex-lite system to create a compiler extension responsible for generating Java code from shex-lite expressions.

Proposed solution

ShEx-Lite is a compiler for a reduced version of Shape Expressions that, with about 20% of the functionality, covers almost 80% of the use cases for which shape expressions are used.

One of the compiler's features is the generation of intermediate representations of shape expressions, so that a schema defined by several shape expressions can be transformed into one or more documents that represent it in another language, such as HTML, Scala, or Python.

In any case, for this proposal the objective is to transform a shape expression like the following:

asio:Person {
  asio:name  xsd:string   ;  # Means that the person has a name.
  asio:knows @:Person * ; # Means that the person knows other people, without specifying how many.
}

into the following Java class:

package asio;

import java.util.List;  // Required by the knows field.

public class Person {
  private String name;  // The name of the person.
  private List<Person> knows;  // The people this person knows.

  public Person(String name, List<Person> knows) {
    this.name = name;
    this.knows = knows;
  }

  // Getters and Setters here...
}

In the same way, if the schema defines more than one shape expression in a single document, the system should generate the Java classes in separate files and handle the import relationships between them automatically.
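As a rough illustration of the transformation described above, the following is a minimal sketch in Java, not the shex-lite implementation. The names `ShapeToJavaSketch`, `Shape`, `Field` and `generate` are hypothetical. The sketch assumes the shape has already been parsed, that the prefix maps to the package name, and that datatypes such as `xsd:string` have already been mapped to Java types. It only shows how the `*` cardinality could become a `List` and how imports could be derived from the field types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ShapeToJavaSketch {

    /** Hypothetical model of one triple constraint, e.g. "asio:knows @:Person *". */
    public static final class Field {
        final String name;      // local name of the predicate, e.g. "knows"
        final String typePkg;   // package of the target type, or null for java.lang types
        final String typeName;  // simple Java type name, e.g. "Person" or "String"
        final boolean repeated; // true when the cardinality allows many values ('*')

        public Field(String name, String typePkg, String typeName, boolean repeated) {
            this.name = name;
            this.typePkg = typePkg;
            this.typeName = typeName;
            this.repeated = repeated;
        }
    }

    /** Hypothetical model of one shape: the prefix becomes the package, the local name the class. */
    public static final class Shape {
        final String pkg;
        final String name;
        final List<Field> fields;

        public Shape(String pkg, String name, List<Field> fields) {
            this.pkg = pkg;
            this.name = name;
            this.fields = fields;
        }
    }

    /** Returns the Java source of one class for the given shape. */
    public static String generate(Shape shape) {
        Set<String> imports = new TreeSet<>();  // sorted and free of duplicates
        StringBuilder members = new StringBuilder();
        StringBuilder assigns = new StringBuilder();
        List<String> params = new ArrayList<>();

        for (Field f : shape.fields) {
            // Types that live in another generated package need an import.
            if (f.typePkg != null && !f.typePkg.equals(shape.pkg)) {
                imports.add(f.typePkg + "." + f.typeName);
            }
            String type = f.typeName;
            if (f.repeated) {
                imports.add("java.util.List");
                type = "List<" + type + ">";
            }
            members.append("  private ").append(type).append(' ').append(f.name).append(";\n");
            params.add(type + " " + f.name);
            assigns.append("    this.").append(f.name).append(" = ").append(f.name).append(";\n");
        }

        StringBuilder out = new StringBuilder();
        out.append("package ").append(shape.pkg).append(";\n\n");
        for (String i : imports) {
            out.append("import ").append(i).append(";\n");
        }
        if (!imports.isEmpty()) {
            out.append('\n');
        }
        out.append("public class ").append(shape.name).append(" {\n");
        out.append(members).append('\n');
        out.append("  public ").append(shape.name).append('(')
           .append(String.join(", ", params)).append(") {\n");
        out.append(assigns).append("  }\n}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        Shape person = new Shape("asio", "Person", List.of(
            new Field("name", null, "String", false),
            new Field("knows", "asio", "Person", true)));
        System.out.print(generate(person));
    }
}
```

Calling `generate` once per shape and writing each result to its own file would give the one-class-per-file layout the proposal asks for. Getters and setters are left out of the sketch for brevity.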

In addition, the project specification requires this functionality to be developed under the GNU GPLv3 license. This fits ShEx-Lite well, since it is natively compatible with both the MIT and GNU GPLv3 licenses.

Alternatives considered

One alternative considered was to implement the Java code generation system in the original ShEx system. This option was quickly discarded, since the power of the original shape expression system comes with a complexity that would greatly hinder the development of the code generation module.

Another alternative considered was to perform the transformation from ontology to Java classes by hand. This option was also discarded: the ontology is a living entity that is constantly being modified, which affects how the objects are generated, and the number of classes it contains makes this alternative almost unfeasible.

About Hércules - ASIO Project

The HÉRCULES - Semantics of University Research Data project has a budget of 5,462,600 euros, of which 80% is co-financed by the ERDF. The European Regional Development Fund (ERDF), through the then Ministry of Economy, Industry and Competitiveness (currently the Ministry of Science and Innovation) as the Intermediate Body of the ERDF Smart Growth Operational Programme, POCint (now the Multi-regional Operational Programme of Spain, POPE), therefore contributes 4,370,080 euros.

thewillyhuman commented 4 years ago

This proposal will be implemented in [SLI-0104] ShEx-Lite 3.0 beta planning (goal: 2020-08).

Congrats 🎉!!