zazuko / xrm

A friendly language for mappings to RDF
MIT License

Support for creating JSON-LD context #21

Open ktk opened 4 years ago

ktk commented 4 years ago

While RML also supports mapping from JSON by using JSON selectors, we mostly use a JSON-LD context for that. The big benefit is that a JSON-LD parser is all we need, and those are pretty common by now.

Example, given this input:

{
   "version":"1.0",
   "timestamp":1548925887242,
   "eventType":"INGESTION_LOAD",
   "source":"EE_DE",
   "sourceView":"Party",
   "record":{
      "dateOfBirth":"29APR1940:00:00:00",
      "naturalPersonId":"f0096f7e-a423-11e0-8142-530f6a1c02e3",
      "householdRole":"Vorstand",
      "firstName":"Ursula",
      "id":"f0095fd4-a423-11e0-8142-530f6a1c02e3",
      "maritalStatus":"unbekannt",
      "flagDeceased":"N",
      "personName":"Krumpl",
      "gender":"weiblich",
      "householdId":"f0096f4c-a423-11e0-8142-530sdfsdf02e3"
   }
}

With the following JSON-LD context we would already get useful RDF out of it:

{
   "@context":{
      "@vocab":"http://ontologies.example.org/core/",
      "@base":"http://data.example.org/id/party/",
      "dateOfBirth":"http://schema.org/birthDate",
      "personName":"hasLastName",
      "firstName":"hasFirstName",
      "gender":"definesGenderOf",
      "naturalPersonId":"@id"
   }
... data 
}

Note that the only thing missing in this example is the class. Classes seem to sit at a strange position in JSON-LD: the class is not defined in the context but would be part of the ... data part:

  "@type": "NaturalPerson"

ktk commented 4 years ago

(text taken from some documentation for a customer, ignore the RDF intro ;)

Minimal JSON-LD

A minimal JSON-LD context file would have to define:

"@vocab": "http://ontologies.example.org/core/",
"@base":"http://data.example.org/id/SOMEID/"
"somevaluesId": "@id"

The first line maps all keys to the namespace defined in @vocab, while the second line defines the @base for the URIs of instances of this data. The @base is combined with the value of "somevaluesId" in the data; adjust that name to whatever key holds the identifier. Typically this should be a unique identifier that cannot be found more than once in all of the data. UUIDs or auto-increment columns in SQL are a good base for something like this. Note that it also makes sense to define a separate @base for each "type" of data (SOMEID in this example), to reduce the chance of clashes with other IDs.
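
As a rough illustration of what these three lines do, the following Python sketch resolves a flat record by hand. It is not a JSON-LD processor: it only handles @vocab, @base and a key aliased to "@id", and the data values ("42", "Ursula") and the function name resolve are made up for the example.

```python
from urllib.parse import urljoin

context = {
    "@vocab": "http://ontologies.example.org/core/",
    "@base": "http://data.example.org/id/SOMEID/",
    "somevaluesId": "@id",
}

# Hypothetical input record.
data = {"somevaluesId": "42", "firstName": "Ursula"}


def resolve(data, context):
    """Return (subject URI, {property URI: value}) for a flat record.

    Simplified: only @vocab, @base and an "@id" alias are handled.
    """
    subject = None
    properties = {}
    for key, value in data.items():
        if context.get(key) == "@id":
            # The identifier value is resolved relative to @base.
            subject = urljoin(context["@base"], value)
        else:
            # Every other key is prefixed with the @vocab namespace.
            properties[context["@vocab"] + key] = value
    return subject, properties


subject, properties = resolve(data, context)
print(subject)
print(properties)
```

Because the identifier is resolved against @base, two datasets that both contain an ID "42" only stay distinct if they use different @base values, which is the reason for one @base per "type" of data.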

Point to external vocabularies

Mapping an existing key to a completely different property can be done like this:

"dateOfBirth": "http://schema.org/birthDate"

Instead of prefixing this key with @vocab, the property URI is the one given as the value. This makes sense when you want to point to external, existing vocabularies, which is common and recommended in Linked Data to increase re-use of data and vocabularies.
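
The lookup order can be sketched in a few lines of Python. This is a simplification under stated assumptions: any mapping containing a colon is treated as an absolute URI (prefixes and compact IRIs are ignored), and property_iri is a hypothetical helper name.

```python
context = {
    "@vocab": "http://ontologies.example.org/core/",
    "dateOfBirth": "http://schema.org/birthDate",
    "personName": "hasLastName",
}


def property_iri(key, context):
    """Return the property URI a key maps to (simplified lookup)."""
    # Keys without an entry in the context fall back to the key itself.
    mapping = context.get(key, key)
    if ":" in mapping:
        # Assumption: anything with a colon is an absolute URI.
        return mapping
    # Relative names are appended to the @vocab namespace.
    return context["@vocab"] + mapping


for key in ("dateOfBirth", "personName", "householdRole"):
    print(key, "->", property_iri(key, context))
```

Only dateOfBirth leaves the @vocab namespace; personName is renamed within it and householdRole is used as is.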

Add datatypes to literals

By default, all values of a key get mapped to a so-called literal, often also called a "string literal". Literals can also have a datatype, which makes sense for things like dates, for example a birthDate:

"dateOfBirth": {
  "@id": "http://schema.org/birthDate",
  "@type": "http://www.w3.org/2001/XMLSchema#date"
},

In this example the key dateOfBirth would be mapped to the external property URI http://schema.org/birthDate and its value would be represented as an xsd:date; see the RDF 1.1 specification for more details. Note that it is your responsibility to make sure the value is in the correct format for the datatype! xsd:date, for example, expects dates as yyyy-mm-dd, with or without a time zone. The dateOfBirth value in the input above (29APR1940:00:00:00) is not in that format and would have to be converted first.
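
Since the context only attaches the datatype and does not convert anything, a value such as "29APR1940:00:00:00" has to be rewritten before it is a valid xsd:date. A minimal Python sketch of such a conversion follows; the input layout DDMONYYYY:HH:MM:SS is an assumption inferred from that single sample value, and to_xsd_date is a hypothetical helper.

```python
from datetime import date

# Assumed input layout, inferred from one sample value: DDMONYYYY:HH:MM:SS
MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}


def to_xsd_date(raw):
    """Turn a value like '29APR1940:00:00:00' into yyyy-mm-dd."""
    day = int(raw[0:2])
    month = MONTHS[raw[2:5].upper()]
    year = int(raw[5:9])
    # date() raises ValueError for impossible dates such as 31 February.
    return date(year, month, day).isoformat()


print(to_xsd_date("29APR1940:00:00:00"))  # 1940-04-29
```

The time part is dropped here because xsd:date carries no time of day; if the time matters, xsd:dateTime would be the datatype to look at instead.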

Point to other resources using URIs as objects

Linking resources in RDF is very easy: all we have to do is point to another URI as the object instead of a plain literal. This can be done easily in JSON-LD as well:

"householdId": {
  "@type": "@id"
}

In this example the value of the key householdId is handled as an "@id", which leads to a URI as object instead of a simple string literal. Note that the structure with {} is required here; a plain string value would rename the key instead.
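
The difference between the two kinds of object can be sketched in Python as follows. This is a simplified illustration, not a JSON-LD implementation: object_term is a hypothetical helper, the output imitates N-Triples notation, and the @base value is taken from the first example in this issue.

```python
import json
from urllib.parse import urljoin

context = {
    "@vocab": "http://ontologies.example.org/core/",
    "@base": "http://data.example.org/id/party/",
    "householdId": {"@type": "@id"},
}


def object_term(key, value, context):
    """Render the object of a triple in N-Triples style (simplified)."""
    definition = context.get(key)
    if isinstance(definition, dict) and definition.get("@type") == "@id":
        # Coerced to a URI: relative values are resolved against @base.
        return "<" + urljoin(context["@base"], value) + ">"
    # Everything else stays a plain string literal.
    return json.dumps(value)


print(object_term("householdId", "f0096f4c-a423-11e0-8142-530sdfsdf02e3", context))
print(object_term("householdRole", "Vorstand", context))
```

Note that the household URI ends up under the party @base in this sketch, because relative values are resolved against whatever @base is in effect.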