Mapping plain JSON to JSON-LD with LinkML

bartkl commented 2 days ago

Hi all,

I wanted to share something I've played around with that could be of interest to this project.

The vast majority of REST APIs yield JSON responses. It would be great if all of that JSON could be semantically enriched to become JSON-LD instance data. However, convincing all those project managers, architects and devs to upgrade their APIs to leverage Semantic Web technology is not feasible. Even if you get a green light, it is notoriously difficult to have devs interested and invested enough to learn how to do this well.

So I asked myself: given this pessimistic scenario, is there any way I could create JSON-LD from the existing JSON while bothering as few people as possible? Turns out: yes, there is!

Now the obvious first candidate is to create a context file. Although this maps each local key name to a URI nicely, this method isn't able to add the highly valuable @type information to the instance data.
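To make the limitation concrete, here is a minimal sketch in Python (standard library only). The sample payload and the key-to-IRI mapping are illustrative assumptions, not taken from a real API or from the project's context file. It attaches a context to plain JSON and shows that the node still has no `@type`.

```python
import json

# Hypothetical plain-JSON API response; keys and values are made up.
plain = {"mRID": "line-1", "length": 12.5}

# Hypothetical context mapping each local key to a CIM property IRI.
context = {
    "cim": "https://cim.ucaiug.io/ns#",
    "mRID": "cim:IdentifiedObject.mRID",
    "length": "cim:Conductor.length",
}


def attach_context(doc, ctx):
    """Return a copy of doc with an inline @context; the data itself is untouched."""
    return {"@context": ctx, **doc}


enriched = attach_context(plain, context)
print(json.dumps(enriched, indent=2))

# The keys now resolve to IRIs, but nothing in the context can state
# which class this node is an instance of.
print("@type" in enriched)
```

The context only tells a processor how to read the keys that are already there; a statement such as "this node is a cim:ACLineSegment" has to come from somewhere else.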

Turns out LinkML offers some nice capabilities here.

First off, we can generate JSON-LD context files from a LinkML schema. But we can do better. Using linkml-convert we can provide instance data in JSON, and have LinkML turn it into JSON-LD including @type additions. We can in fact choose to serialize to other LD formats such as Turtle as well.
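The sketch below illustrates the kind of schema-driven enrichment described here: walking the instance data with knowledge of the classes and adding `@type` to each node. It does not call LinkML. The `SCHEMA` dictionary is a hand-written stand-in (not LinkML's schema format), and the class and slot names are illustrative assumptions.

```python
# Hand-written stand-in for a schema: class name -> IRI, plus the class of
# each slot whose value is a nested object. NOT LinkML's schema format.
SCHEMA = {
    "ACLineSegment": {
        "iri": "cim:ACLineSegment",
        "slots": {"BaseVoltage": "BaseVoltage"},
    },
    "BaseVoltage": {"iri": "cim:BaseVoltage", "slots": {}},
}


def add_types(node, class_name, schema=SCHEMA):
    """Return a copy of node with @type set, recursing into class-valued slots."""
    cls = schema[class_name]
    typed = {"@type": cls["iri"]}
    for key, value in node.items():
        range_class = cls["slots"].get(key)
        if range_class and isinstance(value, dict):
            typed[key] = add_types(value, range_class, schema)
        elif range_class and isinstance(value, list):
            typed[key] = [
                add_types(v, range_class, schema) if isinstance(v, dict) else v
                for v in value
            ]
        else:
            # Scalar slots (or slots unknown to the schema) are copied as-is.
            typed[key] = value
    return typed


# Hypothetical plain-JSON instance; the caller names the class of the root.
plain = {"mRID": "line-1", "BaseVoltage": {"nominalVoltage": 110.0}}
typed = add_types(plain, "ACLineSegment")
```

The class of the root object has to be supplied by the caller, because plain JSON gives no hint about it; the classes of nested objects then follow from the schema.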

Anyway, I've given this a go with a small test data set and got it to work. It was a minimal and simple test set, and already there were some rough edges (as is often the case with LinkML tooling), but it looks very promising. At the very least, this method can work :).

Curious to hear your thoughts as to whether this can play a useful role in the context of this project.

Kind regards, Bart

admin-cimug commented 2 days ago

@bartkl : this has been on my radar, with plans for inclusion in the IEC62361-104 spec, which has completed review by P-members and is pending release as a CDV once this topic has been drafted in a pending clause. I'd love to talk offline to catch you up, compare our efforts and work out how to align. I'm traveling but back on Thursday and would love to set up a call. Drop me an email to jump-start that 😄.

(P.S. The thought is that, after we touch base, we bring this to the group for discussion on one of our Friday semantics calls. Penny for your thoughts.)

VladimirAlexiev commented 1 day ago

Hi @bartkl ! I agree in general with your comments, but the situation with CIM is different...

- AFAIK, there are no large volumes of existing CIM JSON data, so data creators can/should comply with the CIM JSON-LD spec.
- I've made a comprehensive context, see https://github.com/Sveino/Inst4CIM-KG/tree/develop/rdf-improved#json-ld-context. It defines prop types.
- CIM has strict domain and range declarations. So if you have some CIM data without types, you can use rdfs domain/range reasoning to add types.
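The domain/range reasoning in the last point can be sketched in a few lines of Python. This is a toy over tuples, not a real reasoner; the instance identifiers (`ex:t1`, `ex:line1`) are hypothetical, and the two schema triples are written by hand for illustration.

```python
RDF_TYPE = "rdf:type"

# Hand-written schema triples for one object property (illustrative).
SCHEMA = [
    ("cim:Terminal.ConductingEquipment", "rdfs:domain", "cim:Terminal"),
    ("cim:Terminal.ConductingEquipment", "rdfs:range", "cim:ConductingEquipment"),
]


def infer_types(data, schema):
    """Apply the RDFS entailment rules rdfs2 (domain) and rdfs3 (range)."""
    domains = {p: c for p, kind, c in schema if kind == "rdfs:domain"}
    ranges = {p: c for p, kind, c in schema if kind == "rdfs:range"}
    inferred = set()
    for s, p, o in data:
        if p in domains:
            inferred.add((s, RDF_TYPE, domains[p]))  # rdfs2: subject gets the domain
        if p in ranges:
            # rdfs3: object gets the range. Only meaningful for object
            # properties; this toy schema contains no datatype properties.
            inferred.add((o, RDF_TYPE, ranges[p]))
    return inferred


# Untyped instance data: a terminal pointing at some piece of equipment.
data = [("ex:t1", "cim:Terminal.ConductingEquipment", "ex:line1")]
types = infer_types(data, SCHEMA)
```

Note that the inferred type is only as specific as the declared domain or range: here `ex:line1` is typed as `cim:ConductingEquipment`, not as a concrete subclass.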

I wondered whether each class has some "characteristic" props that are unique to it. That is indeed the case: each of the 927 CIM classes has some unique incoming and outgoing props (this query finds nothing):

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX cim: <https://cim.ucaiug.io/ns#>
select * {
  ?class a owl:Class
  filter (
        not exists {?p1 rdfs:domain ?class} ||
        not exists {?p2 rdfs:range ?class} )
}

The reason is that CIM props are overspecified. E.g., out of all the props of AssessedElementWithRemedialAction, at least these are inherited but overspecified to that class only: nc:AssessedElementWithRemedialAction.mRID, nc:AssessedElementWithRemedialAction.enabled, nc:AssessedElementWithRemedialAction.normalEnabled:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX cim: <https://cim.ucaiug.io/ns#>
PREFIX nc: <https://cim4.eu/ns/nc#>
select * {
  {?outgoing rdfs:domain nc:AssessedElementWithRemedialAction} union
  {?incoming rdfs:range nc:AssessedElementWithRemedialAction}
}

But still: afaik, all existing CIM instances have explicit rdf:type.

Sveino commented 1 day ago

@VladimirAlexiev We have a range of CIM standards; the full 61968 series, with the exception of 61968-13, is message/document based rather than graph based. It makes sense that they should support plain JSON, but it should be linked to a JSON-LD context. I would, however, prefer that we solve this from the semantic side. This is something we have discussed with regard to the standard that Todd refers to.

VladimirAlexiev commented 1 day ago

> support plain JSON, but should be linked to JSON-LD context.

JSON-LD supports this: https://www.w3.org/TR/json-ld/#interpreting-json-as-json-ld . It uses an HTTP Link header to attach the context during transfer, so you don't need it in the file. Adding labels spec, jsonld.
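As a rough sketch of what a client does with such a response (standard library only): find the link whose relation is `http://www.w3.org/ns/json-ld#context` and treat the plain JSON body as if it carried that context. The context URL and payload are hypothetical, the header parsing is deliberately naive (it splits on commas), and a real JSON-LD processor does this internally.

```python
import re

JSONLD_CONTEXT_REL = "http://www.w3.org/ns/json-ld#context"


def context_from_link_header(link_header):
    """Return the target of the first link with the JSON-LD context relation, else None."""
    rel_pattern = r'rel="?%s"?' % re.escape(JSONLD_CONTEXT_REL)
    for part in link_header.split(","):  # naive: breaks on URLs containing commas
        match = re.match(r"\s*<([^>]*)>(.*)", part)
        if match and re.search(rel_pattern, match.group(2)):
            return match.group(1)
    return None


def interpret_as_jsonld(body, content_type, link_header):
    """Attach the linked context to a plain JSON body.

    A linked context is ignored when the response is already
    application/ld+json, so such bodies are returned unchanged.
    """
    if content_type.split(";")[0].strip() == "application/ld+json":
        return body
    url = context_from_link_header(link_header or "")
    return {"@context": url, **body} if url else body


# Hypothetical response headers and body.
header = (
    '<https://example.org/context.jsonld>; '
    'rel="http://www.w3.org/ns/json-ld#context"; type="application/ld+json"'
)
doc = interpret_as_jsonld({"mRID": "line-1"}, "application/json", header)
```

The payload on the wire stays plain JSON, so existing consumers are unaffected; only JSON-LD-aware clients follow the link.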

bartkl commented 16 hours ago

@VladimirAlexiev sharp points! Especially about the possibility of inferring types using domain/range reasoning.

Yes, I'm aware context files can be external and be obtained through setting the appropriate HTTP header. It's beautiful!

To make sure it is clear what I was referring to with regard to the limitations of JSON-LD contexts and adding type information, let me quote the spec:

The ability to coerce a value using a term definition is distinct from setting one or more types on a node object, as the former does not result in new data being added to the graph, while the latter manages node types through adding additional relationships to the graph.

The old (1.0) spec was a bit more emphatic in its articulation:

Specifically, @type cannot be used in a context to define a node's type

That's to say: you can coerce a value to be of a certain data type or an IRI, but the context is fundamentally unable to add statements to the data, which includes adding "@type": "cim:ACLineSegment" within a node.

Anyway, perhaps you already understood my point. You certainly addressed how to deal with this nicely though, especially using reasoning capabilities.

Sveino / Inst4CIM-KG

Mapping plain JSON to JSON-LD with LinkML #136