json-ld / json-ld.org

JSON for Linked Data's documentation and playground site
https://json-ld.org/

Modeling highly redundant subjects in JSON-LD #762

Closed: fennibay closed this issue 2 years ago

fennibay commented 2 years ago

I have multiple subjects in the data that are not identical but highly similar. I want to model the redundant part only once to reduce the size of the JSON-LD file.

Example

In Turtle:

@prefix : <urn:myurl/> .
@prefix ex: <http://example.org/> .

:subject1 ex:comProp1 ex:comValue1;
          ex:comProp2 ex:comValue2;
          ex:comProp3 ex:comValue3;
          # ...
          ex:comProp100 ex:comValue100;
          ex:difProp ex:difValue1.

:subject2 ex:comProp1 ex:comValue1;
          ex:comProp2 ex:comValue2;
          ex:comProp3 ex:comValue3;
          # ...
          ex:comProp100 ex:comValue100;
          ex:difProp ex:difValue2.

Example also available in JSON-LD Playground.

Solutions from other fields

One solution that works in RDF is to use OWL constraints, but it relies on a triple store with a reasoner, which is not always available. The syntax is also cumbersome.

A way to express this using JSON-LD mechanisms instead of OWL constraints would be desirable for me, because it would be implemented by a JSON-LD library, which needs much less compute power.

In JavaScript I can just do:

// Shared properties are declared once and spread into each subject.
const cmnprops = {comProp1: "comValue1", comProp2: "comValue2" /* etc. */}
const subject1 = {...cmnprops, difProp: "difValue1"}
const subject2 = {...cmnprops, difProp: "difValue2"}

Solution ideas for JSON-LD

I imagine something like this:

{
  "@context": {
    "my": "urn:myurl/",
    "ex": "http://example.org/",
    "cmnprops": {
      "ex:comProp1": {
        "@id": "ex:comValue1"
      },
      "ex:comProp2": {
        "@id": "ex:comValue2"
      },
      "ex:comProp3": {
        "@id": "ex:comValue3"
      }
    }
  },
  "@graph": [
    {
      "@id": "my:subject1",
      "@include/@import/@nest/...": "cmnprops",
      "ex:difProp": {
        "@id": "ex:difValue1"
      }
    },
    {
      "@id": "my:subject2",
      "@include/@import/@nest/...": "cmnprops",
      "ex:difProp": {
        "@id": "ex:difValue2"
      }
    }
  ]
}

OR something like this:

{
  "@context": {
    "my": "urn:myurl/",
    "ex": "http://example.org/"
  },
  "@graph": [
    {
      "@id": "_:cmnprops",
      "ex:comProp1": {
        "@id": "ex:comValue1"
      },
      "ex:comProp2": {
        "@id": "ex:comValue2"
      },
      "ex:comProp3": {
        "@id": "ex:comValue3"
      }
    },
    {
      "@id": "my:subject1",
      "@include/@import/@nest/...": "_:cmnprops",
      "ex:difProp": {
        "@id": "ex:difValue1"
      }
    },
    {
      "@id": "my:subject2",
      "@include/@import/@nest/...": "_:cmnprops",
      "ex:difProp": {
        "@id": "ex:difValue2"
      }
    }
  ]
}

Does JSON-LD offer a mechanism to address such a problem? Any hints would be appreciated.

Further details: https://stackoverflow.com/questions/67570340/modelling-highly-redundant-subjects-in-rdf

niklasl commented 2 years ago

You are correct that there is no such mechanism in JSON-LD itself, nor is there one for RDF in general (apart from OWL reasoning, as you noted).

There is, however, a pair of vocabulary terms, rdfa:copy and rdfa:Pattern, that define this kind of "prototypical inheritance", along with an algorithm. These were specifically designed for RDFa 1.1 in HTML, where the mechanism is known as property copying (see also the primer for more examples).

A much more specific and much less formal term is also defined in Schema.org: schema:model, which relates several specific schema:Products to a schema:ProductModel whose properties they have in common.

Frankly, I wouldn't really expect anything like this to be added to JSON-LD itself, mainly because JSON-LD avoids generating new data by design. Also, the potential risk of data explosion (reminiscent of an XML bomb) requires quite some care in implementation.

(FWIW, here is a dead simple but incomplete implementation which should at least avoid too much explosion. You can use it on any expanded JSON-LD; it isn't tied to the RDFa syntax in any way, only to the vocabulary terms mentioned.)
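To make the property-copying idea concrete, here is a minimal sketch in JavaScript. It is not the implementation linked above and not a full rendering of the RDFa algorithm. It assumes the input is a flat array of expanded JSON-LD node objects, and the function name copyPatterns and the sample data are hypothetical. Each pattern is copied in a single pass and patterns that reference other patterns are not followed, which keeps the output from growing without bound.

```javascript
// Vocabulary IRIs from the RDFa namespace.
const RDFA = "http://www.w3.org/ns/rdfa#";
const COPY = RDFA + "copy";
const PATTERN = RDFA + "Pattern";

// Hypothetical helper: expects a flat array of expanded JSON-LD node objects.
function copyPatterns(nodes) {
  // Index every node typed as rdfa:Pattern by its @id.
  const patterns = new Map();
  for (const node of nodes) {
    if ((node["@type"] || []).includes(PATTERN)) patterns.set(node["@id"], node);
  }

  const result = [];
  for (const node of nodes) {
    if (patterns.has(node["@id"])) continue; // pattern nodes are dropped from the output

    // Keep everything on the node except the rdfa:copy links themselves.
    const out = {};
    for (const [key, value] of Object.entries(node)) {
      if (key !== COPY) out[key] = value;
    }

    for (const ref of node[COPY] || []) {
      const pattern = patterns.get(ref["@id"]);
      if (!pattern) continue; // link to something that is not a pattern: ignored
      for (const [key, value] of Object.entries(pattern)) {
        if (key === "@id" || key === COPY) continue; // single pass, no nested patterns
        if (key === "@type") {
          const types = value.filter((t) => t !== PATTERN);
          if (types.length) out["@type"] = [...(out["@type"] || []), ...types];
          continue;
        }
        // Value objects are shared by reference, not deep-cloned.
        out[key] = [...(out[key] || []), ...value];
      }
    }
    result.push(out);
  }
  return result;
}

// Sample data modelled on the example in this issue, already in expanded form.
const doc = [
  {
    "@id": "_:cmnprops",
    "@type": [PATTERN],
    "http://example.org/comProp1": [{ "@id": "http://example.org/comValue1" }],
    "http://example.org/comProp2": [{ "@id": "http://example.org/comValue2" }],
  },
  {
    "@id": "urn:myurl/subject1",
    [COPY]: [{ "@id": "_:cmnprops" }],
    "http://example.org/difProp": [{ "@id": "http://example.org/difValue1" }],
  },
  {
    "@id": "urn:myurl/subject2",
    [COPY]: [{ "@id": "_:cmnprops" }],
    "http://example.org/difProp": [{ "@id": "http://example.org/difValue2" }],
  },
];

const expanded = copyPatterns(doc);
console.log(JSON.stringify(expanded, null, 2));
```

The stored file stays small because the common properties appear only once. The consumer pays for the copy step after JSON-LD expansion and before the data is used.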

gkellogg commented 2 years ago

This feature was considered in https://github.com/json-ld/json-ld.org/issues/650 and https://github.com/w3c/json-ld-syntax/issues/19. Ultimately, it was judged to be inconsistent with the general philosophy of JSON-LD. The RDFa version @niklasl alluded to, using rdfa:copy, was a response to the Microdata @itemRef capability. With a reasoning pass it could be used in any RDF syntax, but that is not generally done. A better option might be SPARQL CONSTRUCT or UPDATE. You could also potentially use JSON-LD framing with default values.

But, again as @niklasl discussed, it's not likely to be considered in the future for JSON-LD.

fennibay commented 2 years ago

Hi, many many thanks for the valuable resources and past discussions about this in JSON-LD.

@niklasl I didn't know about these RDFa mechanisms at all. As I understand it, these wouldn't require a reasoner to work, only some kind of simple script (like the one you sent) that does the expansion, acting as a very simple reasoner. Is that correct?

I then see the following ways:

Again many thanks.