json-ld / json-ld-star

CG Note on JSON-LD*
https://json-ld.github.io/json-ld-star

allow `"@annotation": "<IRI>"` in term definition #27

Open pchampin opened 2 years ago

pchampin commented 2 years ago

The rationale is the following (adapted from this message):

{
    "@context": [
        "https://schema.org/",
        { "ex": "http://example.com/" }
    ],
    "@id": "ex:bob",
    "ex:captain": {
        "@id": "ex:bowls_club",
        "@annotation": {
            "ex:realization": {
                "@type": "Event",
                "startDate": "01-01-2021",
                "endDate": "31-12-2021"
            }
        }
    }
}

Since "ex:realization" is expected to be used exclusively as an annotation, one might want to write:

{
    "@context": [
        "https://schema.org/",
        { "ex": "http://example.com/",
          "realization": { "@annotation": "ex:realization" } }
    ],
    "@id": "ex:bob",
    "ex:captain": {
        "@id": "ex:bowls_club",
        "realization": {
            "@type": "Event",
            "startDate": "01-01-2021",
            "endDate": "31-12-2021"
        }
    }
}

This would be consistent with how @reverse works in term definitions.
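To make the intended equivalence concrete, here is a minimal Python sketch of how the second form could be rewritten into the first. It is not part of any JSON-LD processor: the function names `annotation_terms` and `lift_annotations` are made up for illustration, remote contexts are not fetched, and the sketch assumes any pre-existing "@annotation" value is a single object.

```python
def annotation_terms(context):
    """Collect terms whose definition uses the proposed "@annotation" key."""
    terms = {}
    entries = context if isinstance(context, list) else [context]
    for entry in entries:
        if not isinstance(entry, dict):
            continue  # remote contexts (plain IRI strings) are skipped in this sketch
        for term, definition in entry.items():
            if isinstance(definition, dict) and "@annotation" in definition:
                terms[term] = definition["@annotation"]
    return terms


def lift_annotations(node, terms):
    """Return a copy of `node` with annotation terms moved under "@annotation"."""
    if isinstance(node, list):
        return [lift_annotations(item, terms) for item in node]
    if not isinstance(node, dict):
        return node
    result, lifted = {}, {}
    for key, value in node.items():
        if key == "@context":
            result[key] = value  # never rewrite the context itself
        elif key in terms:
            lifted[terms[key]] = lift_annotations(value, terms)
        else:
            result[key] = lift_annotations(value, terms)
    if lifted:
        # Assumption: an existing "@annotation" value is a single object.
        result.setdefault("@annotation", {}).update(lifted)
    return result


doc = {
    "@context": [
        "https://schema.org/",
        {"ex": "http://example.com/",
         "realization": {"@annotation": "ex:realization"}},
    ],
    "@id": "ex:bob",
    "ex:captain": {
        "@id": "ex:bowls_club",
        "realization": {
            "@type": "Event",
            "startDate": "01-01-2021",
            "endDate": "31-12-2021",
        },
    },
}

rewritten = lift_annotations(doc, annotation_terms(doc["@context"]))
```

After this call, the value of "ex:captain" carries an "@annotation" object keyed by "ex:realization", as in the first example. The sketch does not check that the annotated term actually appears on the object of a property, which a real implementation would presumably have to do.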

edited: the initial proposal (and title) was a mistake (and there was a bug in the 2nd example)

niklasl commented 1 year ago

I think this is a good idea, as it allows for "retrofitting" JSON terms that are not about the object into annotation terms.

At the National Library of Sweden we have come across cases where provenance information has been conflated within blank node structures. A minimal abstract example would be:

{
  "@id": "a",
  "classification": {
    "@type": "Classification",
    "code": "X",
    "generatedBy": "some-process"
  }
}

Here, some-process did not produce the classification "X" itself; it produced the assignment of that classification to the subject a (which, in our cases, is commonly a text that has been analysed). When "X" is linked as x, this error becomes apparent:

[
  {
    "@id": "a",
    "classification": {"@id": "x"}
  },
  {
    "@id": "x",
    "@type": "Classification",
    "code": "X", "generatedBy": {"@id": "some-process"}
  }
]

The above says that the classification itself has been generated, which is wrong. (The classification x can be assigned to many subjects, some done by cataloguers, some by automatic processes, or a guided combination thereof.)

Its assignment to a is the intended meaning:

[
  {
    "@id": "a",
    "classification": {"@id": "x", "@annotation": {"generatedBy": {"@id": "some-process"}}}
  },
  {
    "@id": "x",
    "@type": "Classification",
    "code": "X"
  }
]

I believe this proposal would make it possible to take the compact JSON-LD of the original example and make it equivalent to this:

{
  "@id": "a",
  "classification": {
    "@type": "Classification",
    "code": "X",
    "@annotation": {"generatedBy": {"@id": "some-process"}}
  }
}

through a context along the lines of:

{
  "@context": {
    "@vocab": "http://example.org/vocab/",
    "generatedBy": {
      "@annotation": "http://example.org/vocab/generatedBy",
      "@type": "@id"
    }
  }
}

which would then "survive" the subsequent linking.
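As a rough illustration of what "surviving" the linking means, the following Python sketch takes the annotated embedded form and splits it into a reference plus a standalone node. The helper name `link_embedded` is hypothetical; the point is only that "@annotation" stays with the reference (the assignment to a) and not with the node x.

```python
def link_embedded(subject, prop, node_id):
    """Replace the node embedded under `prop` with a reference to `node_id`.

    Any "@annotation" stays on the reference, so it keeps describing the
    statement (subject, prop, node) and not the node itself.
    """
    embedded = dict(subject[prop])  # shallow copy; the input is left untouched
    annotation = embedded.pop("@annotation", None)
    reference = {"@id": node_id}
    if annotation is not None:
        reference["@annotation"] = annotation
    linked_subject = {**subject, prop: reference}
    standalone = {"@id": node_id, **embedded}
    return [linked_subject, standalone]


annotated = {
    "@id": "a",
    "classification": {
        "@type": "Classification",
        "code": "X",
        "@annotation": {"generatedBy": {"@id": "some-process"}},
    },
}

linked = link_embedded(annotated, "classification", "x")
```

Here `linked` has the same shape as the two-node array given earlier as the intended meaning: generatedBy sits in the "@annotation" of the reference to x, and the node x keeps only "@type" and "code".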

I can imagine some variants of the exact form here. Perhaps "@nest": "@annotation" would be clearer? (I can't really see a case where you'd also want to use @nest in the first compact form, as that seems to imply someone trying to separate generatedBy structurally.) I also notice that @container won't work: just as I use @type coercion to @id in this example, someone might want to turn the value into an array by default as well.

Until JSON-LD-star becomes a standard, we can "get by" by ignoring the generatedBy term via the "generatedBy": null construct in our context.
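The effect of that workaround can be approximated in a few lines of Python. This is only a sketch that emulates the outcome on plain dictionaries: a real JSON-LD processor drops terms mapped to null during expansion, and the function name `drop_ignored_terms` is made up here.

```python
def drop_ignored_terms(node, context):
    """Remove every key that the context explicitly maps to null (None)."""
    ignored = {term for term, definition in context.items() if definition is None}

    def strip(value):
        if isinstance(value, list):
            return [strip(item) for item in value]
        if isinstance(value, dict):
            return {k: strip(v) for k, v in value.items() if k not in ignored}
        return value

    return strip(node)


context = {
    "@vocab": "http://example.org/vocab/",
    "generatedBy": None,  # JSON null: the term is ignored
}

data = {
    "@id": "a",
    "classification": {
        "@type": "Classification",
        "code": "X",
        "generatedBy": "some-process",
    },
}

cleaned = drop_ignored_terms(data, context)
```

The provenance is lost in `cleaned`, which avoids asserting the wrong statement about x but also discards the information the annotation would have preserved.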

Another interesting example we are exploring is for representing diffs using RDF-star. See this diff example in my LD visualization tool. Use the controls in the bottom right corner to view the data using different syntaxes (and note that the data is editable, so you can e.g. paste the examples above into the textarea). As seen in the JSON-LD format, it becomes rather repetitive to use @annotation for the addedIn and removedIn keys. The feature in this proposal would allow for this to become much more compact.

pchampin commented 1 year ago

@niklasl great use-cases. Yes, that's exactly what this proposal is trying to solve.

I am not too sure, though, about the "@nest": "@annotation" variant syntax. I think it would mean the opposite of what we are trying to convey here:

niklasl commented 1 year ago

@pchampin You're absolutely right, @nest is quite the opposite, structurally.

I think what slightly bothers me is the requirement of using either @annotation or @id exclusively in a term definition. But (as you note in the proposal) this is already the case since 1.0 with @reverse, so the rule is already established.

(Were we to design it all from scratch, I might suggest using @id for all these cases, and allowing an array to represent a "term path" for @reverse and @annotation respectively; e.g. "subClass": {"@id": ["@reverse", "isSubClassOf"]} and "generatedBy": {"@id": ["@annotation", "/vocab/generatedBy"]}. But at this stage, I'm not sure whether normalizing this while keeping the old pattern for compatibility is an improvement or just a complication. It would also open the door to regular property paths, which are sometimes asked for but would be a complex addition.)
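Purely as a thought experiment, the "term path" idea could be read as normalizing every term definition into a (mode, IRI) pair. The Python sketch below assumes that reading; neither the array form of @id nor the function `term_path` exists in any specification, and only single-step paths are handled.

```python
MODES = ("@id", "@reverse", "@annotation")


def term_path(definition):
    """Normalize a term definition to a (mode, iri) pair.

    Accepts the existing exclusive keys ("@id", "@reverse"), the proposed
    "@annotation" key, and the hypothetical array form of "@id".
    """
    if isinstance(definition, str):
        return ("@id", definition)
    present = [key for key in MODES if key in definition]
    if len(present) != 1:
        raise ValueError("exactly one of @id, @reverse, @annotation is expected")
    key = present[0]
    value = definition[key]
    if key == "@id" and isinstance(value, list):
        mode, iri = value  # hypothetical ["@reverse" | "@annotation", IRI] form
        if mode not in ("@reverse", "@annotation"):
            raise ValueError("unsupported term path step: " + repr(mode))
        return (mode, iri)
    return (key, value)


old_style = term_path({"@annotation": "/vocab/generatedBy", "@type": "@id"})
new_style = term_path({"@id": ["@annotation", "/vocab/generatedBy"]})
```

Both calls yield the same pair, which is the sense in which the two syntaxes would be interchangeable; supporting longer arrays is where full property paths, and their complexity, would come in.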