w3c / json-ld-syntax

JSON-LD 1.1 Specification
https://w3c.github.io/json-ld-syntax/

Specify a node type when interpreting JSON as JSON-LD (was: use of "@type" in "@context") #386

Open ioggstream opened 2 years ago

ioggstream commented 2 years ago

Question

It's not clear to me whether it's legitimate to use @type in @context instead of in the JSON object, e.g.

"@context":
  "@vocab": "https://schema.org/",
  birthplace:
    "@type": "City"
birthplace:
  name: Roma

since the JSON-LD Playground renders this as

_:c14n0 <https://schema.org/birthplace> _:c14n1 .
_:c14n1 <https://schema.org/name> "Roma" .

while I expect

_:c14n0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://schema.org/City> .
_:c14n0 <https://schema.org/name> "Roma" .
_:c14n1 <https://schema.org/birthplace> _:c14n0 .

Note

changing the payload to

"birthplace": "Roma"

results in

_:c14n0 <https://schema.org/birthplace> "Roma"^^<https://schema.org/City> .

meaning that the "City" type is processed in some way.

pchampin commented 2 years ago

@type is a bit polysemic in JSON-LD, which can be confusing (you are not the first one).

Type coercion cannot be used to automatically add an rdf:type arc to a graph. Its purpose is only to change the kind of node generated from a string value: coercing to @id turns the string into an IRI reference, while coercing to a datatype IRI turns it into a typed literal (which is exactly the "Roma"^^<https://schema.org/City> behaviour you observed).
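
For a concrete illustration, here is a minimal sketch of the two supported coercions, neither of which adds an rdf:type arc. It assumes the pyld library; the example.org IRI is made up for the example.

from pyld import jsonld

# Coercion to "@id": the string value becomes an IRI node, not a literal.
coerce_to_iri = {
    "@context": {
        "@vocab": "https://schema.org/",
        "birthplace": {"@type": "@id"}
    },
    "birthplace": "https://example.org/Roma"
}

# Coercion to a datatype IRI: the string value becomes a typed literal,
# which is the "Roma"^^<https://schema.org/City> behaviour observed above.
coerce_to_literal = {
    "@context": {
        "@vocab": "https://schema.org/",
        "birthplace": {"@type": "City"}
    },
    "birthplace": "Roma"
}

for doc in (coerce_to_iri, coerce_to_literal):
    print(jsonld.to_rdf(doc, {"format": "application/n-quads"}))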

ioggstream commented 2 years ago

The use case is the following:

I need to provide a framework to create JSON-LD annotated payloads out of existing JSON API payloads. Currently I'm able to use @context for that, and the missing core feature is attaching a class to an object.

Is it possible to register and publish an extension keyword for JSON-LD (e.g. a @class keyword) to achieve that goal?

cc: @ccattuto

gkellogg commented 2 years ago

There is no such registration mechanism, although the API spec does have some processing considerations for things that look like keywords.

In the past, when people were trying to use @type in a JSON-LD context to coerce a node object to have an @type, we've pointed them to use JSON-LD Framing, which is specifically intended to allow the transformation of JSON-LD documents, including adding default values for things like @type.
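
As a hedged sketch of that suggestion (using the pyld library; whether a default object on @type is honored depends on the processor's JSON-LD 1.1 framing support, so treat this as something to verify against the Framing spec):

from pyld import jsonld

doc = {
    "@context": {"@vocab": "https://schema.org/"},
    "birthplace": {"name": "Roma"}
}

# Frame with a default object: if the matched node has no @type,
# the framed output should carry the default value instead.
frame = {
    "@context": {"@vocab": "https://schema.org/"},
    "birthplace": {
        "@type": {"@default": "City"}
    }
}

print(jsonld.frame(doc, frame))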

ioggstream commented 2 years ago

we've pointed them to use JSON-LD Framing, which is specifically intended to allow the transformation of JSON-LD documents

I used Framing to transform RDF into plain JSON objects; I'll check if there's a way to use it for transforming JSON into RDF.

ioggstream commented 2 years ago

@pchampin

Type coercion can not be used to automatically add an rdf:type arc to a graph

Ok.

It seems this is a recurring issue.

Still, in my perspective, this makes it quite complex to add metadata to existing APIs without impacting the currently exchanged payloads (to me, rdf:type is metadata, but IIUC it can be considered actual data). Something along the lines of the following would express what I'm after:

"@context":
  "@vocab": "https://schema.org/",
  birthplace:
    "@a": "http://schema.org/City"
birthplace:
  name: Roma

->

_:c14n0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://schema.org/City> .
_:c14n0 <https://schema.org/name> "Roma" .
_:c14n1 <https://schema.org/birthplace> _:c14n0 .

@gkellogg While I have read the threads, it's still not clear to me how framing can help here.

cc: @giorgialodi @msporny @pietercolpaert

pchampin commented 2 years ago

this is not supported because the spec does not want @context to add triples, right?

Exactly. And this choice is not a matter of security, but more one of bounding complexity. JSON-LD does not aim to let you express arbitrary JSON-to-RDF transformations.

in my perspective, rdf:type is a metadata

That's an interesting way to look at it... but the stance taken by JSON-LD is that it is just another arc.

Another possible way to solve your problem would be to rely on RDFS and some form of (light) inference. What you want to encode in your context looks very much like an rdfs:range declaration; more precisely:

schema:birthplace rdfs:range schema:City.
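
A minimal sketch of that approach, assuming rdflib and the owlrl package (illustrative choices, not something the spec mandates): the range statement lives in the vocabulary, and RDFS entailment adds the rdf:type arc on the consumer side.

from rdflib import Graph
from owlrl import DeductiveClosure, RDFS_Semantics

data = """
@prefix schema: <https://schema.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Vocabulary statement (not part of the payload or the @context):
schema:birthplace rdfs:range schema:City .

# Triples produced from the JSON-LD payload:
_:person schema:birthplace _:place .
_:place schema:name "Roma" .
"""

g = Graph().parse(data=data, format="turtle")

# RDFS entailment (rule rdfs3): rdfs:range plus the birthplace arc
# yields _:place rdf:type schema:City .
DeductiveClosure(RDFS_Semantics).expand(g)

print(g.serialize(format="turtle"))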
ioggstream commented 2 years ago

@pchampin thanks for your reply.

the stance taken by JSON-LD is that it is just another arc

Ok.

How is it possible to express rdfs:range in @context without modifying the json payload?

As an example, this patched API editor can transform a JSON payload to RDF using the information provided by a @context: it just lacks the type information (possibly derived from rdfs:range) needed to retrieve data from the SPARQL server.

pchampin commented 2 years ago

How is it possible to express rdfs:range in @context without modifying the json payload?

rdfs:range should go neither in the @context nor in the payload. It should be part of the definition of the vocabulary your JSON-LD context maps to. (Ok, in the example above this didn't quite work, as you cannot add the rdfs:range triple to schema.org...)

To rephrase my proposal: you do not explicitly add the rdf:type arc to your graph, but you declare your vocabulary/ontology in such a way that the rdf:type can be inferred by the consumers of your data.

Mind you: I perfectly understand that this option may not work in all contexts (some users will not have the ability to perform any kind of inference, not even one as light as RDFS). But I think it is worth putting it on the table.

ioggstream commented 2 years ago

@pchampin now it is clear :) Actually we provide something like that in the patched editor above.

In some cases, a given field can have multiple possible classes. Other times, we would like a schema designer to start from an rdf:type and identify properties starting from there.

IIUC this is something that's not going to be supported in JSON-LD.

Since a preliminary goal is to attach a semantic context to OAS3-described APIs, I'll just limit myself to the information available via the context.

Next, I'll identify a way to describe the remaining information, such as rdf:type, in OAS3 documents.

Thanks again for your time Pierre-Antoine.

@gkellogg if this is a definitive won't-fix, please feel free to close.

cc: @giorgialodi @pietercolpaert @ccattuto

msporny commented 2 years ago

@ioggstream wrote:

IIUC this is something that's not going to be supported in JSON-LD.

Correct. However, what we recommend people do to support the sort of use case you're describing is to write a pre-processor or post-processor. The feature you're describing is a fairly simple function, written in most languages, that just searches the document structure for specific keywords (either before or after JSON-LD processing) and injects new nodes into the graph. So, you could wait around for the JSON-LD WG to decide to implement, standardize, test, and publish a new standard... OR, you could just implement it in 10 lines of code or so. :)

My suggestion is you write some code to do this, point to the code, and perhaps others will find your code useful as a JSON-LD pre/post-processor.
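
For what it's worth, a minimal sketch of such a pre-processor (the TYPE_HINTS mapping and the add_types name are made up for the example, not part of any spec): it walks the JSON payload before JSON-LD processing and injects "@type" where a property is known to point at a given class.

TYPE_HINTS = {
    # Hypothetical mapping from JSON property names to the class of their values.
    "birthplace": "https://schema.org/City",
}

def add_types(node):
    # Recursively inject "@type" into plain JSON objects before JSON-LD processing.
    if isinstance(node, dict):
        for key, value in node.items():
            if key in TYPE_HINTS and isinstance(value, dict):
                value.setdefault("@type", TYPE_HINTS[key])
            add_types(value)
    elif isinstance(node, list):
        for item in node:
            add_types(item)
    return node

payload = {
    "@context": {"@vocab": "https://schema.org/"},
    "birthplace": {"name": "Roma"}
}

add_types(payload)
# payload["birthplace"] now also carries "@type": "https://schema.org/City"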

ioggstream commented 2 years ago

write a pre-processor...10 LoC

Yep, but...

We're working on a national API catalogue, and we want to add the ability for API providers to attach semantic information to their existing JSON Schema documents. The goal is to define a syntax that can describe all the information needed to interpret JSON as JSON-LD for existing APIs, so before writing code the issue for me is how to describe the nodes to be added.

I wrote a little gdoc with the problem statement...

rob-metalinkage commented 2 years ago

Thanks @ioggstream for the doc. I'm also looking into this in the context of a community with a series of JSON schemas (often competing in scope due to different histories) and OpenAPI usage. I have looked at the requirements and logic here: Domain model interoperability - a work in progress. This thread has been helpful in confirming my understanding of the scope of json-ld-framing, at least - thanks.

I see lots of statements about "you will need a parser" etc., but I haven't seen any articulation of the overall architecture being discussed - only bottom-up analyses of options and challenges. The default architecture of the web is client -> server with some negotiation -> resources with links to other resources. I was excited when I found this: https://www.w3.org/standards/webarch/principles ... but it didn't last long :-(

ioggstream commented 2 years ago

@rob-metalinkage

I'm also looking into this in the context of a community with a series of JSON schemas

Interesting! Could you share more details about your goals (e.g. examples without sensitive information ...)?

I haven't seen any articulation of the overall architecture being discussed - only bottom-up analyses of options and challenges

The idea of the document is to provide a syntax to support an architecture based on REST APIs. Since different API providers can have different information architectures, the goal was not to constrain them to a specific view (some people consider REST a significant constraint too :) ).

Domain model interoperability..

Thanks for the link, I added some content (mainly clarification stuff).

rob-metalinkage commented 2 years ago

For an overview of the current state, which is a transition from XML-based RPC-style services to OpenAPI, see https://ogcapi.ogc.org/. The reach goals we are exploring seem similar: to identify patterns for semantic annotation that allow such APIs to offer and deliver data that can be (at least partially) understood by clients or intermediary services, and hence found, accessed, etc. (FAIR principles).

namedgraph commented 6 days ago

Was looking into JSON-LD as a possible mapping/transformation mechanism but trivial issues like this make it clear that it's not powerful enough.

msporny commented 6 days ago

Was looking into JSON-LD as a possible mapping/transformation mechanism but trivial issues like this make it clear that it's not powerful enough.

JSON-LD can't be everything to everyone; as @pchampin said earlier in this thread, all languages make decisions to bound complexity.

Another way to put it is: JSON-LD is already complex enough without adding features like this to meet use cases that aren't (yet?) harming JSON-LD's adoption. On the contrary, the latest numbers from Common Crawl seem to indicate that the JSON-LD feature set is enough to see its usage across hundreds of millions of websites.

That said, I do think JSON-LD currently has features that add little value and, ideally, we'd remove those features. For example, property-based indexing, node identifier indexing, and node type indexing have not seen significant usage. The feature being discussed in this thread could fall into the same category, but I can also see why it would be useful. The question is: will it be useful enough to define it? We all know how difficult, if not impossible, it is to remove features from languages once they're there.

With JSON-LD 1.0, we had to guess at a fair amount of this "is it a useful feature" stuff (we didn't have usage data when we were doing v1.0), and in hindsight, it's now clear that we guessed wrong on a small subset of features (because they're not being used like we thought they were going to be used). That led to code/processor complexity that could be optimized out (if the group can muster the momentum to do so -- always a difficult ask of any group to CUT features that have been deployed). JSON-LD v2.0 should do this, though I expect I'm in the minority in suggesting that as a course of action for JSON-LD v2.0.

IOW, the way to convince the group that a new feature is required is to provide some code for the feature you want, get a community together that uses/depends on the code, and then use that as a driver for the usefulness of a feature.

We may want to take a polyfill approach to all future JSON-LD feature requests, and to see whether standardizing hooks into JSON-LD processors (enabling @ioggstream to do what he's suggesting without impacting the core standard) would lead to better outcomes. The process would be something like:

  1. Polyfill a new JSON-LD feature (maybe via special processing of @ keywords) and demonstrate some community adoption of the feature.
  2. Use that data to drive changes to the core spec.

That approach feels more sustainable than trying to convince the WG to add a feature when there are only a handful of examples of the feature being useful to a subset of the community.