Section B.3 of microdata has no example of original content reference

ppKrauss commented 2 years ago

An important case is expressing the semantics of the original HTML content with JSON-LD. To be didactic, the example must show that it is possible to cite the original content by its ID (via an intralink reference), instead of rewriting all the HTML content as a set of separate JSON values. Avoiding redundant and unsafe copies of content is an important good practice.

We can illustrate this by adding an ID to Example 164, something like:

...
 <dt>Title</dt>
 <dd><cite id="tit01" itemprop="http://purl.org/dc/elements/1.1/title">Just a Geek</cite></dd>
...

Example 165 would then be something like:

[
  {
    "@id": "http://purl.oreilly.com/works/45U8QJGZSQKDH8N",
    "@type": "http://purl.org/vocab/frbr/core#Work",
    "http://purl.org/dc/elements/1.1/title": {"@id":"#tit01"},
    "http://purl.org/dc/elements/1.1/creator": "Wil Wheaton",
    "http://purl.org/vocab/frbr/core#realization": [...]
  }, ...
]

Notes:

A typical JSON-LD document will be embedded in the HTML, in a <script type="application/ld+json"> block.
The "Web Semantic interpreter" tool will accept this kind of relative intralink reference... The example is important because it shows that tools must interpret intralinks (optionally performing an internal text-content conversion).
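
To make the first note concrete, here is a minimal sketch of how a tool might pull an embedded JSON-LD data block out of an HTML page, using only Python's standard library. The class name, the function name, and the sample page are hypothetical and only illustrate the idea; a real processor would also need to handle things like several script elements or invalid JSON.

```python
import json
from html.parser import HTMLParser


class JsonLdScriptExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html_text):
    """Return a list with one parsed JSON value per JSON-LD script element."""
    parser = JsonLdScriptExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.documents


# Hypothetical page combining the <cite id="tit01"> element with a data block.
SAMPLE_HTML = """
<dl>
 <dt>Title</dt>
 <dd><cite id="tit01">Just a Geek</cite></dd>
</dl>
<script type="application/ld+json">
{
  "@id": "http://purl.oreilly.com/works/45U8QJGZSQKDH8N",
  "http://purl.org/dc/elements/1.1/title": {"@id": "#tit01"}
}
</script>
"""

documents = extract_jsonld(SAMPLE_HTML)
```

Note that the extracted value still contains the reference `{"@id": "#tit01"}` unchanged: nothing in this step looks at the `<cite>` element.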

pchampin commented 2 years ago

You are right: JSON-LD embedded in a data block requires some duplication of information, while Microdata avoids that redundancy.

However, your JSON-LD snippet is not equivalent to the original example (nor to the corresponding Microdata, AFAICT). { "@id": "#tit01" } and { "@value": "Just a Geek" } (or its shortcut form "Just a Geek") are semantically distinct. So it cannot be advertised as a drop-in replacement for the current example.

ppKrauss commented 2 years ago

Hi @pchampin, I am assuming the DOM, something like JavaScript's document.getElementById("tit01").innerText property (sometimes equivalent to nodeValue)... Why do you say that it "is not equivalent to the original example"?

Is there any way to express it? It seems that you are suggesting something more complex, something like

[
  {
    "@id": "http://purl.oreilly.com/works/45U8QJGZSQKDH8N",
    "@type": "http://purl.org/vocab/frbr/core#Work",
    "http://purl.org/dc/elements/1.1/title": {
         "@value": {"@id":"#tit01"},
         "@type": "http://www.w3.org/2001/XMLSchema#string"
    },
    "http://purl.org/dc/elements/1.1/creator": "Wil Wheaton",
    "http://purl.org/vocab/frbr/core#realization": [...]
  }, ...
]

... But I also don't understand whether, in the @value context, the intralink loses its meaning (a pre-parser replacing the intralink with the node), or whether the "cast to string" uses something like innerText... Is there an explanation in the json-ld-syntax spec?

pchampin commented 2 years ago

It is not clear to me whether, in this issue, you are suggesting improving section B.3 w.r.t. the current state of the spec, or changing the spec to support a better alignment with Microdata... My answer above assumed the first option (no change in the spec).

Why do you say that it "is not equivalent to the original example"?

The two following JSON-LD snippets generate different RDF triples. That's what I meant by "not equivalent":

    "title": { "@id": "#tit01" }

vs.

   "title": "Just a Geek"

I am not denying that some pre- or post-processing could rebuild the 2nd one from the 1st one, but this is not standard JSON-LD processing.
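
The difference can be sketched in a few lines of Python. This is not the JSON-LD "Deserialize JSON-LD to RDF" algorithm, only a simplified illustration of which kind of RDF term each value form leads to when no @context type coercion is involved. The function name and the base IRI are assumptions made for the example.

```python
from urllib.parse import urljoin

BASE_IRI = "http://example.org/book.html"  # assumed document URL, not from this thread


def object_term(value, base=BASE_IRI):
    """Return a simplified N-Triples-like term for a JSON-LD property value."""
    if isinstance(value, dict) and "@id" in value:
        # Node reference: the relative IRI reference is resolved against the base IRI.
        return "<" + urljoin(base, value["@id"]) + ">"
    if isinstance(value, dict) and "@value" in value:
        value = value["@value"]
    if isinstance(value, str):
        # Plain string: an RDF literal (implicitly typed xsd:string).
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    raise ValueError("value form not covered by this sketch")


print(object_term({"@id": "#tit01"}))  # <http://example.org/book.html#tit01>
print(object_term("Just a Geek"))      # "Just a Geek"
```

The first value yields an IRI that identifies a fragment of the page, the second a string literal, so the two triples have different objects.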

It seems that you are suggesting something more complex, something like (...)

No, that's not what I was suggesting, mainly because this is invalid JSON-LD (the value of @value cannot be an object).
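
As a rough illustration of that constraint, the following Python sketch checks a single value object. It covers only the rule discussed here (in JSON-LD 1.1 the value of @value must be a string, a number, true, false, or null, unless @type is @json) and is not a full validator; the function name is hypothetical.

```python
def check_value_object(value_object):
    """Return a list of problems found in a JSON-LD value object (simplified)."""
    problems = []
    value = value_object.get("@value")
    # JSON literals ("@type": "@json") are the one case where objects and arrays are allowed.
    is_json_literal = value_object.get("@type") == "@json"
    if isinstance(value, (dict, list)) and not is_json_literal:
        problems.append("@value must be a string, number, true, false, or null")
    return problems


# The value object proposed earlier in this thread.
proposed = {
    "@value": {"@id": "#tit01"},
    "@type": "http://www.w3.org/2001/XMLSchema#string",
}
# The form used by the current example in the spec.
current = {"@value": "Just a Geek"}

proposed_problems = check_value_object(proposed)
current_problems = check_value_object(current)
```

Here `proposed_problems` contains one entry and `current_problems` is empty.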

gkellogg commented 2 years ago

JSON-LD is its own representation, and is not designed to interoperate with either Microdata or RDFa, other than to produce RDF triples, which may be merged with each other. I don't see these specifications doing anything specific to pull data out of the DOM, although some other system could be constructed to treat some contents within a JSON-LD block as matching elements from the DOM through some dynamic programming. But, that's outside the bounds of the JSON-LD specifications themselves.
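
For illustration only, such an "other system" could look like the Python sketch below: a non-standard pre-processing step that replaces same-document references such as {"@id": "#tit01"} with the text of the HTML element carrying that id. Everything here (class, function names, the decision to strip whitespace) is an assumption made for the example, and the step deliberately changes the data from an IRI reference to a string, which is exactly why it is not part of standard JSON-LD processing.

```python
from html.parser import HTMLParser


class TextById(HTMLParser):
    """Maps each element id to the text contained in that element (simplified)."""

    VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}

    def __init__(self):
        super().__init__()
        self._open = []  # stack of (tag, id-or-None) for currently open elements
        self.text = {}   # id -> accumulated text

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return
        element_id = dict(attrs).get("id")
        self._open.append((tag, element_id))
        if element_id is not None:
            self.text.setdefault(element_id, "")

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray end tags.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i][0] == tag:
                del self._open[i:]
                break

    def handle_data(self, data):
        for _, element_id in self._open:
            if element_id is not None:
                self.text[element_id] += data


def index_text_by_id(html_text):
    parser = TextById()
    parser.feed(html_text)
    parser.close()
    return parser.text


def resolve_intralinks(node, text_by_id):
    """Recursively replace {"@id": "#x"} with the text of the element whose id is x."""
    if isinstance(node, list):
        return [resolve_intralinks(item, text_by_id) for item in node]
    if isinstance(node, dict):
        ref = node.get("@id")
        if (len(node) == 1 and isinstance(ref, str)
                and ref.startswith("#") and ref[1:] in text_by_id):
            return text_by_id[ref[1:]].strip()
        return {key: resolve_intralinks(val, text_by_id) for key, val in node.items()}
    return node


html_text = '<dl><dt>Title</dt><dd><cite id="tit01">Just a Geek</cite></dd></dl>'
data = {
    "@id": "http://purl.oreilly.com/works/45U8QJGZSQKDH8N",
    "http://purl.org/dc/elements/1.1/title": {"@id": "#tit01"},
    "http://purl.org/dc/elements/1.1/creator": "Wil Wheaton",
}
resolved = resolve_intralinks(data, index_text_by_id(html_text))
```

After this step the title property of `resolved` holds the string taken from the `<cite>` element, while the node's own "@id" and the creator are left untouched. References to ids that do not exist in the page are kept as they are.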

ppKrauss commented 2 years ago

Thanks @gkellogg, sorry for my misinterpretation; I will close the issue. Final questions:

About tools: it seems that any "Web Semantic interpreter tool" working with Embedding JSON-LD in HTML Documents is free (?) to "match elements from the DOM"... But this is not a standard procedure, and it is out of scope for version 1.1 of the JSON-LD specification.
Can I submit a new issue here, with a suggested change for a future version, such as 1.2?
The goal would be to expand support for embedding in HTML and to align better with HTML+RDFa and Microdata. When embedding, some transformations could be used to replace intralinks without changing their meaning as Linked Data (and preserving the HTML+RDFa meaning).

gkellogg commented 2 years ago

You might check on other fora, such as Stack Overflow, for any prior art in using the DOM to supplement JSON-LD. Of course, one way to do it would be to parse all the JSON-LD and Microdata/RDFa out of a web page, merge them into a single graph, and then serialize that back out as JSON-LD, which is in perfect keeping with all the various standards, but this does involve using Microdata and/or RDFa markup in addition to the JSON-LD. As long as the identifiers for the various entities align and can be reconciled (i.e., they're not blank nodes), this should work just fine.
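
The merge step can be pictured with a small Python sketch in which each graph is a set of (subject, predicate, object) tuples. This is a simplification under stated assumptions: the extraction of triples from each syntax is not shown, which triple comes from which syntax is made up for the example, and only IRI-identified subjects are used, since blank nodes would need extra care when merging.

```python
WORK = "http://purl.oreilly.com/works/45U8QJGZSQKDH8N"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical split: which triple comes from which syntax is assumed here.
from_microdata = {(WORK, DC + "title", '"Just a Geek"')}
from_jsonld = {
    (WORK, DC + "creator", '"Wil Wheaton"'),
    (WORK, DC + "title", '"Just a Geek"'),  # same triple as in the Microdata graph
}


def merge_graphs(*graphs):
    """Merge graphs given as iterables of (subject, predicate, object) tuples."""
    merged = set()
    for graph in graphs:
        merged.update(graph)  # identical triples collapse into one
    return merged


def describe(graph, subject):
    """Group the objects of one subject by predicate."""
    description = {}
    for s, p, o in sorted(graph):
        if s == subject:
            description.setdefault(p, []).append(o)
    return description


merged = merge_graphs(from_microdata, from_jsonld)
work_description = describe(merged, WORK)
```

Because both graphs use the same IRI for the work, the statements end up attached to a single node, and the duplicated title triple appears only once in `merged`.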

Of course, feel free to add issues suggesting features for a future version of JSON-LD. But, I don't believe that intermingling the different formats belongs in JSON-LD Syntax or API, but rather in some other document that depends on the component specifications. The JSON-LD CG would be the best place to foster such work, IMO.

ppKrauss commented 2 years ago

Thanks @pchampin and @gkellogg.

PS: sending to JSON-LD CG, as suggested.

w3c / json-ld-syntax

Section B.3 of microdata has no example of original content reference #382