Closed jacoscaz closed 7 months ago
Data Islands very much are a W3C REC. Not only that, they represent the de facto semantic web in 2024, via schema.org:
https://www.w3.org/TR/json-ld11/#embedding-json-ld-in-html-documents
I have a stub of a similar library, getj here:
Demo:
IMHO RDFa (and XHTML) are technical debt that holds back projects that need to support these old, less popular formats. Solid is a good example. RDFa holds it back: developers don't want to join, and those that joined before walked away, because modern web devs want to use JSON.
A good example being Solid. RDFa holds it back, developers don't want to join, and those that joined before walked away, because modern web devs want to use JSON.
Do you have data to back up this claim or is this just your opinion?
If it helps any, I (heavily involved in the RDFa WG, and RDFa API author) ripped out RDFa from many pages (100 million+) and moved our setups to JSON-LD in data islands (billions of pages).
For interest, they all utilize the data islands as data in JS as well.
for (const match of html_string.matchAll(/<script[^>]*?type="application\/ld\+json"[^>]*?>(.*?)<\/script>/sig)) {
  // match[1] is the raw JSON-LD text between the script tags
  console.log(match[1]);
}
globalThis.di = Array.from(document.querySelectorAll('[type="application/ld+json"]'))
  .map(function (island) { return [island.id, JSON.parse(island.text)]; })
  .reduce(function (obj, item) {
    obj[item[0]] = item[1];
    return obj;
  }, {});
A good example being Solid. RDFa holds it back, developers don't want to join, and those that joined before walked away, because modern web devs want to use JSON.
Do you have data to back up this claim or is this just your opinion?
A bit of both. I founded the Solid Community Group and am in touch with many people there, and before it. I also have traffic statistics from Reddit. I created the biggest and most popular Solid Pod, and ran it for 1/4 of a decade until I got sick. I also follow the GitHub interest in Solid. While the project is extremely well funded, developer interest has waned from its peak. RDFa is hard to work with, and web developers like JSON. RDFa is also enormously buggy. Compare the triples on your own WebID in the RDFa with those in the Turtle. They were not the same, last I checked. I'm sure it will all get fixed eventually given the long runway that Solid has, but working with JSON allows other projects in the open (social) web to progress enormously fast. I helped onboard thousands of developers onto the open (social) web, and JSON is one of the big sellers. People will look at Solid and say "interesting" but then go and work on a JSON project.
Try this:
npx getj <uri_with_data_island>
for example
npx getj https://spux.org/getj/test.html
gives
{
"@context": "http://schema.org",
"@type": "WebPage",
"url": "https://example.com",
"name": "Example Web Page"
}
If there's interest, I can donate this npm library to the CG and we can collaborate on a function that will extract data islands from the command line, browser, or server.
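As a possible starting point for such a shared function, here is a minimal sketch. It is not getj's actual code: the name `parseDataIslands` is hypothetical, and it assumes the page has already been fetched into a string and that the `type` attribute value is quoted. Because it only touches strings and `JSON.parse`, it should run unchanged on the command line, in a browser, or on a server.

```javascript
// Hypothetical helper, not part of getj: extracts JSON-LD data islands
// from an HTML string using only a regular expression and JSON.parse.
function parseDataIslands(html) {
  // Assumption: the type attribute is quoted and no attribute value contains ">".
  const pattern =
    /<script[^>]*\stype\s*=\s*["']application\/ld\+json["'][^>]*>(.*?)<\/script>/gis;
  const results = [];
  for (const match of html.matchAll(pattern)) {
    try {
      results.push(JSON.parse(match[1]));
    } catch (err) {
      // Skip islands that are not valid JSON instead of failing the whole page.
    }
  }
  return results;
}

// Inline sample mirroring the JSON shown in the output above.
const sampleHtml = `<html><head>
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "WebPage",
  "url": "https://example.com",
  "name": "Example Web Page"
}
</script>
</head><body></body></html>`;

const sampleIslands = parseDataIslands(sampleHtml);
console.log(JSON.stringify(sampleIslands[0], null, 2));
```

In Node 18+ or a browser, the input string could come from `await (await fetch(url)).text()`. Like any regex approach, this ends each island at the first `</script>` it sees.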
@webr3 @melvincarvalho both of your implementations rely on a full-blown DOM/HTML5 parser, though, as provided by either the browser or by dependencies. Ugly and hack-ish as it is, my code doesn't rely on anything but the obvious JSON-LD parser one would need anyway.
IMHO, compared to RDFa, which has its own media type and doesn't force a client to rely on heuristics, Data Islands (or Blocks, according to the JSON-LD spec) make sense only if they allow devs to dispense with the complexity of parsing HTML5 or, worse, of an in-memory DOM representation. Otherwise one would already be 80% there to RDFa support.
Ugly and hack-ish as it is, my code doesn't rely on anything but the obvious JSON-LD parser one would need anyway.
Mine was indeed an ugly hack too. But we could make a half decent library if we work together, I suspect.
Would anyone object to converting this issue into a discussion?
/chair hat off
Hi everyone. Often, particularly when it comes to formats, the discussion touches upon whether RDF data islands can be a valid alternative to RDFa for picking a format readable by both humans and machines alike. Let's leave aside, for a moment, the fact that data islands are not a W3C REC and let's focus on the technical side of this issue.
Now, an obligatory disclaimer: nothing in this issue is an attempt at forcing such a format upon the WebID Spec, whatever form that takes. I am, however, interested in your opinion as to the pros and cons of each.
In my humble opinion, data islands are, indeed, much friendlier than RDFa but only insofar as they can be parsed out of HTML without a full-blown DOM/HTML5 parser. To that end, the following code demonstrates a way to do so:
Granted, the above is a crude, inefficient quick hack and it is incapable of supporting edge cases such as a data island that contains a </script> within a JSON-LD string literal. Nonetheless, at least in my case, the above would be more than enough functionally to consider using JSON-LD data islands rather than RDFa. I think a state machine could be made that would be capable of quickly getting to data islands while discarding everything else and still be orders of magnitude less complex than full DOM/HTML parsing.
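To make the state machine idea a bit more concrete, here is a rough sketch under a few assumptions. The name `extractDataIslands` is hypothetical, the input is the whole page as a string, and HTML comments and CDATA sections are not handled. It moves through three states: skipping text until a `<script` start tag, walking that tag's attributes while honoring quotes, and scanning the script body. Inside a JSON-LD body it tracks JSON string literals, so a `</script>` inside a string does not end the island. That goes beyond what a browser does, since a browser would close the element at that point.

```javascript
// Hypothetical sketch of a DOM-free scanner for JSON-LD data islands.
function extractDataIslands(html) {
  const islands = [];
  const openTag = /<script(?=[\s>\/])/gi; // start of a script start tag
  const typeAttr =
    /(?:^|\s)type\s*=\s*(?:"application\/ld\+json"|'application\/ld\+json'|application\/ld\+json(?=\s|$))/i;
  let match;

  // State 1: skip everything until the next "<script".
  while ((match = openTag.exec(html)) !== null) {
    // State 2: walk the start tag up to its ">", ignoring ">" inside quotes.
    let i = openTag.lastIndex;
    let quote = null;
    while (i < html.length && (quote !== null || html[i] !== '>')) {
      if (quote !== null) {
        if (html[i] === quote) quote = null;
      } else if (html[i] === '"' || html[i] === "'") {
        quote = html[i];
      }
      i++;
    }
    if (i >= html.length) break; // unterminated start tag
    const isJsonLd = typeAttr.test(html.slice(openTag.lastIndex, i));

    // State 3: scan the body for "</script". For JSON-LD bodies, track
    // JSON string literals so the end tag is only recognized outside them.
    const bodyStart = i + 1;
    let inString = false;
    let escaped = false;
    let end = -1;
    for (let j = bodyStart; j < html.length; j++) {
      const ch = html[j];
      if (inString) {
        if (escaped) escaped = false;
        else if (ch === '\\') escaped = true;
        else if (ch === '"') inString = false;
      } else if (isJsonLd && ch === '"') {
        inString = true;
      } else if (ch === '<' && html.slice(j, j + 8).toLowerCase() === '</script') {
        end = j;
        break;
      }
    }
    if (end === -1) break; // unterminated element, or an unbalanced JSON quote

    if (isJsonLd) {
      try {
        islands.push(JSON.parse(html.slice(bodyStart, end)));
      } catch (err) {
        // Not valid JSON: skip this island.
      }
    }
    openTag.lastIndex = end + 8; // resume after "</script"
  }
  return islands;
}
```

A known weakness of this sketch: an island with an unbalanced quote swallows the rest of the input, so a production version would probably want a fallback to the first `</script>`.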
Thoughts?