digitalbazaar / jsonld.js

A JSON-LD Processor and API implementation in JavaScript
https://json-ld.org/
Other
1.66k stars 197 forks source link

RDFJS compliance: blank node values should not be prepended by `_:` #244

Closed pietercolpaert closed 4 years ago

pietercolpaert commented 6 years ago

RDFJS defines in its specification that:

value blank node name as a string, without any serialization specific prefixes, e.g. when parsing, if the data was sourced from Turtle, remove "_:", if it was sourced from RDF/XML, do not change the blank node name (example: "blank3")

However, jsonld.js does not remove _: when executing the toRDF code. This can be found in multiple places. One example: https://github.com/digitalbazaar/jsonld.js/blob/c6501fbf44cd9d5a589212f9428c0573ae40ada2/lib/toRdf.js#L103

dlongley commented 4 years ago

Closing as not a bug. If there's an actual bug here, please re-open and provide a test case with expected output and actual output that does not match.

joeltg commented 4 years ago

I also care about this issue! Right now, anybody trying to integrate jsonld.js with any other popular RDF library like n3.js has to manually run some snippet like this on the output of toRDF:

const dataset = await jsonld.toRDF(doc, {})
for (const quad of dataset) {
    if (quad.subject.value.startsWith("_:")) {
        quad.subject.value = quad.subject.value.slice(2)
    }
    if (quad.object.value.startsWith("_:")) {
        quad.object.value = quad.object.value.slice(2)
    }
    if (quad.graph.value.startsWith("_:")) {
        quad.graph.value = quad.graph.value.slice(2)
    }
}
// *now* we can import `dataset` into an n3.js store...

... which is really annoying!

A concrete test case would be:

jsonld.toRDF({ "http://schema.org/knows": { "@id": "_:b1" } }, {})

which produces

[
  {
    subject: { termType: 'BlankNode', value: '_:b0' },
    predicate: { termType: 'NamedNode', value: 'http://schema.org/knows' },
    object: { termType: 'BlankNode', value: '_:b1' },
    graph: { termType: 'DefaultGraph', value: '' }
  }
]

but should produce

[
  {
    subject: { termType: 'BlankNode', value: 'b0' },
    predicate: { termType: 'NamedNode', value: 'http://schema.org/knows' },
    object: { termType: 'BlankNode', value: 'b1' },
    graph: { termType: 'DefaultGraph', value: '' }
  }
]

The relevant portion of the RDFJS spec was quoted in the original comment - the jsonld.js documentation doesn't explicitly mention RDFJS anywhere as far as I can tell, but it's clearly the intended representation.

Edit: I should add that I'd be happy to open a pull request fixing this in lib/toRdf.js if you're interested, although I don't understand the rest of the codebase well enough to know if anything else depends on the current behavior

tpluscode commented 8 months ago

@pietercolpaert would you mind reopening this? This is in fact a problem which bit me more than once

pietercolpaert commented 8 months ago

I cannot re-open this very issue - please check with @dlongley who indicated a couple of years ago a new issue with a proposal for a PR with a test case will do the trick.

Today I just use https://github.com/rubensworks/jsonld-streaming-parser.js instead

tpluscode commented 8 months ago

So do I, mostly. But for a quick hack, jsonld.promises.toRDF is tempting given its simplicity. And then _:_: happens :)

dlongley commented 8 months ago

PR #548 might address this, I'm not sure.

davidlehn commented 8 months ago

The rdf-canonize internals were changed to hopefully address this issue around here: https://github.com/digitalbazaar/rdf-canonize/commit/b36b1c9f1b6d2d2a050fb6f33898493999071b66. Fixing and updating jsonld.js to use the newer rdf-canonize has dragged on longer than expected (we're busy). Hopefully when that is merged it will fix the issue. That PR could certainly use some testing if you have the time.

Also I don't think jsonld.promises is needed anymore? The main calls should be async.