digitalbazaar / jsonld.js

A JSON-LD Processor and API implementation in JavaScript
https://json-ld.org/

Handle blank nodes without blank node prefix properly #515

Open MarcusElevait opened 1 year ago

MarcusElevait commented 1 year ago

Given quads with blank nodes like these:

[
  {
    "termType": "Quad",
    "subject": {
      "termType": "BlankNode",
      "value": "n3-5"
    },
    "predicate": {
      "termType": "NamedNode",
      "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#first"
    },
    "object": {
      "termType": "NamedNode",
      "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
    },
    "graph": {
      "termType": "DefaultGraph",
      "value": ""
    }
  },
  {
    "termType": "Quad",
    "subject": {
      "termType": "BlankNode",
      "value": "n3-5"
    },
    "predicate": {
      "termType": "NamedNode",
      "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#rest"
    },
    "object": {
      "termType": "BlankNode",
      "value": "n3-6"
    },
    "graph": {
      "termType": "DefaultGraph",
      "value": ""
    }
  },
  {
    "termType": "Quad",
    "subject": {
      "termType": "BlankNode",
      "value": "n3-6"
    },
    "predicate": {
      "termType": "NamedNode",
      "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#first"
    },
    "object": {
      "termType": "NamedNode",
      "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type1"
    },
    "graph": {
      "termType": "DefaultGraph",
      "value": ""
    }
  },
  {
    "termType": "Quad",
    "subject": {
      "termType": "BlankNode",
      "value": "n3-6"
    },
    "predicate": {
      "termType": "NamedNode",
      "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#rest"
    },
    "object": {
      "termType": "NamedNode",
      "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"
    },
    "graph": {
      "termType": "DefaultGraph",
      "value": ""
    }
  },
  {
    "termType": "Quad",
    "subject": {
      "termType": "NamedNode",
      "value": "http://example.com/ns#PersonShape"
    },
    "predicate": {
      "termType": "NamedNode",
      "value": "http://www.w3.org/ns/shacl#ignoredProperties"
    },
    "object": {
      "termType": "BlankNode",
      "value": "n3-5"
    },
    "graph": {
      "termType": "DefaultGraph",
      "value": ""
    }
  }
]

where the value of each blank node is not prefixed with a blank node prefix, I would expect the library to produce a proper JSON array like this:

{
    "http://www.w3.org/ns/shacl#ignoredProperties": [
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", 
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#type1"
    ]
}

when using the fromRDF function. But what we actually get is this:

[
  {
    "@id": "http://example.com/ns#PersonShape",
    "http://www.w3.org/ns/shacl#ignoredProperties": [
      {
        "@id": "n3-21"
      }
    ]
  },
  {
    "@id": "n3-21",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#first": [
      {
        "@id": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
      }
    ],
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#rest": [
      {
        "@list": [
          {
            "@id": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type1"
          }
        ]
      }
    ]
  }
]

As soon as I add the prefix '_:' to all blank node values, I get the proper JSON array output. The RDF/JS specification states that a blank node value should carry no prefix. Parsers that, for example, turn Turtle-serialized RDF into quads correctly strip the prefix from the value, but the resulting quads are then not compatible with this library.

So my question is: is it a goal of this library to stay in line with the RDF/JS specification? That is important for us to know when deciding whether we can use this library in the future.
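
For reference, the workaround described above (adding '_:' to every blank node value before conversion) could look roughly like the sketch below. It is not part of jsonld.js: `prefixBlankNode` and `prefixBlankNodes` are hypothetical helper names, and the sketch assumes each quad exposes `subject`, `predicate`, `object`, and `graph` terms with `termType` and `value` fields, as in the example quads.

```javascript
// Hypothetical helper (not a jsonld.js API): returns a plain BlankNode term
// whose value starts with '_:'. Non-blank terms and already-prefixed blank
// nodes are returned unchanged.
function prefixBlankNode(term) {
  if (term && term.termType === 'BlankNode' && !term.value.startsWith('_:')) {
    return {termType: 'BlankNode', value: '_:' + term.value};
  }
  return term;
}

// Hypothetical helper: builds new quad objects with prefixBlankNode applied
// to every position. The input quads are not modified.
function prefixBlankNodes(quads) {
  return quads.map(quad => ({
    termType: 'Quad',
    subject: prefixBlankNode(quad.subject),
    predicate: prefixBlankNode(quad.predicate),
    object: prefixBlankNode(quad.object),
    graph: prefixBlankNode(quad.graph)
  }));
}

// Usage sketch, assuming `quads` is the array from the original post:
// const doc = await jsonld.fromRDF(prefixBlankNodes(quads));
```

The helpers build new objects instead of mutating the parser's terms, so the original quads stay RDF/JS-conformant for any other tool that consumes them.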

davidlehn commented 1 year ago

I'm not sure of the history behind why the behavior is the way it is. It's quite possible the code here predates any formal RDF/JS spec on the matter, or perhaps it's just an oversight. If I had to guess, there are few tools other than rdf-canonize that depend on the dataset format here. (The JSON-LD tests are based on N-Quads.) As long as it's internally consistent, it's probably OK. If this were changed, I'm not sure what backwards compatibility would be needed, if any. I'm not against changing this; we just need to figure out some details.

MarcusElevait commented 1 year ago

I'm not sure I understand your answer correctly. The basic question is whether this library will stay consistent with the RDF/JS spec in the future. If yes, then the prefix issue from my original post should probably be fixed. If no, then we may have to implement our own solution.