linkeddata / rdflib.js

Linked Data API for JavaScript
http://linkeddata.github.io/rdflib.js/doc/

rdf:Lists should be parsed into RDF triples #169

Open HolgerKnublauch opened 7 years ago

HolgerKnublauch commented 7 years ago

I noticed that parsing RDF lists (with Turtle) produces a Collection node, which does not comply with standard RDF when queried via statementsMatching. I believe there should be at least a mode in which rdf:Lists are treated as standard rdf:first/rdf:rest triples. Collections could still remain as a convenience layer, produced on the fly.
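
As a rough illustration of a convenience layer produced on the fly, the sketch below walks rdf:first/rdf:rest triples and collects the members into an array. It does not use rdflib.js: the triples are plain objects, and the helper name `listToArray`, the blank-node labels and the example.org IRIs are assumptions made up for this example.

```javascript
const RDF = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';

// Hypothetical helper (not an rdflib.js API): follow rdf:first/rdf:rest links
// from a list head and collect the members into a plain array.
function listToArray(triples, head) {
  const elements = [];
  const seen = new Set(); // guards against cyclic rdf:rest chains
  let node = head;
  while (node !== RDF + 'nil') {
    if (seen.has(node)) throw new Error('cyclic list at ' + node);
    seen.add(node);
    const first = triples.find(t => t.s === node && t.p === RDF + 'first');
    const rest = triples.find(t => t.s === node && t.p === RDF + 'rest');
    if (!first || !rest) throw new Error('malformed list at ' + node);
    elements.push(first.o);
    node = rest.o;
  }
  return elements;
}

// A two-member list expressed as standard triples (illustrative data).
const triples = [
  { s: '_:b0', p: RDF + 'first', o: 'http://example.org/a' },
  { s: '_:b0', p: RDF + 'rest', o: '_:b1' },
  { s: '_:b1', p: RDF + 'first', o: 'http://example.org/b' },
  { s: '_:b1', p: RDF + 'rest', o: RDF + 'nil' },
];

const elements = listToArray(triples, '_:b0');
console.log(elements);
```

With a helper along these lines, the store could keep only standard triples and still hand developers an array when they ask for one.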

retog commented 6 years ago

Just noticed that the problem isn't specific to Turtle.

The RDF/XML

<?xml version="1.0"?>
<rdf:RDF xmlns:ex="http://ex.org/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
   <rdf:Description rdf:about="http://www.w3.org/">
     <ex:members rdf:parseType="Collection">
        <ex:Member rdf:about="#SweetFruit" />
        <ex:Member rdf:about="#NonSweetFruit" />
     </ex:members>
   </rdf:Description>
</rdf:RDF>

should evaluate to 7 triples (try https://www.w3.org/RDF/Validator/), but when parsed with rdflib it yields only 3 triples. The result of graph.toNT() is:

{<http://www.w3.org/> <http://ex.org/members> _:0 .
<http://base.org/#SweetFruit> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ex.org/Member> .
<http://base.org/#NonSweetFruit> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ex.org/Member> .}
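
For comparison, the standard expansion of the two-member collection can be sketched in plain JavaScript. This is a minimal illustration that does not use rdflib.js; the helper name `expandList` and the blank-node labels `_:l0` and `_:l1` are made up for the example. Each member contributes one rdf:first and one rdf:rest triple, which accounts for the four triples missing from the output above.

```javascript
const RDF = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';

// Hypothetical helper (not part of rdflib.js): expand an array of member
// terms into rdf:first/rdf:rest triples and return the head node with them.
function expandList(members) {
  const triples = [];
  if (members.length === 0) {
    return { head: `<${RDF}nil>`, triples }; // the empty list is rdf:nil itself
  }
  members.forEach((member, i) => {
    const node = `_:l${i}`;
    const rest = i + 1 < members.length ? `_:l${i + 1}` : `<${RDF}nil>`;
    triples.push([node, `<${RDF}first>`, member]);
    triples.push([node, `<${RDF}rest>`, rest]);
  });
  return { head: '_:l0', triples };
}

const members = [
  '<http://base.org/#SweetFruit>',
  '<http://base.org/#NonSweetFruit>',
];
const { head, triples: listTriples } = expandList(members);

// One triple linking to the list, two rdf:type triples, four list triples.
const allTriples = [
  ['<http://www.w3.org/>', '<http://ex.org/members>', head],
  ...members.map(m => [m, `<${RDF}type>`, '<http://ex.org/Member>']),
  ...listTriples,
];

allTriples.forEach(t => console.log(t.join(' ') + ' .'));
console.log(allTriples.length); // 7
```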

timbl commented 6 years ago

Yes, the way rdflib has always worked is to have lists as first class objects which you can iterate over, which has advantages from the programming language and API point of view.

Two possible solutions

timbl commented 6 years ago

Why does it have list objects? Because RDF is, among other things, a language like Python or JS or JSON, and to not have arrays as first-class types in the language is crazy. (Unless you are used to a Lisp machine with CAR and CDR pairs, I suppose.) From a developer's point of view, you need to be able to treat the thing in the normal way in your language:

var kids = g.the(me, ns.foo('children')).elements
console.log('They have ' + kids.length + ' kids')
kids.forEach(kid => console.log(kid))

(It would be even more natural if Collection inherited from Array, instead of having a JS property 'elements' which is an Array.)
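
A minimal sketch of what an Array-inheriting collection might look like, assuming a hypothetical class named `ArrayCollection` (this is not the rdflib.js Collection class, and the member values are placeholders):

```javascript
// Hypothetical class, not the rdflib.js Collection: a list term that is
// itself an Array, so no separate `elements` property is needed.
class ArrayCollection extends Array {
  get termType() {
    return 'Collection';
  }
}

// Placeholder values standing in for RDF terms.
const kids = ArrayCollection.from(['Alice', 'Bob']);

console.log('They have ' + kids.length + ' kids');
kids.forEach(kid => console.log(kid));
console.log(Array.isArray(kids)); // true
console.log(kids.termType); // 'Collection'
```

One thing to be aware of with this design: array methods such as map and filter would return ArrayCollection instances too, which may or may not be what callers expect.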

retog commented 6 years ago

An easier and clearer name than .reifyCollections() would be .toRDF(), which converts the rdflib.js data model (with lists as first-class objects) to RDF (where lists are described with triples).

retog commented 6 years ago

A workaround for the missing toRDF() method is to serialize as N-Triples (not using toNT(), which just omits the information that it is a list) and parse again.

let rdfXml = '<?xml version="1.0"?>\n\
<rdf:RDF xmlns:ex="http://ex.org/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">\n\
   <rdf:Description rdf:about="http://www.w3.org/">\n\
     <ex:members rdf:parseType="Collection">\n\
        <ex:Member rdf:about="#SweetFruit" />\n\
        <ex:Member rdf:about="#NonSweetFruit" />\n\
     </ex:members>\n\
   </rdf:Description>\n\
</rdf:RDF>'
let graph = $rdf.graph();
$rdf.parse(rdfXml, graph, 'http://base.org/', 'application/rdf+xml')
console.log(graph.length); // => 3
let nTriples = $rdf.serialize(undefined, graph, 'http://base.org/', 'application/n-triples');
let graph2 = $rdf.graph();
//rdflibjs doesn't know how to parse application/n-triples yet
$rdf.parse(nTriples, graph2, 'http://base.org/', 'text/turtle')
console.log(graph2.length); // => 7

(I didn't understand the first argument of the serialize method, but passing undefined seems to work.)

rescribet commented 6 years ago

I just noticed that, unlike the N3 and Turtle parsers, the N-Quads parser doesn't seem to convert the triples to a collection (I couldn't test N-Triples, but I suspect it's the same). The use case is to show a select input from a shacl:in list property.

rescribet commented 4 years ago

Apart from breaking compatibility with RDF, this also creates an incompatibility with the RDFJS task force spec by introducing a new value for termType (this makes cross-project TypeScript a lot harder).

Implementing the RDF modifications by using a special datatype (e.g. rdflib:inlineArray) might strike a good balance between not breaking with external specifications and keeping the convenience features for developers.

interface ArrayLiteral extends Literal {
  termType: "Literal";
  datatype: "<rdflib:inlineArray>";  // Discriminated union, allows setting it apart from other literals
  elements: Array<NamedNode | BlankNode | Literal>; // Dev ex
  value: string; // probably some getter which joins the elements to the turtle syntax 
}

This would require the parsers to check for the datatype field rather than termType, but that seems like a minor change.
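
The datatype check could look roughly like the sketch below. It uses plain objects rather than rdflib.js terms, and it assumes the datatype is an RDFJS-style NamedNode whose `value` holds the IRI (the interface above writes it as a string instead). The function name `isArrayLiteral` and the example.org IRIs are hypothetical.

```javascript
// Assumed datatype IRI, taken from the proposal above; not part of any spec.
const INLINE_ARRAY = 'rdflib:inlineArray';

// Hypothetical check: a Literal whose datatype marks it as an inline array.
function isArrayLiteral(term) {
  return term.termType === 'Literal' &&
    term.datatype != null &&
    term.datatype.value === INLINE_ARRAY &&
    Array.isArray(term.elements);
}

// An ordinary string literal.
const plain = {
  termType: 'Literal',
  value: 'hello',
  datatype: { termType: 'NamedNode', value: 'http://www.w3.org/2001/XMLSchema#string' },
};

// A literal carrying list members, as in the proposal.
const list = {
  termType: 'Literal',
  value: '( <http://example.org/a> <http://example.org/b> )',
  datatype: { termType: 'NamedNode', value: INLINE_ARRAY },
  elements: [
    { termType: 'NamedNode', value: 'http://example.org/a' },
    { termType: 'NamedNode', value: 'http://example.org/b' },
  ],
};

console.log(isArrayLiteral(plain)); // false
console.log(isArrayLiteral(list)); // true
```

Code that only understands RDFJS terms would still see `list` as an ordinary Literal, which is the compatibility benefit the proposal is after.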