kuzudb / kuzu

Embeddable property graph database management system built for query speed and scalability. Implements Cypher.
https://kuzudb.com/
MIT License

Correct support of @base, @prefix, BASE, PREFIX directives in Turtle files (about the `.` at the end of these lines) #2789

Open semihsalihoglu-uw opened 7 months ago

semihsalihoglu-uw commented 7 months ago

According to the Turtle specification, the BASE and PREFIX directives, unlike the @base and @prefix directives, must not have a . at the end. This is the NOTE in Section 2.4 of the spec: "NOTE The '@prefix' and '@base' directives require a trailing '.' after the IRI, the equalivent 'PREFIX' and 'BASE' must not have a trailing '.' after the IRI part of the directive."

Consider the following example:

BASE <http://example.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> 
@prefix rel: <http://www.perceive.net/schemas/relationship/> .

<#green-goblin>
    rel:enemyOf <#spiderman> ;
    a foaf:Person ;    # in the context of the Marvel universe
    foaf:name "Green Goblin" .

<#spiderman>
    rel:enemyOf <#green-goblin> ;
    rdf:type foaf:Person ;
    foaf:name "Spiderman", "Человек-паук"@ru .

In the example above, the BASE line is valid, and so are the @prefix rdfs and @prefix rel lines. The PREFIX rdf line is invalid because it has a trailing ., and the @prefix foaf line is invalid because it is missing one. Currently we accept all of these. Let's comply with the specification. My suggestion is to fail with a clear error message, stop parsing, and roll back.
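To make the rule concrete, here is a minimal Python sketch of the check. It is not Kuzu's parser: `check_directive` and the regular expression are hypothetical, and the sketch assumes each directive sits on a single line and uses a simplified IRI and prefix-label grammar (real Turtle allows directives to span lines and has a richer prefix-name syntax).

```python
import re
from typing import Optional

# Simplified, hypothetical pattern for a single-line directive:
#   keyword, optional prefix label (e.g. "rdf:"), IRI in angle brackets,
#   optional trailing '.', optional '#' comment.
# '@prefix'/'@base' are case-sensitive; 'PREFIX'/'BASE' are case-insensitive.
_DIRECTIVE = re.compile(
    r"^\s*(?P<kw>@prefix|@base|(?i:PREFIX|BASE))\s+"
    r"(?:(?P<ns>(?:[A-Za-z][\w.-]*)?:)\s*)?"
    r"(?P<iri><[^<>\s]*>)\s*(?P<dot>\.)?\s*(?:#.*)?$"
)


def check_directive(line: str) -> Optional[str]:
    """Return an error message if `line` is a directive with a wrong
    trailing '.', or None if it is fine or not a single-line directive."""
    m = _DIRECTIVE.match(line)
    if m is None:
        return None  # not a directive this sketch recognises
    kw = m.group("kw")
    has_dot = m.group("dot") is not None
    if kw.startswith("@") and not has_dot:
        return f"'{kw}' directive requires a trailing '.' after the IRI"
    if not kw.startswith("@") and has_dot:
        return f"'{kw}' directive must not have a trailing '.' after the IRI"
    return None


if __name__ == "__main__":
    # The five directive lines from the example in this issue.
    example = [
        "BASE <http://example.org/>",
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> ",
        "@prefix rel: <http://www.perceive.net/schemas/relationship/> .",
    ]
    for lineno, text in enumerate(example, start=1):
        problem = check_directive(text)
        if problem is not None:
            print(f"line {lineno}: {problem}")
```

Run against the example, this flags only the PREFIX rdf and @prefix foaf lines; a real implementation would perform the same check in the tokenizer rather than line by line.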

andyfengHKU commented 7 months ago

As we discussed, we won't be able to throw exceptions only for prefix and base.

A long-term solution is to provide a strict parsing mode in which we throw a RuntimeException for malformed lines.
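As a rough illustration of what such a mode could look like, here is a small Python sketch. Everything in it is an assumption for discussion: `load_lines`, the `strict` flag, and the `is_malformed` callback are hypothetical names, and Python's `RuntimeError` stands in for our RuntimeException. In lenient mode malformed lines are skipped; in strict mode the first malformed line aborts the load.

```python
from typing import Callable, Iterable, List


def load_lines(
    lines: Iterable[str],
    is_malformed: Callable[[str], bool],
    strict: bool = False,
) -> List[str]:
    """Collect well-formed lines.

    `is_malformed` is a caller-supplied (hypothetical) predicate.
    Lenient mode silently skips malformed lines; strict mode raises
    on the first one and reports its 1-based line number.
    """
    accepted: List[str] = []
    for lineno, line in enumerate(lines, start=1):
        if is_malformed(line):
            if strict:
                raise RuntimeError(
                    f"line {lineno}: malformed line: {line.strip()!r}"
                )
            continue  # lenient mode: skip and keep going
        accepted.append(line)
    return accepted
```

Because strict mode raises before returning anything, the caller can treat the whole load as failed and discard partial results.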