SWI-Prolog / packages-semweb

The SWI-Prolog RDF store
RDF 1.1, SPARQL and SWI7 literal handling #12

Closed JanWielemaker closed 9 years ago

JanWielemaker commented 9 years ago

Status

Literal handling in rdf_db is outdated and not practical. Currently, literals have the form

All literals are sorted, but the sort order bears little relation to the ordering defined by SPARQL. As a result, we cannot optimize SPARQL queries that use comparison operators.

Issues

By type:

In retrospect, I think the value of an XMLLiteral should be the (canonical) XML serialization rather than the DOM tree. Other types:

wouterbeek commented 9 years ago

Great proposal! Improving literal assertion and named graph retrieval will make Semweb more versatile as well as more compliant with RDF 1.1. Some comments:

Storing literals using values

In Semweb the following currently asserts exactly one triple, disregarding the datatype entirely:

?- rdf_assert(rdf:s, rdf:p, literal(1)).
?- rdf_assert(rdf:s, rdf:p, literal(type(xsd:integer,1))).
?- rdf_assert(rdf:s, rdf:p, literal(type(xsd:nonNegativeInteger,1))).
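To check how a given Semweb installation treats these three assertions, one could count the triples that result. This is a minimal sketch, assuming library(semweb/rdf_db) is installed and the rdf and xsd prefixes are registered as usual; count_after_asserts/1 is a hypothetical helper name, not part of the library.

```prolog
:- use_module(library(semweb/rdf_db)).
:- use_module(library(aggregate)).

% count_after_asserts(-Count): hypothetical helper for this thread.
% Clears any earlier rdf:s/rdf:p triples, asserts the three literals
% from the example above, and counts what rdf/3 then reports.
count_after_asserts(Count) :-
    rdf_retractall(rdf:s, rdf:p, _),
    rdf_assert(rdf:s, rdf:p, literal(1)),
    rdf_assert(rdf:s, rdf:p, literal(type(xsd:integer, 1))),
    rdf_assert(rdf:s, rdf:p, literal(type(xsd:nonNegativeInteger, 1))),
    aggregate_all(count, rdf(rdf:s, rdf:p, _), Count).
```

Running ?- count_after_asserts(N). reports the number of triples the store distinguishes for these three literal forms.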

Mixing lexical forms and values

Currently, Semweb does not correctly handle a mix of literals asserted by lexical expression and literals asserted by value.

The following should assert exactly one triple, since the lexical expression '1' maps to the value 1 according to the lexical-to-value mapping of the XSD integer datatype.

?- rdf_assert(rdf:s, rdf:p, literal(type(xsd:integer,1))).
?- rdf_assert(rdf:s, rdf:p, literal(type(xsd:integer,'1'))).

Actual result (two triples instead of one):

?- rdf(S, P, O).
S = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#s',
P = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#p',
O = literal(type('http://www.w3.org/2001/XMLSchema#integer', 1)) ;
S = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#s',
P = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#p',
O = literal(type('http://www.w3.org/2001/XMLSchema#integer', '1')).
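One way to make both forms denote the same literal is to map lexical expressions to values before asserting. The following is only a sketch under stated assumptions: it covers xsd:integer alone, and normalise_literal/2, lexical_integer/2, xsd_integer_type/1 and strip_sign/3 are hypothetical names, not rdf_db predicates.

```prolog
:- use_module(library(lists)).

% Assumption: only xsd:integer is handled in this sketch.
xsd_integer_type('http://www.w3.org/2001/XMLSchema#integer').

% normalise_literal(+Literal0, -Literal): replace an atom holding an
% xsd:integer lexical expression by the integer value it denotes.
% Any other literal is passed through unchanged.
normalise_literal(literal(type(D, Lex)), literal(type(D, Value))) :-
    xsd_integer_type(D),
    lexical_integer(Lex, Value),
    !.
normalise_literal(Literal, Literal).

% lexical_integer(+Lex, -Value): accept an optional sign followed by
% one or more ASCII digits, as in the xsd:integer lexical space.
lexical_integer(Lex, Value) :-
    atom(Lex),
    atom_codes(Lex, Codes),
    strip_sign(Codes, Sign, Digits),
    Digits = [_|_],
    forall(member(C, Digits), between(0'0, 0'9, C)),
    number_codes(Abs, Digits),
    Value is Sign * Abs.

% strip_sign(+Codes, -Sign, -Digits): split off a leading + or -.
strip_sign([0'-|T], -1, T) :- !.
strip_sign([0'+|T],  1, T) :- !.
strip_sign(Codes,    1, Codes).
```

With this helper, literal(type(D, '1')) and literal(type(D, 1)) both normalise to the same term when D is the full xsd:integer IRI, so the two assertions above would refer to one literal.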

Simple / plain literals

The terms "simple literal" and "plain literal" have both been dropped in RDF 1.1. Simple literals are now XSD strings. Plain literals without a language tag are now XSD strings as well, while plain literals with a language tag are now RDF language-tagged strings (rdf:langString).
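The RDF 1.1 mapping can be sketched over the literal term shapes used in this thread. This is an illustration only; rdf11_datatype/2 is a hypothetical name, and the literal(lang(Lang, Text)) shape is assumed to be the store's form for language-tagged literals.

```prolog
% rdf11_datatype(+Literal, -Datatype): hypothetical helper giving the
% RDF 1.1 datatype IRI for an old-style literal term.
% Language-tagged literals become rdf:langString.
rdf11_datatype(literal(lang(_Lang, _Text)),
               'http://www.w3.org/1999/02/22-rdf-syntax-ns#langString') :- !.
% Typed literals keep their declared datatype.
rdf11_datatype(literal(type(Datatype, _Value)), Datatype) :- !.
% Literals with neither a language tag nor a datatype become xsd:string.
rdf11_datatype(literal(_Text),
               'http://www.w3.org/2001/XMLSchema#string').
```

For example, ?- rdf11_datatype(literal(hello), D). binds D to the xsd:string IRI under these assumptions.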

wouterbeek commented 9 years ago

I tried to incorporate these points into the wiki: https://github.com/SWI-Prolog/packages-semweb/wiki/Proposal-for-Semweb-library-redesign

@JanWielemaker Can you check whether all your points were sufficiently included and then close this issue?