SWI-Prolog / packages-semweb

The SWI-Prolog RDF store

How does the RDF/XML parser deal with buggy data? #53

Closed wouterbeek closed 7 years ago

wouterbeek commented 7 years ago

The following RDF/XML file contains a bug: the predicate term uses the rdfs prefix, but the file never declares that namespace.

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xml:base="https://example.org/">
  <rdf:Description rdf:about=''>
    <rdfs:seeAlso rdf:resource="https://example.org/info" />
  </rdf:Description>
</rdf:RDF>

Loading this file with Semweb gives no warning/exception but results in an incorrect predicate term:

?- [library(semweb/rdf11)].
?- rdf_load('test.rdf').
% Loaded "test.rdf" in 0.00 sec; 1 triples
?- rdf(S, P, O).
S = 'https://example.org/',
P = rdfsseeAlso,
O = 'https://example.org/info'

I was expecting the parser to (1) emit a warning and (2) exclude this particular triple from the result set. There is also a max_warnings option, but setting it to -1 or 0 gives the same result.
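As a stopgap after loading, the bogus triple can be located and dropped by looking for predicates that are not absolute IRIs. This is only a sketch under stated assumptions: the predicate names `bad_predicate/1` and `drop_bad_predicate_triples/0` are hypothetical helpers, not part of Semweb, and the check relies on `uri_is_global/1` from `library(uri)` rejecting a term such as `rdfsseeAlso` because it has no scheme.

```prolog
:- use_module(library(semweb/rdf11)).
:- use_module(library(uri)).

% bad_predicate(-P): P occurs as a predicate in the store
% but is not an absolute IRI (it has no scheme part).
% Hypothetical helper, not part of the Semweb library.
bad_predicate(P) :-
    rdf(_, P, _),
    \+ uri_is_global(P).

% Remove every triple whose predicate fails the check above.
% Hypothetical helper, not part of the Semweb library.
drop_bad_predicate_triples :-
    forall(bad_predicate(P),
           rdf_retractall(_, P, _)).
```

Calling `drop_bad_predicate_triples` after `rdf_load('test.rdf')` should remove the triple shown above. It does not replace a parser warning; it only cleans up afterwards.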

JanWielemaker commented 7 years ago
For built-in help, use ?- help(Topic). or ?- apropos(Word).

6 ?- [library(semweb/rdf11)].
true.

7 ?- rdf_load('test.rdf').
ERROR: SGML2PL(xmlns): file:///ufs/wielemak/Bugs/RDF/test.rdf:4: namespace "rdfs" does not exist
% Parsed "test.rdf" in 0.00 sec; 1 triples
true.

Could it be you have stuff loaded that intercepts messages?
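One way to check this, as a rough sketch: start Prolog with `swipl -f none` so that no personal initialisation file is loaded, then list the clauses of the `message_hook/3` hook in module `user`. If the listing shows no clauses, nothing is intercepting messages through that hook.

```prolog
% Show any user-defined clauses that intercept printed messages.
?- listing(user:message_hook/3).
```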

wouterbeek commented 7 years ago

I don't understand what's going on here:

$ swipl
?- [library(semweb/rdf_db)].
?- rdf_load('test.rdf').
ERROR: SGML2PL(xmlns): file:///home/wbeek/test.rdf:4: namespace "rdfs" does not exist
% Parsed "test.rdf" in 0.00 sec; 1 triples
?- halt.
wbeek@laptop:~$ swipl
?- [library(semweb/rdf_db)].
?- rdf_load('test.rdf').
% Loaded "test.rdf" in 0.00 sec; 1 triples

Notice that I now get the error message the first time I load the file, but not the second time (after restarting swipl). I do not have anything fancy, such as message hooks, defined.

JanWielemaker commented 7 years ago

Fails to reproduce. Tried about 10 times. Ran under valgrind, which doesn't report any access to uninitialized memory (the common cause for such issues).