drobilla / serd

A lightweight C library for RDF syntax
https://gitlab.com/drobilla/serd
ISC License

Unclear semantics of Serdi flag `-i` #9

Closed by wouterbeek 6 years ago

wouterbeek commented 6 years ago

I originally understood the semantics of the Serdi -i flag to be that the specified grammar must be used (and that, as a consequence, every violation of that grammar would produce a warning). However, it seems possible to parse some Turtle files that are not N-Triples using the N-Triples grammar, without any warning being emitted. For example, this command

```
$ serdi -i ntriples test.ttl
```

parses the following data file using the Turtle (and not the indicated N-Triples) grammar:

```
@prefix foaf:  <http://xmlns.com/foaf/0.1/> .

_:a  foaf:name   "Johnny Lee Outlaw" .
_:a  foaf:mbox   <mailto:jlow@example.com> .
_:b  foaf:name   "Peter Goodguy" .
_:b  foaf:mbox   <mailto:peter@example.org> .
_:c  foaf:mbox   <mailto:carol@example.org> .
```

drobilla commented 6 years ago

True. The reader is really a Turtle reader that happens to also read N-Triples. I could probably implement a new top-level document parser, but it has never proven useful to me to have it fail on a document it could successfully parse, and it fully passes the negative examples of both test suites.

wouterbeek commented 6 years ago

@drobilla I agree with your points. The problem remains that the semantics of -i ntriples is unclear.

As you indicate, -i ntriples does not mean that an N-Triples parser is used to parse the input data. (Indeed, [1] is parsed without warnings, even though it is not N-Triples.)

[1] prefix : <x:> :a :b :c
[2] prefix : <x:x> :a :b 1

-i ntriples seems to mean that the input file is sometimes parsed as Turtle (example [1]) and is sometimes parsed as N-Triples (example [2]). This means that -i ntriples will prevent some Turtle files that are not N-Triples files from being parsed without warnings. But what is the value of such a flag?

drobilla commented 6 years ago

It rejects many things not allowed in N-Triples, in particular fancy literals, abbreviations, and so on. The N-Triples negative tests will not pass if you parse as Turtle. It's just not absolutely strict, and I have not thoroughly audited the code to handle cases that the official test suite does not include.

That said, for this file, the latest git version does this:

```
$ ./build/serdi_static -i ntriples ./test.ttl
error: ./test.ttl:1:1: syntax does not support directives
```
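
To make the distinction discussed in this thread concrete, here is a minimal Python sketch that flags two Turtle-only constructs: directives and prefixed names. It is an illustration only, not how serd is implemented. The function name `turtle_only_constructs` and the regular expressions are assumptions made up for this example. It works line by line, does not handle multi-line literals, and is far from a complete N-Triples validator.

```python
import re

# Hypothetical helper for illustration; not part of serd or serdi.
# Matches @prefix/@base and SPARQL-style PREFIX/BASE at the start of a line.
DIRECTIVE = re.compile(r"^\s*@?(prefix|base)\s", re.IGNORECASE)
# Matches <...> IRIs and single-line "..." string literals.
IRI_OR_STRING = re.compile(r'<[^>]*>|"(?:[^"\\]|\\.)*"')


def turtle_only_constructs(text):
    """Return (line_number, reason) pairs for constructs N-Triples lacks."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        if DIRECTIVE.match(line):
            findings.append((number, "directive"))
            continue
        # Drop IRIs and string literals so their contents are not inspected,
        # then drop any trailing comment.
        bare = IRI_OR_STRING.sub(" ", line).split("#", 1)[0]
        for token in bare.split():
            # Blank node labels such as _:a are valid N-Triples; any other
            # token containing a colon is treated as a prefixed name.
            if ":" in token and not token.startswith("_:"):
                findings.append((number, "prefixed name " + token))
    return findings


SAMPLE = """@prefix foaf:  <http://xmlns.com/foaf/0.1/> .

_:a  foaf:name   "Johnny Lee Outlaw" .
_:a  foaf:mbox   <mailto:jlow@example.com> .
"""

for number, reason in turtle_only_constructs(SAMPLE):
    print(f"line {number}: {reason}")
```

On the first lines of the file from this issue, the sketch reports the directive on line 1 and the prefixed names `foaf:name` and `foaf:mbox` on the lines that follow. A strict N-Triples reader would be expected to reject all of these.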

wouterbeek commented 6 years ago

@drobilla Thanks for the clarifications! I'm closing this issue now.