It seems that the language annotation 'en-US' is causing the problem; perhaps the '-' is
the tricky part :]. Cosmin
Original comment by cosmin.b...@gmail.com
on 16 Jul 2011 at 7:44
I am confused - rdflib does not support nquads :)
But benosteen seems to have written a parser:
https://github.com/benosteen/RDFLib-NQuads-parser/blob/master/rdflib_nquads.py
Merging this into rdflib would probably make sense.
Original comment by gromgull
on 19 Aug 2011 at 12:21
I added benosteen's nquad parser in an nquads branch, and also a serializer.
http://code.google.com/p/rdflib/source/browse/?name=nquads
Now I didn't actually check if this fixes your issue - will do soon.
Original comment by gromgull
on 20 Aug 2011 at 7:50
It doesn't fix the issue but, by my reading of the RDF runes, RDFLib is correct
(pace a less-than-useful error message) in only permitting lowercase language
tags because uppercase would seem to be contraindicated ...
The W3C's RDF testcases document
(http://www.w3.org/TR/rdf-testcases/#langString) presents this BNF spec, to
which the RDFLib ntriples parser currently conforms, and some clarifying rubric:
"""
literal ::= langString | datatypeString
langString ::= '"' string '"' ( '@' language )?
datatypeString ::= '"' string '"' '^^' uriref
language ::= [a-z]+ ('-' [a-z0-9]+ )*
encoding a language tag.
...
optionally a language tag as defined by [RFC-3066], normalized to lowercase.
Note: The case normalization of language tags is part of the description of the
abstract syntax, and consequently the abstract behaviour of RDF applications.
It does not constrain an RDF implementation to actually normalize the case.
Crucially, the result of comparing two language tags should not be sensitive to
the case of the original input.
"""
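To make the quoted grammar and the case-insensitivity note concrete, here is a minimal sketch in plain Python. It does not use RDFLib itself; `conforms` and `same_language` are hypothetical helper names introduced only for illustration, and the regex is a direct transcription of the `language` production above.

```python
import re

# The `language` production from the quoted BNF, transcribed as a regex.
# As written in the spec, it only admits lowercase letters and digits.
LANGUAGE = re.compile(r"[a-z]+(?:-[a-z0-9]+)*")


def conforms(tag):
    """Return True if `tag` matches the `language` production exactly."""
    return LANGUAGE.fullmatch(tag) is not None


def same_language(a, b):
    """Compare two language tags without regard to case.

    Hypothetical helper: this is one simple way to honour the spec's note
    that comparisons should not be sensitive to the case of the input.
    """
    return a.lower() == b.lower()


if __name__ == "__main__":
    print(conforms("en-us"))
    print(conforms("en-US"))
    print(same_language("en-US", "en-us"))
```

Under this reading, "en-US" is rejected by the grammar even though it would compare equal to "en-us" once case is ignored.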
To have the quad successfully parsed by the current code, the OP should simply
lowercase the second language tag:
"square kilometer"@en-us
For RDFLib to accept uppercase in language subtags (the parts after the first
'-'; the primary tag would still have to be lowercase), this small change to the
ntriples.py "litinfo" regex would be needed ...
-litinfo = r'(?:@([a-z]+(?:-[a-z0-9]+)*)|\^\^' + uriref + r')?'
+litinfo = r'(?:@([a-z]+(?:-[A-Za-z0-9]+)*)|\^\^' + uriref + r')?'
I'm unsure what the ramifications of this latter change would be for any
existing RDFLib-provided support for language tag comparisons.
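A quick way to see the effect of the change is to run both versions of the regex against a few literal suffixes. This is only a sketch: the `uriref` pattern below is an assumed stand-in for the one defined in ntriples.py, and `accepts` is a hypothetical helper, not part of RDFLib.

```python
import re

# Assumed stand-in for the `uriref` pattern that ntriples.py defines.
uriref = r'<([^:]+:[^\s"<>]+)>'

# The two versions of `litinfo` from the diff above.
old_litinfo = re.compile(r'(?:@([a-z]+(?:-[a-z0-9]+)*)|\^\^' + uriref + r')?')
new_litinfo = re.compile(r'(?:@([a-z]+(?:-[A-Za-z0-9]+)*)|\^\^' + uriref + r')?')


def accepts(pattern, suffix):
    """Return True if `pattern` consumes the whole literal suffix."""
    return pattern.fullmatch(suffix) is not None


if __name__ == "__main__":
    for suffix in ("@en-us", "@en-US", "@EN-us"):
        print(suffix, accepts(old_litinfo, suffix), accepts(new_litinfo, suffix))
```

Note that the patched pattern still rejects an uppercase primary tag such as "@EN-us", and it does nothing about comparison: any code that compares language tags would presumably need to normalize case itself.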
Original comment by gjhigg...@gmail.com
on 24 Oct 2011 at 3:26
Original issue reported on code.google.com by
cosmin.b...@gmail.com
on 16 Jul 2011 at 7:43