The current util.from_n3() function has several bugs:
- multiline strings aren't parsed correctly:
>>> rdflib.util.from_n3('''"""multi
... line
... string"""@en''')
rdflib.term.Literal(u'""multi\nline\nstring""', lang='en')
should be:
rdflib.term.Literal(u'multi\nline\nstring', lang='en')
- the datatype is truncated:
>>> rdflib.util.from_n3('"foo"^^xsd:string')
rdflib.term.Literal(u'foo', datatype=rdflib.term.URIRef('sd:strin'))
- numbers and booleans aren't parsed correctly:
>>> rdflib.util.from_n3('42')
rdflib.term.BNode('42')
>>> rdflib.util.from_n3('true')
rdflib.term.BNode('true')
>>> rdflib.util.from_n3('false')
rdflib.term.BNode('false')
- given the invalid N3 syntax '"foo"@en^^xsd:something', the language tag is
chosen over the datatype (the opposite of what the existing notation3 parser does)
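
For illustration, the behaviour expected in the list above could be sketched roughly as follows. This is not the attached patch and not rdflib code: it is a stand-alone Python 3 sketch using only the standard library, and the names Lit, parse_term and _expand are hypothetical. It assumes that the xsd: prefix maps to the XML Schema namespace, that numbers and booleans should become typed literals, and that the datatype should win over the language tag (as the notation3 parser is described to do). Escape sequences inside strings are not handled.

```python
import re
from dataclasses import dataclass
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Lit:
    """Hypothetical stand-in for rdflib.term.Literal."""
    value: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


# Try the triple-quoted form first so its extra quotes do not end up in the value.
_QUOTED = re.compile(r'(?:"""(.*?)"""|"(.*?)")(.*)', re.DOTALL)
_INTEGER = re.compile(r"[-+]?\d+")
_DECIMAL = re.compile(r"[-+]?\d*\.\d+")


def _expand(name: str) -> str:
    """Resolve a datatype written as <iri> or xsd:local to a full IRI."""
    if name.startswith("<") and name.endswith(">"):
        return name[1:-1]  # strip the brackets only when they are present
    if name.startswith("xsd:"):
        return XSD + name[len("xsd:"):]
    return name


def parse_term(text: str) -> Lit:
    """Parse one literal-like N3 term; other term kinds are out of scope here."""
    match = _QUOTED.fullmatch(text)
    if match:
        body = match.group(1) if match.group(1) is not None else match.group(2)
        tag, _, datatype = match.group(3).partition("^^")
        if datatype:  # assumption: the datatype wins over a language tag
            return Lit(body, datatype=_expand(datatype))
        if tag.startswith("@"):
            return Lit(body, lang=tag[1:])
        return Lit(body)
    if text in ("true", "false"):
        return Lit(text, datatype=XSD + "boolean")
    if _INTEGER.fullmatch(text):
        return Lit(text, datatype=XSD + "integer")
    if _DECIMAL.fullmatch(text):
        return Lit(text, datatype=XSD + "decimal")
    raise ValueError("unsupported term: %r" % text)
```

With this sketch, parse_term('"foo"^^xsd:string') yields a Lit whose datatype is the full XSD string IRI instead of a truncated name, and parse_term('42') yields an integer-typed Lit rather than a blank node.
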
Attached is a Mercurial patch that fixes these issues and adds a couple
of test cases.
The test cases also include a simple wrapper that invokes the existing
parsers.notation3 parser to make sure both produce similar rdflib terms.
Original issue reported on code.google.com by joernhees2 on 25 Jan 2012 at 7:07