RDFLib / rdflib

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
https://rdflib.readthedocs.org
BSD 3-Clause "New" or "Revised" License

N-Quads parser doesn't adhere to the N-Quads standard: '<>' #2354

Closed by sdasda7777 1 year ago

sdasda7777 commented 1 year ago

If you take a look at the N-Quads grammar, you will notice that it clearly says [10] IRIREF ::= '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>', which means <> and <isweird> are both valid IRIREFs. However, when I run this code:

from rdflib.graph import Dataset

data="""<> <isweird> <http://true>."""

ds = Dataset()
ds.parse(data=data, format="nquads")

print(ds.serialize(format="json-ld"))

I get this error:

> <isweird> <http://true does not look like a valid URI, trying to serialize this will break.
Traceback (most recent call last):
  File "rdflib\plugins\parsers\nquads.py", line 86, in parse
    self.parseline(bnode_context)
  File "rdflib\plugins\parsers\nquads.py", line 100, in parseline
    predicate = self.predicate()
  File "rdflib\plugins\parsers\ntriples.py", line 278, in predicate
    raise ParseError("Predicate must be uriref")
rdflib.exceptions.ParserError: Predicate must be uriref

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "weird.py", line 6, in <module>
    ds.parse(data=data, format="nquads")
  File "rdflib\graph.py", line 2473, in parse
    c = ConjunctiveGraph.parse(
  File "rdflib\graph.py", line 2251, in parse
    context.parse(source, publicID=publicID, format=format, **args)
  File "rdflib\graph.py", line 1494, in parse
    parser.parse(source, self, **args)
  File "rdflib\plugins\parsers\nquads.py", line 88, in parse
    raise ParseError("Invalid line (%s):\n%r" % (msg, __line))
rdflib.exceptions.ParserError: Invalid line (Predicate must be uriref):
'<> <isweird> <http://true>.'

sdasda7777 commented 1 year ago

Actually, I think that contradicts section 2.2 of the standard, which states "IRIs may be written only as absolute IRIs."

I sent an email to the W3C mailing list asking for comment on this, so I won't close this yet, but I don't think this is a valid issue 😅

aucampia commented 1 year ago

You can also try the linkeddata chat on gitter or matrix: https://matrix.to/#/#linkeddata_chat:gitter.im

aucampia commented 1 year ago

Closure was accidental 😅

sdasda7777 commented 1 year ago

Okay, yeah, it seems the content between '<' and '>' still has to comply with RFC 3987.
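
For anyone landing here later, the two rules discussed in this thread can be compared with a small standard-library sketch. It transcribes the IRIREF production quoted in the first comment as a regular expression and then applies a rough "is it absolute?" check. The helper names (matches_iriref_production, looks_absolute) are made up for illustration and are not part of rdflib. The absolute check only looks for a leading scheme such as "http:", which is an assumption standing in for real RFC 3987 validation.

```python
import re

# IRIREF production [10] from the N-Quads grammar, transcribed as a regex.
# UCHAR is written as \uXXXX or \UXXXXXXXX.
IRIREF = re.compile(
    r'<(?:[^\x00-\x20<>"{}|^`\\]|\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8})*>'
)

# Assumption: "absolute" is approximated as "starts with a scheme and a colon".
# This is NOT a full RFC 3987 validator.
SCHEME = re.compile(r'[A-Za-z][A-Za-z0-9+.-]*:')


def matches_iriref_production(token: str) -> bool:
    """True if the token matches the IRIREF grammar production alone."""
    return IRIREF.fullmatch(token) is not None


def looks_absolute(token: str) -> bool:
    """True if the token matches IRIREF and its content starts with a scheme."""
    if not matches_iriref_production(token):
        return False
    return SCHEME.match(token[1:-1]) is not None


for token in ("<>", "<isweird>", "<http://true>"):
    print(token, matches_iriref_production(token), looks_absolute(token))
```

With this sketch, <> and <isweird> pass the grammar check but fail the scheme check, while <http://true> passes both, which matches the conclusion above that the grammar production alone is not the whole rule.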